Commit Graph

1353284 Commits

Author SHA1 Message Date
Jakub Kicinski
ab91c140be netlink: specs: remove implicit structs for SNMP counters
uAPI doesn't define structs for the SNMP counters, just enums to index
them as arrays. Switch to the same representation in the spec. C codegen
will soon need all the struct types to actually exist.

Note that the existing definition was broken, anyway, as the first
member should be the number of counters reported.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250506194101.696272-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:21:49 -07:00
Jakub Kicinski
6c2422396d netlink: specs: ovs: correct struct names
C codegen will soon support using struct types for binary attrs.
Correct the struct names in OvS specs.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250506194101.696272-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:21:49 -07:00
Jakub Kicinski
f22e764d77 netlink: specs: nl80211: drop structs which are not uAPI
C codegen will soon use structs for binary types. A handful of structs
in WiFi carry information elements from the wire, defined by the standard.
The structs are not part of uAPI, so we can't use them in C directly.
We could add them to the uAPI or add some annotation to tell the codegen
to output a local version to the user header. The former seems arbitrary
since we don't expose structs for most of the standard. The latter seems
like a lot of work for a rare occurrence. Drop the struct info for now.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/004030652d592b379e730be2f0344bebc4a03475.camel@sipsolutions.net
Link: https://patch.msgid.link/20250506194101.696272-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:21:49 -07:00
Jakub Kicinski
015b5b8ed1 Merge branch 'tools-ynl-gen-split-presence-metadata'
Jakub Kicinski says:

====================
tools: ynl-gen: split presence metadata

The presence metadata indicates whether given attribute was/should be
added to the Netlink message. We have 3 types of such metadata:
 - bit presence for simple values like integers,
 - len presence for variable size attrs, like binary and strings,
 - count for arrays.

Previously this information was spread around with first two types
living in a dedicated sub-struct called _present. The counts resided
directly in the main struct with an n_ prefix.

Reshuffle these an uniformly store them in dedicated sub-structs.
The immediate motivation is that current scheme causes name collisions
for TC.
====================

Link: https://patch.msgid.link/20250505165208.248049-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:21:28 -07:00
Jakub Kicinski
d307b9feb8 tools: ynl-gen: move the count into a presence struct too
While we reshuffle the presence members, move the counts as well.
Previously array count members would have been place directly in
the struct, so:

  struct family_op_req {
      struct {
            u32 a:1;
            u32 b:1;
      } _present;
      struct {
            u32 bin;
      } _len;

      u32 a;
      u64 b;
      const unsigned char *bin;
      u32 n_multi;                 << count
      u32 *multi;                  << objects
  };

Since len has been moved to its own presence struct move the count
as well:

  struct family_op_req {
      struct {
            u32 a:1;
            u32 b:1;
      } _present;
      struct {
            u32 bin;
      } _len;
      struct {
            u32 multi;             << count
      } _count;

      u32 a;
      u64 b;
      const unsigned char *bin;
      u32 *multi;                  << objects
  };

This improves the consistency and allows us to remove some hacks
in the codegen. Unlike for len there is no known name collision
with the existing scheme.

Link: https://patch.msgid.link/20250505165208.248049-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:21:26 -07:00
Jakub Kicinski
b8ae9f70aa tools: ynl-gen: split presence metadata
Each YNL struct contains the data and a sub-struct indicating which
fields are valid. Something like:

  struct family_op_req {
      struct {
            u32 a:1;
            u32 b:1;
	    u32 bin_len;
      } _present;

      u32 a;
      u64 b;
      const unsigned char *bin;
  };

Note that the bin object 'bin' has a length stored, and that length
has a _len suffix added to the field name. This breaks if there
is a explicit field called bin_len, which is the case for some
TC actions. Move the length fields out of the _present struct,
create a new struct called _len:

  struct family_op_req {
      struct {
            u32 a:1;
            u32 b:1;
      } _present;
      struct {
	    u32 bin;
      } _len;

      u32 a;
      u64 b;
      const unsigned char *bin;
  };

This should prevent name collisions and help with the packing
of the struct.

Unfortunately this is a breaking change, but hopefully the migration
isn't too painful.

Link: https://patch.msgid.link/20250505165208.248049-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:21:25 -07:00
Jakub Kicinski
a512be0ecb tools: ynl-gen: rename basic presence from 'bit' to 'present'
Internal change to the code gen. Rename how we indicate a type
has a single bit presence from using a 'bit' string to 'present'.
This is a noop in terms of generated code but will make next
breaking change easier.

Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250505165208.248049-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-07 18:21:25 -07:00
David S. Miller
3e52667a9c Merge branch 'lan78xx-phylink-prep'
Oleksij Rempel says:

====================
lan78xx: preparation for PHYLINK conversion

This patch series contains the first part of the LAN78xx driver
refactoring in preparation for converting the driver to use the PHYLINK
framework.

The goal of this initial part is to reduce the size and complexity of
the final PHYLINK conversion by introducing incremental cleanups and
logical separation of concerns, such as:

- Improving error handling in the PHY initialization path
- Refactoring PHY detection and MAC-side configuration
- Moving LED DT configuration to a dedicated helper
- Separating USB link power and flow control setup from the main probe logic
- Extracting PHY interrupt acknowledgment logic

Each patch is self-contained and moves non-PHYLINK-specific logic out of
the way, setting the stage for the actual conversion in a follow-up
patch series.

changes v8 (as split from full v7 00/12 series):
- Split the original series to make review easier
- This part includes only preparation patches; actual PHYLINK
  integration will follow
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:06 +01:00
Oleksij Rempel
ef6a29e867 net: usb: lan78xx: Extract flow control configuration to helper
Move flow control register configuration from
lan78xx_update_flowcontrol() into a new helper function
lan78xx_configure_flowcontrol(). This separates hardware-specific
programming from policy logic and simplifies the upcoming phylink
integration.

The values used in this initial version of
lan78xx_configure_flowcontrol() are taken over as-is from the original
implementation to avoid functional changes. While they may not be
optimal for all USB and link speed combinations, they are known to work
reliably. Optimization of pause time and thresholds based on runtime
conditions can be done in a separate follow-up patch.

The forward declaration of lan78xx_configure_flowcontrol() will also be
removed later during the phylink conversion.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:06 +01:00
Oleksij Rempel
d746e0740b net: usb: lan78xx: Refactor USB link power configuration into helper
Move the USB link power configuration logic from lan78xx_link_reset()
to a new helper function lan78xx_configure_usb(). This simplifies the
main link reset path and isolates USB-specific logic.

The new function handles U1/U2 enablement based on Ethernet link speed,
but only for SuperSpeed-capable devices (LAN7800 and LAN7801). LAN7850,
a High-Speed-only device, is explicitly excluded. A warning is logged
if SuperSpeed is reported unexpectedly for LAN7850.

Add a forward declaration for lan78xx_configure_usb() as preparation for
the upcoming phylink conversion, where it will also be used from the
mac_link_up() callback.

Open questions remain:

- Why is the 1000 Mbps configuration split into two steps (U2 disable,
  then U1 enable), unlike the single-step config used for 10/100 Mbps?

- U1/U2 behavior appears to depend on proper EEPROM configuration.
  There are known devices in the field without EEPROM. Should the driver
  enforce safe defaults in such cases?

Due to lack of USB subsystem expertise, no changes were made to this logic
beyond structural refactoring.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:05 +01:00
Oleksij Rempel
f485849a38 net: usb: lan78xx: Extract PHY interrupt acknowledgment to helper
Move the PHY interrupt acknowledgment logic from lan78xx_link_reset()
to a new helper function lan78xx_phy_int_ack(). This simplifies the
code and prepares for reusing the acknowledgment logic independently
from the full link reset process, such as when using phylink.

No functional change intended.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:05 +01:00
Oleksij Rempel
8ba1f33c55 net: usb: lan78xx: move LED DT configuration to helper
Extract the LED enable logic based on the "microchip,led-modes"
property into a new helper function lan78xx_configure_leds_from_dt().

This simplifies lan78xx_phy_init() and improves modularity.
No functional changes intended.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:05 +01:00
Oleksij Rempel
d39f339d26 net: usb: lan78xx: refactor PHY init to separate detection and MAC configuration
Split out PHY detection into lan78xx_get_phy() and MAC-side setup into
lan78xx_mac_prepare_for_phy(), making the main lan78xx_phy_init() cleaner
and easier to follow.

This improves separation of concerns and prepares the code for a future
transition to phylink. Fixed PHY registration and interface selection
are now handled in lan78xx_get_phy(), while MAC-side delay configuration
is done in lan78xx_mac_prepare_for_phy().

The fixed PHY fallback is preserved for setups like EVB-KSZ9897-1,
where LAN7801 connects directly to a KSZ switch without a standard PHY
or device tree support.

No functional changes intended.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:05 +01:00
Oleksij Rempel
3da0ae5270 net: usb: lan78xx: remove explicit check for missing PHY driver
RGMII timing correctness relies on the PHY providing internal delays.
This is typically ensured via PHY driver, strap pins, or PCB layout.

Explicitly checking for a PHY driver here is unnecessary and non-standard.
This logic applies to all MACs, not just LAN78xx, and should be left to
phylib, phylink, or platform configuration.

Drop the check and rely on standard subsystem behavior.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:05 +01:00
Oleksij Rempel
232aa459aa net: usb: lan78xx: Improve error handling in PHY initialization
Ensure that return values from `lan78xx_write_reg()`,
`lan78xx_read_reg()`, and `phy_find_first()` are properly checked and
propagated. Use `ERR_PTR(ret)` for error reporting in
`lan7801_phy_init()` and replace `-EIO` with `-ENODEV` where appropriate
to provide more accurate error codes.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-07 12:57:05 +01:00
Jakub Kicinski
9daaf19786 Merge tag 'wireless-next-2025-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:

====================
wireless features, notably

 * stack
   - free SKBTX_WIFI_STATUS flag
   - fixes for VLAN multicast in multi-link
   - improve codel parameters (revert some old twiddling)
 * ath12k
   - Enable AHB support for IPQ5332.
   - Add monitor interface support to QCN9274.
   - Add MLO support to WCN7850.
   - Add 802.11d scan offload support to WCN7850.
 * ath11k
   - Restore hibernation support
 * iwlwifi
   - EMLSR on two 5 GHz links
 * mwifiex
   - cleanups/refactoring

along with many other small features/cleanups

* tag 'wireless-next-2025-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (177 commits)
  Revert "wifi: iwlwifi: clean up config macro"
  wifi: iwlwifi: move phy_filters to fw_runtime
  wifi: iwlwifi: pcie: make sure to lock rxq->read
  wifi: iwlwifi: add definitions for iwl_mac_power_cmd version 2
  wifi: iwlwifi: clean up config macro
  wifi: iwlwifi: mld: simplify iwl_mld_rx_fill_status()
  wifi: iwlwifi: mld: rx: simplify channel handling
  wifi: iwlwifi: clean up band in RX metadata
  wifi: iwlwifi: mld: skip unknown FW channel load values
  wifi: iwlwifi: define API for external FSEQ images
  wifi: iwlwifi: mld: allow EMLSR on separated 5 GHz subbands
  wifi: iwlwifi: mld: use cfg80211_chandef_get_width()
  wifi: iwlwifi: mld: fix iwl_mld_emlsr_disallowed_with_link() return
  wifi: iwlwifi: mld: clarify variable type
  wifi: iwlwifi: pcie: add support for the reset handshake in MSI
  wifi: mac80211_hwsim: Prevent tsf from setting if beacon is disabled
  wifi: mac80211: restructure tx profile retrieval for MLO MBSSID
  wifi: nl80211: add link id of transmitted profile for MLO MBSSID
  wifi: ieee80211: Add helpers to fetch EMLSR delay and timeout values
  wifi: mac80211: update ML STA with EML capabilities
  ...

====================

Link: https://patch.msgid.link/20250506174656.119970-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 19:04:42 -07:00
Jakub Kicinski
a7371be8c8 Merge branch 'devlink-sanitize-variable-typed-attributes'
Jiri Pirko says:

====================
devlink: sanitize variable typed attributes

This is continuation based on first two patches of
https://lore.kernel.org/20250425214808.507732-1-saeed@kernel.org

Better to take it as a separate patchset, as the rest of the original
patchset is probably settled.

This patchset is taking care of incorrect usage of internal NLA_* values
in uapi, introduces new enum (in patch #2) that shadows NLA_* values and
makes that part of UAPI.

The last two patches removes unnecessary translations with maintaining
clear devlink param driver api.
====================

Link: https://patch.msgid.link/20250505114513.53370-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:21:11 -07:00
Jiri Pirko
88debb521f devlink: use DEVLINK_VAR_ATTR_TYPE_* instead of NLA_* in fmsg
Use newly introduced DEVLINK_VAR_ATTR_TYPE_* enum values instead of
internal NLA_* in fmsg health reporter code.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:21:11 -07:00
Jiri Pirko
f9e78932ea devlink: avoid param type value translations
Assign DEVLINK_PARAM_TYPE_* enum values to DEVLINK_VAR_ATTR_TYPE_* to
ensure the same values are used internally and in UAPI. Benefit from
that by removing the value translations.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-4-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:21:11 -07:00
Jiri Pirko
429ac62114 devlink: define enum for attr types of dynamic attributes
Devlink param and health reporter fmsg use attributes with dynamic type
which is determined according to a different type. Currently used values
are NLA_*. The problem is, they are not part of UAPI. They may change
which would cause a break.

To make this future safe, introduce a enum that shadows NLA_* values in
it and is part of UAPI.

Also, this allows to possibly carry types that are unrelated to NLA_*
values.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-3-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:21:11 -07:00
Jiri Pirko
37006af675 tools: ynl-gen: allow noncontiguous enums
in case the enum has holes, instead of hard stop, generate a validation
callback to check valid enum values.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250505114513.53370-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 18:21:08 -07:00
David Wei
df6a69bc8f io_uring/zcrx: selftests: fix setting ntuple rule into rss
Fix ethtool syntax for setting ntuple rule into rss. It should be
`context' instead of `action'.

Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250503043007.857215-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-06 17:51:11 -07:00
Paolo Abeni
075001c9eb Merge branch 'net-phy-realtek-add-support-for-phy-leds'
Michael Klein says:

====================
net: phy: realtek: Add support for PHY LEDs

Changes in V7:
- Remove some unused macros (patch 1)
- Add more register defines for RTL8211F (patch 3)
- Revise macro definition order once more (patch 4)

Changes in V6:
- fix macro definition order (patch 1)
- introduce two more register defines (patch 2)

Changes in V5:
- Split cleanup patch and improve code formatting

Changes in V4:
- Change (!ret) to (ret == 0)
- Replace set_bit() by __set_bit()

Changes in V3:
- move definition of rtl8211e_read_ext_page() to patch 2
- Wrap overlong lines

Changes in V2:
- Designate to net-next
- Add ExtPage access cleanup patch as suggested by Andrew Lunn
====================

Link: https://patch.msgid.link/20250504172916.243185-1-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:31:29 +02:00
Michael Klein
708686132b net: phy: realtek: Add support for PHY LEDs on RTL8211E
Like the RTL8211F, the RTL8211E PHY supports up to three LEDs.
Add netdev trigger support for them, too.

Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-7-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:31:27 +02:00
Michael Klein
be1cc96ddf net: phy: realtek: use __set_bit() in rtl8211f_led_hw_control_get()
rtl8211f_led_hw_control_get() does not need atomic bit operations,
replace set_bit() by __set_bit().

Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-6-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:31:27 +02:00
Michael Klein
8c4d017265 net: phy: realtek: Group RTL82* macro definitions
Group macro definitions by PHY in lexicographic order. Within each PHY
block, definitions are order by page number and then register number.

Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-5-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:31:27 +02:00
Michael Klein
12d40df259 net: phy: realtek: add RTL8211F register defines
Add some more defines for RTL8211F page and register numbers.

Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-4-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:31:26 +02:00
Michael Klein
7c6fa3ffd2 net: phy: realtek: Clean up RTL821x ExtPage access
Factor out RTL8211E extension page access code to
rtl821x_modify_ext_page() and clean up rtl8211e_config_init()

Signed-off-by: Michael Klein <michael@fossekall.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250504172916.243185-3-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:31:26 +02:00
Michael Klein
f3b265358b net: phy: realtek: remove unsed RTL821x_PHYSR* macros
These macros have there since the first revision but were never used, so
let's just remove them.

Signed-off-by: Michael Klein <michael@fossekall.de>
Link: https://patch.msgid.link/20250504172916.243185-2-michael@fossekall.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:31:26 +02:00
Paolo Abeni
5b5f1efb72 Merge tag 'nf-next-25-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Apparently, nf_conntrack_bridge changes the way in which fragments
   are handled, dealing to packet drop. From Huajian Yang.

2) Add a selftest to stress the conntrack subsystem, from Florian Westphal.

3) nft_quota depletion is off-by-one byte, Zhongqiu Duan.

4) Rewrites the procfs to read the conntrack table to speed it up,
   from Florian Westphal.

5) Two patches to prevent overflow in nft_pipapo lookup table and to
   clamp the maximum bucket size.

6) Update nft_fib selftest to check for loopback packet bypass.
   From Florian Westphal.

netfilter pull request 25-05-06

* tag 'nf-next-25-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  selftests: netfilter: nft_fib.sh: check lo packets bypass fib lookup
  netfilter: nft_set_pipapo: clamp maximum map bucket size to INT_MAX
  netfilter: nft_set_pipapo: prevent overflow in lookup table allocation
  netfilter: nf_conntrack: speed up reads from nf_conntrack proc file
  netfilter: nft_quota: match correctly when the quota just depleted
  selftests: netfilter: add conntrack stress test
  netfilter: bridge: Move specific fragmented packet to slow_path instead of dropping it
====================

Link: https://patch.msgid.link/20250505234151.228057-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 13:19:01 +02:00
Mohsin Bashir
fbaeb7b0f0 eth: fbnic: fix tx_dropped counting
Fix the tracking of rtnl_link_stats.tx_dropped. The counter
`tmi.drop.frames` is being double counted whereas, the counter
`tti.cm_drop.frames` is being skipped.

Fixes: f2957147ae ("eth: fbnic: add support for TTI HW stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250503020145.1868252-1-mohsin.bashr@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 11:07:36 +02:00
Jakub Kicinski
8f0ae19346 selftests: net: exit cleanly on SIGTERM / timeout
ksft runner sends 2 SIGTERMs in a row if a test runs out of time.
Handle this in a similar way we handle SIGINT - cleanup and stop
running further tests.

Because we get 2 signals we need a bit of logic to ignore
the subsequent one, they come immediately one after the other
(due to commit 9616cb34b0 ("kselftest/runner.sh: Propagate SIGTERM
to runner child")).

This change makes sure we run cleanup (scheduled defer()s)
and also print a stack trace on SIGTERM, which doesn't happen
by default. Tests occasionally hang in NIPA and it's impossible
to tell what they are waiting from or doing.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250503011856.46308-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 11:04:58 +02:00
Paolo Abeni
90131a9b06 Merge branch 'net-ibmveth-make-ibmveth-use-new-reset-function-and-new-kunit-testsg'
Dave Marquardt says:

====================
net: ibmveth: Make ibmveth use new reset function and new KUnit testsg

- Fixed struct ibmveth_adapter indentation
- Made ibmveth driver use WARN_ON with recovery rather than BUG_ON. Some
  recovery code schedules a reset through new function ibmveth_reset. Also
  removed a conflicting and unneeded forward declaration.
- Added KUnit tests for some areas changed by the WARN_ON changes.
====================

Link: https://patch.msgid.link/20250501194944.283729-1-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 10:24:41 +02:00
Dave Marquardt
8a97de243d net: ibmveth: added KUnit tests for some buffer pool functions
Added KUnit tests for ibmveth_remove_buffer_from_pool and
ibmveth_rxq_get_buffer under new IBMVETH_KUNIT_TEST config option.

Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250501194944.283729-4-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 10:24:38 +02:00
Dave Marquardt
2c91e2319e net: ibmveth: Reset the adapter when unexpected states are detected
Reset the adapter through new function ibmveth_reset, called in
WARN_ON situations. Removed conflicting and unneeded forward
declaration.

Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com>
Link: https://patch.msgid.link/20250501194944.283729-3-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 10:24:38 +02:00
Dave Marquardt
b309785154 net: ibmveth: Indented struct ibmveth_adapter correctly
Made struct ibmveth_adapter follow indentation rules

Signed-off-by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250501194944.283729-2-davemarq@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-06 10:24:37 +02:00
Jon Kohler
8c2e6b26ff vhost/net: Defer TX queue re-enable until after sendmsg
In handle_tx_copy, TX batching processes packets below ~PAGE_SIZE and
batches up to 64 messages before calling sock->sendmsg.

Currently, when there are no more messages on the ring to dequeue,
handle_tx_copy re-enables kicks on the ring *before* firing off the
batch sendmsg. However, sock->sendmsg incurs a non-zero delay,
especially if it needs to wake up a thread (e.g., another vhost worker).

If the guest submits additional messages immediately after the last ring
check and disablement, it triggers an EPT_MISCONFIG vmexit to attempt to
kick the vhost worker. This may happen while the worker is still
processing the sendmsg, leading to wasteful exit(s).

This is particularly problematic for single-threaded guest submission
threads, as they must exit, wait for the exit to be processed
(potentially involving a TTWU), and then resume.

In scenarios like a constant stream of UDP messages, this results in a
sawtooth pattern where the submitter frequently vmexits, and the
vhost-net worker alternates between sleeping and waking.

A common solution is to configure vhost-net busy polling via userspace
(e.g., qemu poll-us). However, treating the sendmsg as the "busy"
period by keeping kicks disabled during the final sendmsg and
performing one additional ring check afterward provides a significant
performance improvement without any excess busy poll cycles.

If messages are found in the ring after the final sendmsg, requeue the
TX handler. This ensures fairness for the RX handler and allows
vhost_run_work_list to cond_resched() as needed.

Test Case
    TX VM: taskset -c 2 iperf3  -c rx-ip-here -t 60 -p 5200 -b 0 -u -i 5
    RX VM: taskset -c 2 iperf3 -s -p 5200 -D
    6.12.0, each worker backed by tun interface with IFF_NAPI setup.
    Note: TCP side is largely unchanged as that was copy bound

6.12.0 unpatched
    EPT_MISCONFIG/second: 5411
    Datagrams/second: ~382k
    Interval         Transfer     Bitrate         Lost/Total Datagrams
    0.00-30.00  sec  15.5 GBytes  4.43 Gbits/sec  0/11481630 (0%)  sender

6.12.0 patched
    EPT_MISCONFIG/second: 58 (~93x reduction)
    Datagrams/second: ~650k  (~1.7x increase)
    Interval         Transfer     Bitrate         Lost/Total Datagrams
    0.00-30.00  sec  26.4 GBytes  7.55 Gbits/sec  0/19554720 (0%)  sender

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20250501020428.1889162-1-jon@nutanix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:18:41 -07:00
Ruben Wauters
c2dbda0766 ipv4: ip_tunnel: Replace strcpy use with strscpy
Use of strcpy is decpreated, replaces the use of strcpy with strscpy as
recommended.

strscpy was chosen as it requires a NUL terminated non-padded string,
which is the case here.

I am aware there is an explicit bounds check above the second instance,
however using strscpy protects against buffer overflows in any future
code, and there is no good reason I can see to not use it.

I have also replaced the scrscpy above that had 3 params with the
version using 2 params. These are functionally equivalent, but it is
cleaner to have both using 2 params.

Signed-off-by: Ruben Wauters <rubenru09@aol.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250501202935.46318-1-rubenru09@aol.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 18:16:11 -07:00
Jakub Kicinski
f267eeeec8 Merge branch 'net-ethtool-introduce-ethnl-dump-helpers'
Maxime Chevallier says:

====================
net: ethtool: Introduce ethnl dump helpers

This is V8 for per-phy DUMP helpers, improving support for ->dumpit()
operations for PHY targetting commands.

This V8 fixes some issues spotted by Jakub (thanks !) on the multi-part
DUMP sequence. The netdev reftracking was reworked to make sure that
during a filtered DUMP, we only keep a ref on the netdev during
individual .dumpit() calls.

v1: https://lore.kernel.org/20250305141938.319282-1-maxime.chevallier@bootlin.com
v2: https://lore.kernel.org/20250308155440.267782-1-maxime.chevallier@bootlin.com
v3: https://lore.kernel.org/20250313182647.250007-1-maxime.chevallier@bootlin.com
v4: https://lore.kernel.org/20250324104012.367366-1-maxime.chevallier@bootlin.com
v5: https://lore.kernel.org/20250410123350.174105-1-maxime.chevallier@bootlin.com
v6: https://lore.kernel.org/20250415085155.132963-1-maxime.chevallier@bootlin.com
v7: https://lore.kernel.org/20250422161717.164440-1-maxime.chevallier@bootlin.com
====================

Link: https://patch.msgid.link/20250502085242.248645-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 17:17:43 -07:00
Maxime Chevallier
63fb100bf5 net: ethtool: netlink: Use netdev_hold for dumpit() operations
Move away from dev_hold and use netdev_hold with a local reftracker when
performing a DUMP on each netdev.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250502085242.248645-4-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 17:17:41 -07:00
Maxime Chevallier
9dd2ad5e92 net: ethtool: phy: Convert the PHY_GET command to generic phy dump
Now that we have an infrastructure in ethnl for perphy DUMPs, we can get
rid of the custom ->doit and ->dumpit to deal with PHY listing commands.

As most of the code was custom, this basically means re-writing how we
deal with PHY listing.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250502085242.248645-3-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 17:17:40 -07:00
Maxime Chevallier
172265b44c net: ethtool: Introduce per-PHY DUMP operations
ethnl commands that target a phy_device need a DUMP implementation that
will fill the reply for every PHY behind a netdev. We therefore need to
iterate over the dev->topo to list them.

When multiple PHYs are behind the same netdev, it's also useful to
perform DUMP with a filter on a given netdev, to get the capability of
every PHY.

Implement dedicated genl ->start(), ->dumpit() and ->done() operations
for PHY-targetting command, allowing filtered dumps and using a dump
context that keep track of the PHY iteration for multi-message dump.

PSE-PD and PLCA are converted to this new set of ops along the way.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250502085242.248645-2-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 17:17:40 -07:00
Haiyue Wang
953d9480f7 selftests: iou-zcrx: Clean up build warnings for error format
Clean up two build warnings:

[1]

iou-zcrx.c: In function ‘process_recvzc’:
iou-zcrx.c:263:37: warning: too many arguments for format [-Wformat-extra-args]
  263 |                         error(1, 0, "payload mismatch at ", i);
      |                                     ^~~~~~~~~~~~~~~~~~~~~~

[2] Use "%zd" for ssize_t type as better

iou-zcrx.c: In function ‘run_client’:
iou-zcrx.c:357:47: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘ssize_t’ {aka ‘long int’} [-Wformat=]
  357 |                         error(1, 0, "send(): %d", sent);
      |                                              ~^   ~~~~
      |                                               |   |
      |                                               int ssize_t {aka long int}
      |                                              %ld

Signed-off-by: Haiyue Wang <haiyuewa@163.com>
Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250502175136.1122-1-haiyuewa@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:55:15 -07:00
Jakub Kicinski
d8b1a33ae8 Merge branch 'selftests-mptcp-increase-code-coverage'
Matthieu Baerts says:

====================
selftests: mptcp: increase code coverage

Here are various patches slightly improving MPTCP code coverage:

- Patch 1: avoid a harmless 'grep: write error' warning.

- Patch 2: use getaddrinfo() with IPPROTO_MPTCP in more places.

- Patch 3-6: prepare and add support to get info for a specific subflow
  when giving the 5-tuple.

- Patch 7: validate the previous patch and cover "subflow_get_info_size"
  in the kernel code (mptcp_diag.c).
====================

Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-0-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:53:05 -07:00
Gang Yan
110f8f77fd selftests: mptcp: add chk_sublfow in diag.sh
This patch aims to add chk_dump_subflow in diag.sh. The subflow's
info can be obtained through "ss -tin", then use the 'mptcp_diag'
to verify the token in subflow_info.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/524
Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-7-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:52:00 -07:00
Gang Yan
c7ac7452df selftests: mptcp: add helpers to get subflow_info
This patch adds 'get_subflow_info' in 'mptcp_diag', which can check whether
a TCP connection is an MPTCP subflow based on the "INET_ULP_INFO_MPTCP"
with tcp_diag method.

The helper 'print_subflow_info' in 'mptcp_diag' can print the subflow_filed
of an MPTCP subflow for further checking the 'subflow_info' through
inet_diag method.

The example of the whole output should be:

  $ ./mptcp_diag -s "127.0.0.1:10000 127.0.0.1:38984"
  127.0.0.1:10000 -> 127.0.0.1:38984
  It's a mptcp subflow, the subflow info:
   flags:Mec token:0000(id:0)/4278e77e(id:0) seq:9288466187236176036 \
   sfseq:1 ssnoff:2317083055 maplen:215

Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-6-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:52:00 -07:00
Gang Yan
caa6811cca selftests: mptcp: refactor NLMSG handling with 'proto'
This patch introduces the '__u32 proto' variable to the 'send_query' and
'recv_nlmsg' functions for further extending function.

In the 'send_query' function, the inclusion of this variable makes the
structure clearer and more readable.

In the 'recv_nlmsg' function, the '__u32 proto' variable ensures that
the 'diag_info' field remains unmodified when processing IPPROTO_TCP data,
thereby preventing unintended transformation into 'mptcp_info' format.

While at it, increment iovlen directly when an item is added to simplify
this portion of the code and improve its readaility.

Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-5-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:52:00 -07:00
Gang Yan
3fea468dca selftests: mptcp: refactor send_query parameters for code clarity
This patch use 'inet_diag_req_v2' instead of 'token' as parameters of
send_query, and construct the req in 'get_mptcpinfo'.

This modification enhances the clarity of the code, and prepare for the
dump_subflow_info.

Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-4-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:52:00 -07:00
Gang Yan
cd732d5110 selftests: mptcp: add struct params in mptcp_diag
This patch adds a struct named 'params' to save 'target_token' and other
future parameters. This structure facilitates future function expansions.

Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-3-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:52:00 -07:00
Geliang Tang
dd367e81b7 selftests: mptcp: sockopt: use IPPROTO_MPTCP for getaddrinfo
getaddrinfo MPTCP is recently supported in glibc and IPPROTO_MPTCP for
getaddrinfo is used in mptcp_connect.c. But in mptcp_sockopt.c and
mptcp_inq.c, IPPROTO_TCP are still used for getaddrinfo, So this patch
updates them.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250502-net-next-mptcp-sft-inc-cover-v1-2-68eec95898fb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-05 16:51:59 -07:00