linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-22 12:05:09 -04:00

Author	SHA1	Message	Date
Alexander Duyck	284a67d59f	fbnic: Pass fbnic_dev instead of netdev to __fbnic_set/clear_rx_mode To make the __fbnic_set_rx_mode and __fbnic_clear_rx_mode calls usable by more points in the code we can make it so that they expect a fbnic_dev pointer instead of a netdev pointer. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/175623749436.2246365.6068665520216196789.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:51:07 +02:00
Alexander Duyck	cf79bd4495	fbnic: Move promisc_sync out of netdev code and into RPC path In order for us to support the BMC possibly connecting, disconnecting, and then reconnecting we need to be able to support entities outside of just the NIC setting up promiscuous mode as the BMC can use a multicast promiscuous setup. To support that we should move the promisc_sync code out of the netdev and into the RPC section of the driver so that it is reachable from more paths. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/175623748769.2246365.2130394904175851458.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:51:07 +02:00
Paolo Abeni	84482586b2	Merge branch 'add-si3474-pse-controller-driver' Piotr Kubik says: ==================== Add Si3474 PSE controller driver From: Piotr Kubik <piotr.kubik@adtran.com> These patch series provide support for Skyworks Si3474 I2C Power Sourcing Equipment controller. Based on the TPS23881 driver code. Supported features of Si3474: - get port status, - get port power, - get port voltage, - enable/disable port power Signed-off-by: Piotr Kubik <piotr.kubik@adtran.com> ==================== Link: https://patch.msgid.link/6af537dc-8a52-4710-8a18-dcfbb911cf23@adtran.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:42:02 +02:00
Piotr Kubik	a2317231df	net: pse-pd: Add Si3474 PSE controller driver Add a driver for the Skyworks Si3474 I2C Power Sourcing Equipment controller. Driver supports basic features of Si3474 IC: - get port status, - get port power, - get port voltage, - enable/disable port power. Only 4p configurations are supported at this moment. Signed-off-by: Piotr Kubik <piotr.kubik@adtran.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/9b72c8cd-c8d3-4053-9c80-671b9481d166@adtran.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:41:59 +02:00
Piotr Kubik	7cb4d28e11	dt-bindings: net: pse-pd: Add bindings for Si3474 PSE controller Add the Si3474 I2C Power Sourcing Equipment controller device tree bindings documentation. Signed-off-by: Piotr Kubik <piotr.kubik@adtran.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/71a67c6f-6fce-49c7-96ec-554602dbd4f1@adtran.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 14:41:59 +02:00
Paolo Abeni	e250798586	Merge branch 'net-better-drop-accounting' Eric Dumazet says: ==================== net: better drop accounting Incrementing sk->sk_drops for every dropped packet can cause serious cache line contention under DOS. Add optional sk->sk_drop_counters pointer so that protocols can opt-in to use two dedicated cache lines to hold drop counters. Convert UDP and RAW to use this infrastructure. Tested on UDP (see patch 4/5 for details) Before: nstat -n ; sleep 1 ; nstat \| grep Udp Udp6InDatagrams 615091 0.0 Udp6InErrors 3904277 0.0 Udp6RcvbufErrors 3904277 0.0 After: nstat -n ; sleep 1 ; nstat \| grep Udp Udp6InDatagrams 816281 0.0 Udp6InErrors 7497093 0.0 Udp6RcvbufErrors 7497093 0.0 ==================== Link: https://patch.msgid.link/20250826125031.1578842-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:14:52 +02:00
Eric Dumazet	b81aa23234	inet: raw: add drop_counters to raw sockets When a packet flood hits one or more RAW sockets, many cpus have to update sk->sk_drops. This slows down other cpus, because currently sk_drops is in sock_write_rx group. Add a socket_drop_counters structure to raw sockets. Using dedicated cache lines to hold drop counters makes sure that consumers no longer suffer from false sharing if/when producers only change sk->sk_drops. This adds 128 bytes per RAW socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250826125031.1578842-6-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:14:50 +02:00
Eric Dumazet	51132b99f0	udp: add drop_counters to udp socket When a packet flood hits one or more UDP sockets, many cpus have to update sk->sk_drops. This slows down other cpus, because currently sk_drops is in sock_write_rx group. Add a socket_drop_counters structure to udp sockets. Using dedicated cache lines to hold drop counters makes sure that consumers no longer suffer from false sharing if/when producers only change sk->sk_drops. This adds 128 bytes per UDP socket. Tested with the following stress test, sending about 11 Mpps to a dual socket AMD EPYC 7B13 64-Core. super_netperf 20 -t UDP_STREAM -H DUT -l10 -- -n -P,1000 -m 120 Note: due to socket lookup, only one UDP socket is receiving packets on DUT. Then measure receiver (DUT) behavior. We can see both consumer and BH handlers can process more packets per second. Before: nstat -n ; sleep 1 ; nstat \| grep Udp Udp6InDatagrams 615091 0.0 Udp6InErrors 3904277 0.0 Udp6RcvbufErrors 3904277 0.0 After: nstat -n ; sleep 1 ; nstat \| grep Udp Udp6InDatagrams 816281 0.0 Udp6InErrors 7497093 0.0 Udp6RcvbufErrors 7497093 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250826125031.1578842-5-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:14:50 +02:00
Eric Dumazet	c51613fa27	net: add sk->sk_drop_counters Some sockets suffer from heavy false sharing on sk->sk_drops, and fields in the same cache line. Add sk->sk_drop_counters to: - move the drop counter(s) to dedicated cache lines. - Add basic NUMA awareness to these drop counter(s). Following patches will use this infrastructure for UDP and RAW sockets. sk_clone_lock() is not yet ready, it would need to properly set newsk->sk_drop_counters if we plan to use this for TCP sockets. v2: used Paolo suggestion from https://lore.kernel.org/netdev/8f09830a-d83d-43c9-b36b-88ba0a23e9b2@redhat.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250826125031.1578842-4-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:14:50 +02:00
Eric Dumazet	cb4d5a6eb6	net: add sk_drops_skbadd() helper Existing sk_drops_add() helper is renamed to sk_drops_skbadd(). Add sk_drops_add() and convert sk_drops_inc() to use it. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250826125031.1578842-3-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:14:50 +02:00
Eric Dumazet	f86f42ed2c	net: add sk_drops_read(), sk_drops_inc() and sk_drops_reset() helpers We want to split sk->sk_drops in the future to reduce potential contention on this field. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250826125031.1578842-2-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:14:50 +02:00
Jakub Kicinski	c2a756891b	uapi: wrap compiler_types.h in an ifdef instead of the implicit strip The uAPI stddef header includes compiler_types.h, a kernel-only header, to make sure that kernel definitions of annotations like __counted_by() take precedence. There is a hack in scripts/headers_install.sh which strips includes of compiler.h and compiler_types.h when installing uAPI headers. While explicit handling makes sense for compiler.h, which is included all over the uAPI, compiler_types.h is only included by stddef.h (within the uAPI, obviously it's included in kernel code a lot). Remove the stripping from scripts/headers_install.sh and wrap the include of compiler_types.h in #ifdef __KERNEL__ instead. This should be equivalent functionally, but is easier to understand to a casual reader of the code. It also makes it easier to work with kernel headers directly from under tools/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250825201828.2370083-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:06:48 +02:00
Jakub Kicinski	d4854be4ec	Merge branch 'eth-fbnic-extend-hw-stats-support' Jakub Kicinski says: ==================== eth: fbnic: Extend hw stats support Mohsin says: Extend hardware stats support for fbnic by adding the ability to reset hardware stats when the device experiences a reset due to a PCI error and include MAC stats in the hardware stats reset. Additionally, expand hardware stats coverage to include FEC, PHY, and Pause stats. v1: https://lore.kernel.org/20250822164731.1461754-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250825200206.2357713-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:28 -07:00
Mohsin Bashir	e9faf4db5f	eth: fbnic: Add pause stats support Add support to read pause stats for fbnic. Unlike FEC and PCS stats, pause stats won't wrap, so do not fetch them under the service task. Since they are exclusively accessed via the ethtool API, don't include them in fbnic_get_hw_stats(). ]# ethtool -I -a eth0 Pause parameters for eth0: Autonegotiate: on RX: off TX: off Statistics: tx_pause_frames: 0 rx_pause_frames: 0 Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	33c493791b	eth: fbnic: Read PHY stats via the ethtool API Provide support to read PHY stats (FEC and PCS) via the ethtool API. ]# ethtool -I --show-fec eth0 FEC parameters for eth0: Supported/Configured FEC encodings: RS Active FEC encoding: RS Statistics: corrected_blocks: 0 uncorrectable_blocks: 0 ]# ethtool -S eth0 --groups eth-phy Standard stats for eth0: eth-phy-SymbolErrorDuringCarrier: 0 Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	df4c5d9a29	eth: fbnic: Fetch PHY stats from device Add support to fetch PHY stats consisting of PCS and FEC stats from the device. When reading the stats counters, the lo part is read first, which latches the hi part to ensure consistent reading of the stats counter. FEC and PCS stats can wrap depending on the access frequency. To prevent wrapping, fetch these stats periodically under the service task. Also to maintain consistency fetch these stats along with other 32b stats under __fbnic_get_hw_stats32(). Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	bcf54e5d7c	eth: fbnic: Reset MAC stats Reset the MAC stats as part of the hardware stats reset to ensure consistency. Currently, hardware stats are reset during device bring-up and upon experiencing PCI errors; however, MAC stats are being skipped during these resets. When fbnic_reset_hw_stats() is called upon recovering from PCI error, MAC stats are accessed outside the rtnl_lock. The only other access to MAC stats is via the ethtool API, which is protected by rtnl_lock. This can result in concurrent access to MAC stats and a potential race. Protect the fbnic_reset_hw_stats() call in __fbnic_pm_attach() with rtnl_lock to avoid this. Note that fbnic_reset_hw_mac_stats() is called outside the hardware stats lock which protects access to the fbnic_hw_stats. This is intentional because MAC stats are fetched from the device outside this lock and are exclusively read via the ethtool API. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	b1161b1863	eth: fbnic: Reset hw stats upon PCI error Upon experiencing a PCI error, fbnic resets the device to recover from the failure. Reset the hardware stats as part of the device reset to ensure accurate stats reporting. Note that the reset is not really resetting the aggregate value to 0, which may result in a spike for a system collecting deltas in stats. Rather, the reset re-latches the current value as previous, in case HW got reset. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:19 -07:00
Mohsin Bashir	2ee5c8c0c2	eth: fbnic: Move hw_stats_lock out of fbnic_dev Move hw_stats_lock out of fbnic_dev to a more appropriate struct fbnic_hw_stats since the only use of this lock is to protect access to the hardware stats. While at it, enclose the lock and stats initialization in a single init call. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825200206.2357713-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:56:18 -07:00
Jakub Kicinski	ef5ca97293	Merge branch 'macsec-replace-custom-netlink-attribute-checks-with-policy-level-checks' Sabrina Dubroca says: ==================== macsec: replace custom netlink attribute checks with policy-level checks We can simplify attribute validation a lot by describing the accepted ranges more precisely in the policies, using NLA_POLICY_MAX etc. Some of the checks still need to be done later on, because the attribute length and acceptable range can vary based on values that can't be known when the policy is validated (cipher suite determines the key length and valid ICV length, presence of XPN changes the PN length, detection of duplicate SCIs or ANs, etc). As a bonus, we get a few extack messages from the policy validation. I'll add extack to the rest of the checks (mostly in the genl commands) in a future series. v1: https://lore.kernel.org/netdev/cover.1664379352.git.sd@queasysnail.net ==================== Link: https://patch.msgid.link/cover.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:56 -07:00
Sabrina Dubroca	db9dfc4d30	macsec: replace custom check on IFLA_MACSEC_ENCODING_SA with NLA_POLICY_MAX Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/085bc642136cf3d267ddbb114e6f0c4a9247c797.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:54 -07:00
Sabrina Dubroca	b46f5ddb40	macsec: replace custom checks for IFLA_MACSEC_* flags with NLA_POLICY_MAX Those are all off/on flags. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/95707fb36adc1904fa327bc8f4eb055895aa6eff.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:53 -07:00
Sabrina Dubroca	b81d1e9588	macsec: validate IFLA_MACSEC_VALIDATION with NLA_POLICY_MAX Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/629efe0b2150b30abc6472074018cbd521b46578.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:53 -07:00
Sabrina Dubroca	4d844cb1ea	macsec: use NLA_POLICY_VALIDATE_FN to validate IFLA_MACSEC_CIPHER_SUITE Unfortunately, since the value of MACSEC_DEFAULT_CIPHER_ID doesn't fit near the others, we can't use a simple range in the policy. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/015e43ade9548c7682c9739087eba0853b3a1331.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:53 -07:00
Sabrina Dubroca	17882d23a6	macsec: replace custom checks on IFLA_MACSEC_ICV_LEN with NLA_POLICY_RANGE The existing checks already force this range. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/398cf16191a634ab343ecd811c481d7bdd44a933.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:53 -07:00
Sabrina Dubroca	35a35279e8	macsec: add NLA_POLICY_MAX for MACSEC_OFFLOAD_ATTR_TYPE and IFLA_MACSEC_OFFLOAD This is equivalent to the existing checks allowing either MACSEC_OFFLOAD_OFF or calling macsec_check_offload. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/37e1f1716f1d1d46d3d06c52317564b393fe60e6.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:53 -07:00
Sabrina Dubroca	80810c89d3	macsec: remove validate_add_rxsc It's not doing much anymore. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/218147f2f11cab885abc86b779dcefcd3208a2f8.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:53 -07:00
Sabrina Dubroca	82f3116132	macsec: use NLA_UINT for MACSEC_SA_ATTR_PN MACSEC_SA_ATTR_PN is either a u32 or a u64, we can now use NLA_UINT for this instead of a custom binary type. We can then use a min check within the policy. We need to keep the length checks done in macsec_{add,upd}_{rx,tx}sa based on whether the device is set up for XPN (with 64b PNs instead of 32b). On the dump side, keep the existing custom code as userspace may expect a u64 when using XPN, and nla_put_uint may only output a u32 attribute if the value fits. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/c9d32bd479cd4464e09010fbce1becc75377c8a0.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:52 -07:00
Sabrina Dubroca	15a700a842	macsec: use NLA_POLICY_MAX_LEN for MACSEC_SA_ATTR_KEY Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/192227ca0047b643d6530ece0a3679998b010fac.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:52 -07:00
Sabrina Dubroca	d29ae0d775	macsec: replace custom checks on MACSEC_SA_ATTR_KEYID with NLA_POLICY_EXACT_LEN The existing checks already specify that MACSEC_SA_ATTR_KEYID must have length MACSEC_KEYID_LEN. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/c4c113328962aae4146183e7a27854e854c796fb.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:52 -07:00
Sabrina Dubroca	8cf22afc15	macsec: replace custom checks on MACSEC_SA_ATTR_SALT with NLA_POLICY_EXACT_LEN The existing checks already specify that MACSEC_SA_ATTR_SALT must have length MACSEC_SALT_LEN. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/9699c5fd72322118b164cc8777fadabcce3b997c.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:52 -07:00
Sabrina Dubroca	ae6a8f5abe	macsec: replace custom checks on MACSEC_*_ATTR_ACTIVE with NLA_POLICY_MAX Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/2b07434304c725c72a7d81a8460d0bbe8af384a2.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:52 -07:00
Sabrina Dubroca	d5e0a8cec1	macsec: replace custom checks on MACSEC_SA_ATTR_AN with NLA_POLICY_MAX Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/22a7820cfc2cbfe5e33f030f1a3276e529cc70dc.1756202772.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:34:52 -07:00
Jakub Kicinski	86b2676816	Merge branch 'net-prevent-rps-table-overwrite-of-active-flows' Krishna Kumar says: ==================== net: Prevent RPS table overwrite of active flows This series splits the original RPS patch [1] into two patches for net-next. It also addresses a kernel test robot warning by defining rps_flow_is_active() only when aRFS is enabled. I tested v3 with four builds and reboots: two for [PATCH 1/2] with aRFS enabled & disabled, and two for [PATCH 2/2]. There are no code changes in v4 and v5, only documentation. Patch v6 has one line change to keep 'hash' field under #ifdef, and was test built with aRFS=on and aRFS=off. The same two builds were done for v7, along with 15m load testing with aRFS=on to ensure the new changes are correct. The first patch prevents RPS table overwrite for active flows thereby improving aRFS stability. The second patch caches hash & flow_id in get_rps_cpu() to avoid recalculating it in set_rps_cpu(). [1] lore.kernel.org/netdev/20250708081516.53048-1-krikku@gmail.com/ [2] lore.kernel.org/netdev/20250729104109.1687418-1-krikku@gmail.com/ ==================== Link: https://patch.msgid.link/20250825031005.3674864-1-krikku@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:24:22 -07:00
Krishna Kumar	48aa30443e	net: Cache hash and flow_id to avoid recalculation get_rps_cpu() can cache flow_id and hash as both are required by set_rps_cpu() instead of recalculating them twice. Signed-off-by: Krishna Kumar <krikku@gmail.com> Link: https://patch.msgid.link/20250825031005.3674864-3-krikku@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:24:20 -07:00
Krishna Kumar	97bcc5b6f4	net: Prevent RPS table overwrite of active flows This patch fixes an issue where two different flows on the same RXq produce the same hash resulting in continuous flow overwrites. Flow #1: A packet for Flow #1 comes in, kernel calls the steering function. The driver gives back a filter id. The kernel saves this filter id in the selected slot. Later, the driver's service task checks if any filters have expired and then installs the rule for Flow #1. Flow #2: A packet for Flow #2 comes in. It goes through the same steps. But this time, the chosen slot is being used by Flow #1. The driver gives a new filter id and the kernel saves it in the same slot. When the driver's service task runs, it runs through all the flows, checks if Flow #1 should be expired, the kernel returns True as the slot has a different filter id, and then the driver installs the rule for Flow #2. Flow #1: Another packet for Flow #1 comes in. The same thing repeats. The slot is overwritten with a new filter id for Flow #1. This causes a repeated cycle of flow programming for missed packets, wasting CPU cycles while not improving performance. This problem happens at higher rates when the RPS table is small, but tests show it still happens even with 12,000 connections and an RPS size of 16K per queue (global table size = 144x16K = 64K). This patch prevents overwriting an rps_dev_flow entry if it is active. The intention is that it is better to do aRFS for the first flow instead of hurting all flows on the same hash. Without this, two (or more) flows on one RX queue with the same hash can keep overwriting each other. This causes the driver to reprogram the flow repeatedly. Changes: 1. Add a new 'hash' field to struct rps_dev_flow. 2. Add rps_flow_is_active(): a helper function to check if a flow is active or not, extracted from rps_may_expire_flow(). It is further simplified as per reviewer feedback. 3. 
In set_rps_cpu(): - Avoid overwriting by programming a new filter if: - The slot is not in use, or - The slot is in use but the flow is not active, or - The slot has an active flow with the same hash, but target CPU differs. - Save the hash in the rps_dev_flow entry. 4. rps_may_expire_flow(): Use earlier extracted rps_flow_is_active(). Testing & results: - Driver: ice (E810 NIC), Kernel: net-next - #CPUs = #RXq = 144 (1:1) - Number of flows: 12K - Eight RPS settings from 256 to 32768. Though RPS=256 is not ideal, it is still sufficient to cover 12K flows (256*144 rx-queues = 64K global table slots) - Global Table Size = 144 * RPS (effectively equal to 256 * RPS) - Each RPS test duration = 8 mins (org code) + 8 mins (new code). - Metrics captured on client Legend for following tables: Steer-C: #times ndo_rx_flow_steer() was Called by set_rps_cpu() Steer-L: #times ice_arfs_flow_steer() Looped over aRFS entries Add: #times driver actually programmed aRFS (ice_arfs_build_entry()) Del: #times driver deleted the flow (ice_arfs_del_flow_rules()) Units: K = 1,000 times, M = 1 million times \|-------\|---------\|------\| Org Code \|---------\|---------\| \| RPS \| Latency \| CPU \| Add \| Del \| Steer-C \| Steer-L \| \|-------\|---------\|------\|--------\|--------\|---------\|---------\| \| 256 \| 227.0 \| 93.2 \| 1.6M \| 1.6M \| 121.7M \| 267.6M \| \| 512 \| 225.9 \| 94.1 \| 11.5M \| 11.2M \| 65.7M \| 199.6M \| \| 1024 \| 223.5 \| 95.6 \| 16.5M \| 16.5M \| 27.1M \| 187.3M \| \| 2048 \| 222.2 \| 96.3 \| 10.5M \| 10.5M \| 12.5M \| 115.2M \| \| 4096 \| 223.9 \| 94.1 \| 5.5M \| 5.5M \| 7.2M \| 65.9M \| \| 8192 \| 224.7 \| 92.5 \| 2.7M \| 2.7M \| 3.0M \| 29.9M \| \| 16384 \| 223.5 \| 92.5 \| 1.3M \| 1.3M \| 1.4M \| 13.9M \| \| 32768 \| 219.6 \| 93.2 \| 838.1K \| 838.1K \| 965.1K \| 8.9M \| \|-------\|---------\|------\| New Code \|---------\|---------\| \| 256 \| 201.5 \| 99.1 \| 13.4K \| 5.0K \| 13.7K \| 75.2K \| \| 512 \| 202.5 \| 98.2 \| 11.2K \| 5.9K \| 11.2K \| 55.5K \| \| 
1024 \| 207.3 \| 93.9 \| 11.5K \| 9.7K \| 11.5K \| 59.6K \| \| 2048 \| 207.5 \| 96.7 \| 11.8K \| 11.1K \| 15.5K \| 79.3K \| \| 4096 \| 206.9 \| 96.6 \| 11.8K \| 11.7K \| 11.8K \| 63.2K \| \| 8192 \| 205.8 \| 96.7 \| 11.9K \| 11.8K \| 11.9K \| 63.9K \| \| 16384 \| 200.9 \| 98.2 \| 11.9K \| 11.9K \| 11.9K \| 64.2K \| \| 32768 \| 202.5 \| 98.0 \| 11.9K \| 11.9K \| 11.9K \| 64.2K \| \|-------\|---------\|------\|--------\|--------\|---------\|---------\| Some observations: 1. Overall Latency improved: (1790.19-1634.94)/1790.19*100 = 8.67% 2. Overall CPU increased: (777.32-751.49)/751.45*100 = 3.44% 3. Flow Management (add/delete) remained almost constant at ~11K compared to values in millions. Signed-off-by: Krishna Kumar <krikku@gmail.com> Link: https://patch.msgid.link/20250825031005.3674864-2-krikku@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:24:13 -07:00
Qianfeng Rong	f0c88a0d83	net: wwan: iosm: use int type to store negative error codes The 'ret' variable in ipc_pcie_resources_request() either stores '-EBUSY' directly or holds returns from pci_request_regions() and ipc_acquire_irq(). Storing negative error codes in u32 causes no runtime issues but is stylistically inconsistent and very ugly. Change 'ret' from u32 to int type - this has no runtime impact. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Link: https://patch.msgid.link/20250826135021.510767-1-rongqianfeng@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:17:15 -07:00
Heiner Kallweit	6aff369990	net: phy: fixed_phy: simplify fixed_mdio_read swphy_read_reg() doesn't change the passed struct fixed_phy_status, so we can pass &fp->status directly. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/c49195c7-a3a1-485c-baed-9b33740752de@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:09:08 -07:00
Qianfeng Rong	a6bac18229	amd-xgbe: Use int type to store negative error codes Use int instead of unsigned int for the 'ret' variable to store return values from functions that either return zero on success or negative error codes on failure. Storing negative error codes in an unsigned int causes no runtime issues, but it's ugly as pants. Change 'ret' from unsigned int to int type - this change has no runtime impact. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250826142159.525059-1-rongqianfeng@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 18:06:46 -07:00
Andre Przywara	330355191a	net: stmmac: sun8i: drop unneeded default syscon value For some odd reason we were very jealous about the value of the EMAC clock register from the syscon block, insisting on a reset value and only doing read-modify-write operations on that register, even though we pretty much know the register layout. This already led to a basically redundant entry for the H6, which only differs by that value. We seem to have the same situation with the new A523 SoC, which again is compatible to the A64, but has a different syscon reset value. Drop any assumptions about that value, and set or clear the bits that we want to program, from scratch (starting with a value of 0). For the remove() implementation, we just turn on the POWERDOWN bit, and deselect the internal PHY, which mimics the existing code. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Acked-by: Corentin LABBE <clabbe.montjoie@gmail.com> Tested-by: Corentin LABBE <clabbe.montjoie@gmail.com> Tested-by: Paul Kocialkowski <paulk@sys-base.io> Reviewed-by: Paul Kocialkowski <paulk@sys-base.io> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250825172055.19794-1-andre.przywara@arm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 17:40:14 -07:00
Fabio Estevam	40fb9751cc	dt-bindings: nfc: ti,trf7970a: Restrict the ti,rx-gain-reduction-db values Instead of stating the supported values for the ti,rx-gain-reduction-db property in free text format, add an enum entry that can help validating the devicetree files. Signed-off-by: Fabio Estevam <festevam@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250826141736.712827-1-festevam@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 17:17:58 -07:00
Alok Tiwari	705609dede	net: stmmac: rk: remove incorrect _DLY_DISABLE bit definition The RK3328 GMAC clock delay macros define enable/disable controls for TX and RX clock delay. While the TX definitions are correct, the RXCLK_DLY_DISABLE macro incorrectly clears bit 0. The macros RK3328_GMAC_TXCLK_DLY_DISABLE and RK3328_GMAC_RXCLK_DLY_DISABLE are not referenced anywhere in the driver code. Remove them to clean up unused definitions. No functional change. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250826102219.49656-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 17:14:30 -07:00
Aleksander Jan Bajkowski	f63f21e82e	net: phy: realtek: support for TRIGGER_NETDEV_LINK on RTL8211E and RTL8211F This patch adds support for the TRIGGER_NETDEV_LINK trigger. It activates the LED when a link is established, regardless of the speed. Tested on Orange Pi PC2 with RTL8211E PHY. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250825211059.143231-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-27 17:13:32 -07:00
Jakub Kicinski	2420411643	Merge branch 'ipv6-sr-simplify-and-optimize-hmac-calculations' Eric Biggers says: ==================== ipv6: sr: Simplify and optimize HMAC calculations This series simplifies and optimizes the HMAC calculations in IPv6 Segment Routing. ==================== Link: https://patch.msgid.link/20250824013644.71928-1-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-26 18:11:31 -07:00
Eric Biggers	fe60065689	ipv6: sr: Prepare HMAC key ahead of time Prepare the HMAC key when it is added to the kernel, instead of preparing it implicitly for every packet. This significantly improves the performance of seg6_hmac_compute(). A microbenchmark on x86_64 shows seg6_hmac_compute() (with HMAC-SHA256) dropping from ~1978 cycles to ~1419 cycles, a 28% improvement. The size of 'struct seg6_hmac_info' increases by 128 bytes, but that should be fine, since there should not be a massive number of keys. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20250824013644.71928-3-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-26 18:11:29 -07:00
Eric Biggers	095928e7d8	ipv6: sr: Use HMAC-SHA1 and HMAC-SHA256 library functions Use the HMAC-SHA1 and HMAC-SHA256 library functions instead of crypto_shash. This is simpler and faster. Pre-allocating per-CPU hash transformation objects and descriptors is no longer needed, and a microbenchmark on x86_64 shows seg6_hmac_compute() (with HMAC-SHA256) dropping from ~2494 cycles to ~1978 cycles, a 20% improvement. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20250824013644.71928-2-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-26 18:11:29 -07:00
Jakub Kicinski	f19434dd41	Merge branch 'selftests-drv-net-ncdevmem-fix-error-paths' Jakub Kicinski says: ==================== selftests: drv-net: ncdevmem: fix error paths Make ncdevmem clean up after itself. While at it make sure it sets HDS threshold to 0 automatically. v1: https://lore.kernel.org/20250822200052.1675613-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250825180447.2252977-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-26 17:35:31 -07:00
Jakub Kicinski	a9d533fbba	selftests: drv-net: ncdevmem: explicitly set HDS threshold to 0 Make sure we set HDS threshold to 0 if the device supports changing it. It's required for ZC. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250825180447.2252977-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-26 17:35:28 -07:00
Jakub Kicinski	6351fadbd5	selftests: drv-net: ncdevmem: restore original HDS setting before exiting Restore HDS settings if we modified them. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250825180447.2252977-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-26 17:35:28 -07:00
Jakub Kicinski	b9f4f95298	selftests: drv-net: ncdevmem: restore old channel config In case changing channel count with provider bound succeeds unexpectedly - make sure we return to original settings. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250825180447.2252977-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-26 17:35:27 -07:00

1382493 Commits