linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 13:45:53 -04:00

Author	SHA1	Message	Date
Christian Brauner	d7afdf8895	ns: add to_<type>_ns() to respective headers Every namespace type has a container_of(ns, <ns_type>, ns) static inline function that is currently not exposed in the header. So we have a bunch of places that open-code it via container_of(). Move it to the headers so we can use it directly. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-19 14:26:16 +02:00
Christian Brauner	195f742229	net: support ns lookup Support the generic ns lookup infrastructure to support file handles for namespaces. The network namespace has a separate list with different lifetime rules which we can just leave in tact. We have a similar concept for mount namespaces as well where it is on two differenet lists for different purposes. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-19 14:26:15 +02:00
Christian Brauner	7914f15c5e	Merge branch 'no-rebase-mnt_ns_tree_remove' Bring in the fix for removing a mount namespace from the mount namespace rbtree and list. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-19 14:26:14 +02:00
Christian Brauner	08027f6b79	net: use ns_common_init() Don't cargo-cult the same thing over and over. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-19 14:26:13 +02:00
Aditya Kumar Singh	32d340ae67	wifi: mac80211: fix Rx packet handling when pubsta information is not available In ieee80211_rx_handle_packet(), if the caller does not provide pubsta information, an attempt is made to find the station using the address 2 (source address) field in the header. Since pubsta is missing, link information such as link_valid and link_id is also unavailable. Now if such a situation comes, and if a matching ML station entry is found based on the source address, currently the packet is dropped due to missing link ID in the status field which is not correct. Hence, to fix this issue, if link_valid is not set and the station is an ML station, make an attempt to find a link station entry using the source address. If a valid link station is found, derive the link ID and proceed with packet processing. Otherwise, drop the packet as per the existing flow. Fixes: `ea9d807b56` ("wifi: mac80211: add link information in ieee80211_rx_status") Suggested-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com> Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com> Link: https://patch.msgid.link/20250917-fix_data_packet_rx_with_mlo_and_no_pubsta-v1-1-8cf971a958ac@oss.qualcomm.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:59:34 +02:00
Lachlan Hodges	cbcd507f01	wifi: cfg80211: remove ieee80211_s1g_channel_width With the introduction of proper S1G channel flags, this function is no longer used. Remove it. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250918051913.500781-4-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:56:07 +02:00
Lachlan Hodges	31e7681da7	wifi: mac80211: correctly initialise S1G chandef for STA When moving to the APs channel, ensure we correctly initialise the chandef and perform the required validation. Additionally, if the AP is beaconing on a 2MHz primary, calculate the 2MHz primary center frequency by extracting the sibling 1MHz primary and averaging the frequencies to find the 2MHz primary center frequency. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250918051913.500781-3-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:56:07 +02:00
Lachlan Hodges	d0688dc2b1	wifi: cfg80211: correctly implement and validate S1G chandef Currently, the S1G channelisation implementation differs from that of VHT, which is the PHY that S1G is based on. The major difference between the clock rate is 1/10th of VHT. However how their channelisation is represented within cfg80211 and mac80211 vastly differ. To rectify this, remove the use of IEEE80211_CHAN_1/2/4.. flags that were previously used to indicate the control channel width, however it should be implied that the control channels are 1MHz in the case of S1G. Additionally, introduce the invert - being IEEE80211_CHAN_NO_4/8/16MHz - that imply the control channel may not be used for a certain bandwidth. With these new flags, we can perform regulatory and chandef validation just as we would for VHT. To deal with the notion that S1G PHYs may contain a 2MHz primary channel, introduce a new variable, s1g_primary_2mhz, which indicates whether we are operating on a 2MHz primary channel. In this case, the chandef::chan points to the 1MHz primary channel pointed to by the primary channel location. Alongside this, introduce some new helper routines that can extract the sibling 1MHz channel. The sibling being the alternate 1MHz primary subchannel within the 2MHz primary channel that is not pointed to by chandef::chan. Furthermore, due to unique restrictions imposed on S1G PHYs, introduce a new flag, IEEE80211_CHAN_S1G_NO_PRIMARY, which states that the 1MHz channel cannot be used as a primary channel. This is assumed to be set by vendors as it is hardware and regdom specific, When we validate a 2MHz primary channel, we need to ensure both 1MHz subchannels do not contain this flag. If one or both of the 1MHz subchannels contain this flag then the 2MHz primary is not permitted for use as a primary channel. Properly integrate S1G channel validation such that it is implemented according with other PHY types such as VHT. Additionally, implement a new S1G-specific regulatory flag to allow cfg80211 to understand specific vendor requirements for S1G PHYs. Signed-off-by: Arien Judge <arien.judge@morsemicro.com> Signed-off-by: Andrew Pope <andrew.pope@morsemicro.com> Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250918051913.500781-2-lachlan.hodges@morsemicro.com [remove redundant NL80211_ATTR_S1G_PRIMARY_2MHZ check] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:55:56 +02:00
Sarika Sharma	ccdc96fa0e	wifi: mac80211: remove tx_handlers_drop debugfs stats Commit `906a5a8c71` ("wifi: mac80211: add tx_handlers_drop statistics to ethtool") added a tx_handlers_drop counter to ethtool stats. During review [1], Johannes noted that the existing debugfs counter is now redundant. Remove the debugfs stat to avoid duplication and streamline statistics reporting. Link: https://lore.kernel.org/linux-wireless/ce5f2bd899caa2de32f36ce554d9cada073979c0.camel@sipsolutions.net/ # [1] Signed-off-by: Sarika Sharma <quic_sarishar@quicinc.com> Link: https://patch.msgid.link/20250918040846.4032734-1-quic_sarishar@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:50:48 +02:00
pengdonglin	872e397d62	wifi: mac80211: Remove redundant rcu_read_lock/unlock() in spin_lock Since commit `a8bb74acd8` ("rcu: Consolidate RCU-sched update-side function definitions") there is no difference between rcu_read_lock(), rcu_read_lock_bh() and rcu_read_lock_sched() in terms of RCU read section and the relevant grace period. That means that spin_lock(), which implies rcu_read_lock_sched(), also implies rcu_read_lock(). There is no need no explicitly start a RCU read section if one has already been started implicitly by spin_lock(). Simplify the code and remove the inner rcu_read_lock() invocation. Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: pengdonglin <pengdonglin@xiaomi.com> Signed-off-by: pengdonglin <dolinux.peng@gmail.com> Link: https://patch.msgid.link/20250916044735.2316171-11-dolinux.peng@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:48:14 +02:00
Ilan Peer	1d04fad3a4	wifi: mac80211: Extend support for changing NAN configuration As 'struct cfg80211_nan_config' was updated, update the relevant logic to accommodate these changes. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.92b530ddaedf.I2b6d6f6074e25487303fde573ce764a64f87bdcd@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:23 +02:00
Ilan Peer	04f17cfea2	wifi: mac80211: Export an API to check if NAN is started So it can be used by drivers to check if NAN Device interface is started or not. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.c69652f77eb6.Ie4f3d197e0706e742e3d97614fadc11b22adfbc6@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:23 +02:00
Ilan Peer	c7b5355b37	wifi: mac80211: Get the correct interface for non-netdev skb status The function ieee80211_sdata_from_skb() always returned the P2P Device interface in case the skb was not associated with a netdev and didn't consider the possibility that an NAN Device interface is also enabled. To support configurations where both P2P Device and a NAN Device interface are active, extend the function to match the correct interface based on address 2 in the 802.11 MAC header. Since the 'p2p_sdata' field in struct ieee80211_local is no longer needed, remove it. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.5252d2579a49.Id4576531c6b2ad83c9498b708dc0ade6b0214fa8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:23 +02:00
Ilan Peer	8f79d2f13d	wifi: mac80211: Track NAN interface start/stop In case that NAN is started, mark the device as non idle, and set LED triggering similar to scan and ROC. Set the device to idle once NAN is stopped. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.2711d62fce22.I9b9f826490e50967a66788d713b0eba985879873@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:23 +02:00
Ilan Peer	488d2e0bba	wifi: mac80211: Accept management frames on NAN interface Accept Public Action frames and Authentication frames on NAN Device interface to support flows that require these frames: - SDFs: For user space Discovery Engine (DE) implementation. - NAFs: For user space NAN Data Path (NDP) establishment. - Authentication frames: For NAN Pairing and Verification. Accept only frames from devices that are part of the NAN cluster. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.46528d69e881.Ifccd87fb2a49a3af05238f74f52fa6da8de28811@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:23 +02:00
Ilan Peer	fc41f4a28a	wifi: mac80211: Support Tx of action frame for NAN Add support for sending management frame over a NAN Device interface: - Declare support for the supported management frames types. - Since action frame transmissions over a NAN Device interface do not necessarily require a channel configuration, e.g., they can be transmitted during DW, modify the Tx path to avoid accessing channel information for NAN Device interface. - In addition modify the points in the Tx path logic to account for cases that a band is not specified in the Tx information. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.23b160089228.I65a58af753bcbcfb5c4ad8ef372d546f889725ba@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:22 +02:00
Ilan Peer	1884e2594b	wifi: cfg80211: Store the NAN cluster ID When the driver indicates that the device has joined a cluster, store the cluster ID. This is needed for data path operations, e.g., filtering received frames etc. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.63e9fef2a3aa.I6c858185c9e71f84bd2c5174d7ee45902b4391c3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:22 +02:00
Ilan Peer	78e3bd0133	wifi: cfg80211: Support Tx/Rx of action frame for NAN Add support for sending and receiving action frames over a NAN Device interface: - For Synchronized NAN operation NAN Service Discovery Frames (SDFs) and NAN Action Frames (NAFs) transmissions over a NAN Device interface, a channel parameter is not mandatory as the frame can be transmitted based on the NAN Device schedule. - For Unsynchronized NAN Discovery (USD) operation the SDFs and NAFs could be transmitted using NL80211_CMD_FRAME where a specific channel and dwell time are configured. As Synchronized NAN Operation and USD can be done concurrently, both modes need to be supported. Thus, allow sending NAN action frames when user space handles the NAN Discovery Engine (DE) with and without providing a channel as a parameter. To support reception of NAN Action frames and Authentication frames (used for NAN paring and verification) allow to register for management frame reception of NAN Device interface when user space handles the NAN DE. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.71da2b062929.I0166d51dcf14393f628cd5da366c21114f518618@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:22 +02:00
Ilan Peer	b9c3d426c8	wifi: cfg80211: Advertise supported NAN capabilities Allow drivers to specify the supported NAN capabilities and support advertising the NAN capabilities to user space. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.2976966556f5.Ic6e43b10049573180c909dad806f279cfb31143e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:22 +02:00
Andrei Otcheretianski	1ccfd8db34	wifi: cfg80211: Add cluster joined notification APIs The drivers should notify upper layers and user space when a NAN device joins a cluster. This is needed, for example, to set the correct addr3 in SDF frames. Add API to report cluster join event. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.ad27b7b6e4d9.I70b213a2a49f18d1ba2ad325e67e8eff51cc7a1f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:21 +02:00
Andrei Otcheretianski	ba9b2ceaa2	wifi: nl80211: Add NAN Discovery Window (DW) notification This notification will be used by the device to inform user space about upcoming DW. When received, user space will be able to prepare multicast Service Discovery Frames (SDFs) to be transmitted during the next DW using %NL80211_CMD_FRAME command on the NAN management interface. The device/driver will take care to transmit the frames in the correct timing. This allows to implement a synchronized Discovery Engine (DE) in user space, if the device doesn't support DE offload. Note that this notification can be sent before the actual DW starts as long as the driver/device handles the actual timing of the SDF transmission. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.0e1d15031bab.I5b1721e61b63910452b3c5cdcdc1e94cb094d4c9@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:21 +02:00
Andrei Otcheretianski	01b4a3061b	wifi: nl80211: Add more configuration options for NAN commands Current NAN APIs have only basic configuration for master preference and operating bands. Add and parse additional parameters which provide more control over NAN synchronization. The newly added attributes allow to publish additional NAN attributes and vendor elements in NAN beacons, control scan and discovery beacons periodicity, enable/disable DW notifications etc. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> tested: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.a4779492bf8e.I375feb919bd72358173766b9fe10010c40796b33@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-19 11:26:21 +02:00
Oleg Nesterov	e8fe3f07a3	9p/trans_fd: p9_fd_request: kick rx thread if EPOLLIN p9_read_work() doesn't set Rworksched and doesn't do schedule_work(m->rq) if list_empty(&m->req_list). However, if the pipe is full, we need to read more data and this used to work prior to commit `aaec5a95d5` ("pipe_read: don't wake up the writer if the pipe is still full"). p9_read_work() does p9_fd_read() -> ... -> anon_pipe_read() which (before the commit above) triggered the unnecessary wakeup. This wakeup calls p9_pollwake() which kicks p9_poll_workfn() -> p9_poll_mux(), p9_poll_mux() will notice EPOLLIN and schedule_work(&m->rq). This no longer happens after the optimization above, change p9_fd_request() to use p9_poll_mux() instead of only checking for EPOLLOUT. Reported-by: syzbot+d1b5dace43896bc386c3@syzkaller.appspotmail.com Tested-by: syzbot+d1b5dace43896bc386c3@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68a2de8f.050a0220.e29e5.0097.GAE@google.com/ Link: https://lore.kernel.org/all/67dedd2f.050a0220.31a16b.003f.GAE@google.com/ Co-developed-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Message-ID: <20250819161013.GB11345@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2025-09-19 16:34:51 +09:00
Jakub Kicinski	f2cdc4c22b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.17-rc7). No conflicts. Adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en/fs.h `9536fbe10c` ("net/mlx5e: Add PSP steering in local NIC RX") `7601a0a462` ("net/mlx5e: Add a miss level for ipsec crypto offload") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 11:26:06 -07:00
Linus Torvalds	cbf658dd09	Merge tag 'net-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless. No known regressions at this point. Current release - fix to a fix: - eth: Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set" - wifi: iwlwifi: pcie: fix byte count table for 7000/8000 devices - net: clear sk->sk_ino in sk_set_socket(sk, NULL), fix CRIU Previous releases - regressions: - bonding: set random address only when slaves already exist - rxrpc: fix untrusted unsigned subtract - eth: - ice: fix Rx page leak on multi-buffer frames - mlx5: don't return mlx5_link_info table when speed is unknown Previous releases - always broken: - tls: make sure to abort the stream if headers are bogus - tcp: fix null-deref when using TCP-AO with TCP_REPAIR - dpll: fix skipping last entry in clock quality level reporting - eth: qed: don't collect too many protection override GRC elements, fix memory corruption" * tag 'net-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits) octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp() cnic: Fix use-after-free bugs in cnic_delete_task devlink rate: Remove unnecessary 'static' from a couple places MAINTAINERS: update sundance entry net: liquidio: fix overflow in octeon_init_instr_queue() net: clear sk->sk_ino in sk_set_socket(sk, NULL) Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set" selftests: tls: test skb copy under mem pressure and OOB tls: make sure to abort the stream if headers are bogus selftest: packetdrill: Add tcp_fastopen_server_reset-after-disconnect.pkt. tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect(). octeon_ep: fix VF MAC address lifecycle handling selftests: bonding: add vlan over bond testing bonding: don't set oif to bond dev when getting NS target destination net: rfkill: gpio: Fix crash due to dereferencering uninitialized pointer net/mlx5e: Add a miss level for ipsec crypto offload net/mlx5e: Harden uplink netdev access against device unbind MAINTAINERS: make the DPLL entry cover drivers doc/netlink: Fix typos in operation attributes igc: don't fail igc_probe() on LED setup error ...	2025-09-18 10:22:02 -07:00
Cosmin Ratiu	3191df0a48	devlink rate: Remove unnecessary 'static' from a couple places devlink_rate_node_get_by_name() and devlink_rate_nodes_destroy() have a couple of unnecessary static variables for iterating over devlink rates. This could lead to races/corruption/unhappiness if two concurrent operations execute the same function. Remove 'static' from both. It's amazing this was missed for 4+ years. While at it, I confirmed there are no more examples of this mistake in net/ with 1, 2 or 3 levels of indentation. Fixes: `a8ecb93ef0` ("devlink: Introduce rate nodes") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:47:18 -07:00
Breno Leitao	8b7c4b612d	net: ethtool: use the new helper in rss_set_prep_indir() Refactor rss_set_prep_indir() to utilize the new ethtool_get_rx_ring_count() helper for determining the number of RX rings, replacing the direct use of get_rxnfc with ETHTOOL_GRXRINGS. This ensures compatibility with both legacy and new ethtool_ops interfaces by transparently multiplexing between them. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-7-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:07:38 -07:00
Breno Leitao	dce08107f1	net: ethtool: update set_rxfh_indir to use ethtool_get_rx_ring_count helper Modify ethtool_set_rxfh() to use the new ethtool_get_rx_ring_count() helper function for retrieving the number of RX rings instead of directly calling get_rxnfc with ETHTOOL_GRXRINGS. This way, we can leverage the new helper if it is available in ethtool_ops. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-6-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:07:38 -07:00
Breno Leitao	d5544688d4	net: ethtool: update set_rxfh to use ethtool_get_rx_ring_count helper Modify ethtool_set_rxfh() to use the new ethtool_get_rx_ring_count() helper function for retrieving the number of RX rings instead of directly calling get_rxnfc with ETHTOOL_GRXRINGS. This way, we can leverage the new helper if it is available in ethtool_ops. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-5-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:07:38 -07:00
Breno Leitao	84eaf4359c	net: ethtool: add get_rx_ring_count callback to optimize RX ring queries Add a new optional get_rx_ring_count callback in ethtool_ops to allow drivers to provide the number of RX rings directly without going through the full get_rxnfc flow classification interface. Create ethtool_get_rx_ring_count() to use .get_rx_ring_count if available, falling back to get_rxnfc() otherwise. It needs to be non-static, given it will be called by other ethtool functions laters, as those calling get_rxfh(). Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-4-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:07:37 -07:00
Breno Leitao	87c76c2db0	net: ethtool: remove the duplicated handling from ethtool_get_rxrings ethtool_get_rxrings() was a copy of ethtool_get_rxnfc(). Clean the code that will never be executed for GRXRINGS specifically. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-3-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:07:37 -07:00
Breno Leitao	06fad5a4ae	net: ethtool: add support for ETHTOOL_GRXRINGS ioctl This patch adds handling for the ETHTOOL_GRXRINGS ioctl command in the ethtool ioctl dispatcher. It introduces a new helper function ethtool_get_rxrings() that calls the driver's get_rxnfc() callback with appropriate parameters to retrieve the number of RX rings supported by the device. By explicitly handling ETHTOOL_GRXRINGS, userspace queries through ethtool can now obtain RX ring information in a structured manner. In this patch, ethtool_get_rxrings() is a simply copy of ethtool_get_rxnfc(). Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-2-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:07:37 -07:00
Breno Leitao	3efaede2e1	net: ethtool: pass the num of RX rings directly to ethtool_copy_validate_indir Modify ethtool_copy_validate_indir() and callers to validate indirection table entries against the number of RX rings as an integer instead of accessing rx_rings->data. This will be useful in the future, given that struct ethtool_rxnfc might not exist for native GRXRINGS call. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250917-gxrings-v4-1-dae520e2e1cb@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:07:36 -07:00
Eric Dumazet	672beab066	psp: rename our psp_dev_destroy() psp_dev_destroy() was already used in drivers/crypto/ccp/psp-dev.c Use psp_dev_free() instead, to avoid a link error when CRYPTO_DEV_SP_CCP=y Fixes: `00c94ca2b9` ("psp: base PSP device support") Closes: https://lore.kernel.org/netdev/CANn89i+ZdBDEV6TE=Nw5gn9ycTzWw4mZOpPuCswgwEsrgOyNnw@mail.gmail.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20250918113546.177946-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-18 07:04:25 -07:00
Jakub Kicinski	0aeb54ac4c	tls: make sure to abort the stream if headers are bogus Normally we wait for the socket to buffer up the whole record before we service it. If the socket has a tiny buffer, however, we read out the data sooner, to prevent connection stalls. Make sure that we abort the connection when we find out late that the record is actually invalid. Retrying the parsing is fine in itself but since we copy some more data each time before we parse we can overflow the allocated skb space. Constructing a scenario in which we're under pressure without enough data in the socket to parse the length upfront is quite hard. syzbot figured out a way to do this by serving us the header in small OOB sends, and then filling in the recvbuf with a large normal send. Make sure that tls_rx_msg_size() aborts strp, if we reach an invalid record there's really no way to recover. Reported-by: Lee Jones <lee@kernel.org> Fixes: `84c61fe1a7` ("tls: rx: do not use the standard strparser") Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250917002814.1743558-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:43:54 +02:00
Paolo Abeni	64d2616972	Merge branch 'add-basic-psp-encryption-for-tcp-connections' Daniel Zahka says: ================== add basic PSP encryption for TCP connections This is v13 of the PSP RFC [1] posted by Jakub Kicinski one year ago. General developments since v1 include a fork of packetdrill [2] with support for PSP added, as well as some test cases, and an implementation of PSP key exchange and connection upgrade [3] integrated into the fbthrift RPC library. Both [2] and [3] have been tested on server platforms with PSP-capable CX7 NICs. Below is the cover letter from the original RFC: Add support for PSP encryption of TCP connections. PSP is a protocol out of Google: https://github.com/google/psp/blob/main/doc/PSP_Arch_Spec.pdf which shares some similarities with IPsec. I added some more info in the first patch so I'll keep it short here. The protocol can work in multiple modes including tunneling. But I'm mostly interested in using it as TLS replacement because of its superior offload characteristics. So this patch does three things: - it adds "core" PSP code PSP is offload-centric, and requires some additional care and feeding, so first chunk of the code exposes device info. This part can be reused by PSP implementations in xfrm, tunneling etc. - TCP integration TLS style Reuse some of the existing concepts from TLS offload, such as attaching crypto state to a socket, marking skbs as "decrypted", egress validation. PSP does not prescribe key exchange protocols. To use PSP as a more efficient TLS offload we intend to perform a TLS handshake ("inline" in the same TCP connection) and negotiate switching to PSP based on capabilities of both endpoints. This is also why I'm not including a software implementation. Nobody would use it in production, software TLS is faster, it has larger crypto records. - mlx5 implementation That's mostly other people's work, not 100% sure those folks consider it ready hence the RFC in the title. But it works :) Not posted, queued a branch [4] are follow up pieces: - standard stats - netdevsim implementation and tests [1] https://lore.kernel.org/netdev/20240510030435.120935-1-kuba@kernel.org/ [2] https://github.com/danieldzahka/packetdrill [3] https://github.com/danieldzahka/fbthrift/tree/dzahka/psp [4] https://github.com/kuba-moo/linux/tree/psp Comments we intend to defer to future series: - we prefer to keep the version field in the tx-assoc netlink request, because it makes parsing keys require less state early on, but we are willing to change in the next version of this series. - using a static branch to wrap psp_enqueue_set_decrypted() and other functions called from tcp. - using INDIRECT_CALL for tls/psp in sk_validate_xmit_skb(). We prefer to address this in a dedicated patch series, so that this series does not need to modify the way tls_validate_xmit_skb() is declared and stubbed out. v12: https://lore.kernel.org/netdev/20250916000559.1320151-1-kuba@kernel.org/ v11: https://lore.kernel.org/20250911014735.118695-1-daniel.zahka@gmail.com v10: https://lore.kernel.org/netdev/20250828162953.2707727-1-daniel.zahka@gmail.com/ v9: https://lore.kernel.org/netdev/20250827155340.2738246-1-daniel.zahka@gmail.com/ v8: https://lore.kernel.org/netdev/20250825200112.1750547-1-daniel.zahka@gmail.com/ v7: https://lore.kernel.org/netdev/20250820113120.992829-1-daniel.zahka@gmail.com/ v6: https://lore.kernel.org/netdev/20250812003009.2455540-1-daniel.zahka@gmail.com/ v5: https://lore.kernel.org/netdev/20250723203454.519540-1-daniel.zahka@gmail.com/ v4: https://lore.kernel.org/netdev/20250716144551.3646755-1-daniel.zahka@gmail.com/ v3: https://lore.kernel.org/netdev/20250702171326.3265825-1-daniel.zahka@gmail.com/ v2: https://lore.kernel.org/netdev/20250625135210.2975231-1-daniel.zahka@gmail.com/ v1: https://lore.kernel.org/netdev/20240510030435.120935-1-kuba@kernel.org/ ================== Links: https://patch.msgid.link/20250917000954.859376-1-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- * add-basic-psp-encryption-for-tcp-connections: net/mlx5e: Implement PSP key_rotate operation net/mlx5e: Add Rx data path offload psp: provide decapsulation and receive helper for drivers net/mlx5e: Configure PSP Rx flow steering rules net/mlx5e: Add PSP steering in local NIC RX net/mlx5e: Implement PSP Tx data path psp: provide encapsulation helper for drivers net/mlx5e: Implement PSP operations .assoc_add and .assoc_del net/mlx5e: Support PSP offload functionality psp: track generations of device key net: psp: update the TCP MSS to reflect PSP packet overhead net: psp: add socket security association code net: tcp: allow tcp_timewait_sock to validate skbs before handing to device net: move sk_validate_xmit_skb() to net/core/dev.c psp: add op for rotation of device key tcp: add datapath logic for PSP with inline key exchange net: modify core data structures for PSP datapath support psp: base PSP device support psp: add documentation	2025-09-18 12:38:34 +02:00
Raed Salem	0eddb8023c	psp: provide decapsulation and receive helper for drivers Create psp_dev_rcv(), which drivers can call to psp decapsulate and attach a psp_skb_ext to an skb. psp_dev_rcv() only supports what the PSP architecture specification refers to as "transport mode" packets, where the L3 header is either IPv6 or IPv4. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Co-developed-by: Daniel Zahka <daniel.zahka@gmail.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-18-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:07 +02:00
Raed Salem	fc72451574	psp: provide encapsulation helper for drivers Create a new function psp_encapsulate(), which takes a TCP packet and PSP encapsulates it according to the "Transport Mode Packet Format" section of the PSP Architecture Specification. psp_encapsulate() does not push a PSP trailer onto the skb. Both IPv6 and IPv4 are supported. Virtualization cookie is not included. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Co-developed-by: Daniel Zahka <daniel.zahka@gmail.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-14-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:07 +02:00
Jakub Kicinski	e78851058b	psp: track generations of device key There is a (somewhat theoretical in absence of multi-host support) possibility that another entity will rotate the key and we won't know. This may lead to accepting packets with matching SPI but which used different crypto keys than we expected. The PSP Architecture specification mentions that an implementation should track device key generation when device keys are managed by the NIC. Some PSP implementations may opt to include this key generation state in decryption metadata each time a device key is used to decrypt a packet. If that is the case, that key generation counter can also be used when policy checking a decrypted skb against a psp_assoc. This is an optional feature that is not explicitly part of the PSP spec, but can provide additional security in the case where an attacker may have the ability to force key rotations faster than rekeying can occur. Since we're tracking "key generations" more explicitly now, maintain different lists for associations from different generations. This way we can catch stale associations (the user space should listen to rotation notifications and change the keys). Drivers can "opt out" of generation tracking by setting the generation value to 0. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-11-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Jakub Kicinski	e97269257f	net: psp: update the TCP MSS to reflect PSP packet overhead PSP eats 40B of header space. Adjust MSS appropriately. We can either modify tcp_mtu_to_mss() / tcp_mss_to_mtu() or reuse icsk_ext_hdr_len. The former option is more TCP specific and has runtime overhead. The latter is a bit of a hack as PSP is not an ext_hdr. If one squints hard enough, UDP encap is just a more practical version of IPv6 exthdr, so go with the latter. Happy to change. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-10-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Jakub Kicinski	6b46ca260e	net: psp: add socket security association code Add the ability to install PSP Rx and Tx crypto keys on TCP connections. Netlink ops are provided for both operations. Rx side combines allocating a new Rx key and installing it on the socket. Theoretically these are separate actions, but in practice they will always be used one after the other. We can add distinct "alloc" and "install" ops later. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Daniel Zahka <daniel.zahka@gmail.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-9-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Daniel Zahka	0917bb139e	net: tcp: allow tcp_timewait_sock to validate skbs before handing to device Provide a callback to validate skb's originating from tcp timewait socks before passing to the device layer. Full socks have a sk_validate_xmit_skb member for checking that a device is capable of performing offloads required for transmitting an skb. With psp, tcp timewait socks will inherit the crypto state from their corresponding full socks. Any ACKs or RSTs that originate from a tcp timewait sock carrying psp state should be psp encapsulated. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-8-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Daniel Zahka	8c511c1df3	net: move sk_validate_xmit_skb() to net/core/dev.c Move definition of sk_validate_xmit_skb() from net/core/sock.c to net/core/dev.c. This change is in preparation of the next patch, where sk_validate_xmit_skb() will need to cast sk to a tcp_timewait_sock *, and access member fields. Including linux/tcp.h from linux/sock.h creates a circular dependency, and dev.c is the only current call site of this function. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-7-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Jakub Kicinski	117f02a49b	psp: add op for rotation of device key Rotating the device key is a key part of the PSP protocol design. Some external daemon needs to do it once a day, or so. Add a netlink op to perform this operation. Add a notification group for informing users that key has been rotated and they should rekey (next rotation will cut them off). Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-6-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Jakub Kicinski	659a2899a5	tcp: add datapath logic for PSP with inline key exchange Add validation points and state propagation to support PSP key exchange inline, on TCP connections. The expectation is that application will use some well established mechanism like TLS handshake to establish a secure channel over the connection and if both endpoints are PSP-capable - exchange and install PSP keys. Because the connection can existing in PSP-unsecured and PSP-secured state we need to make sure that there are no race conditions or retransmission leaks. On Tx - mark packets with the skb->decrypted bit when PSP key is at the enqueue time. Drivers should only encrypt packets with this bit set. This prevents retransmissions getting encrypted when original transmission was not. Similarly to TLS, we'll use sk->sk_validate_xmit_skb to make sure PSP skbs can't "escape" via a PSP-unaware device without being encrypted. On Rx - validation is done under socket lock. This moves the validation point later than xfrm, for example. Please see the documentation patch for more details on the flow of securing a connection, but for the purpose of this patch what's important is that we want to enforce the invariant that once connection is secured any skb in the receive queue has been encrypted with PSP. Add GRO and coalescing checks to prevent PSP authenticated data from being combined with cleartext data, or data with non-matching PSP state. On Rx, check skb's with psp_skb_coalesce_diff() at points before psp_sk_rx_policy_check(). After skb's are policy checked and on the socket receive queue, skb_cmp_decrypted() is sufficient for checking for coalescable PSP state. On Tx, tcp_write_collapse_fence() should be called when transitioning a socket into PSP Tx state to prevent data sent as cleartext from being coalesced with PSP encapsulated data. This change only adds the validation points, for ease of review. Subsequent change will add the ability to install keys, and flesh the enforcement logic out Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Daniel Zahka <daniel.zahka@gmail.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-5-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Jakub Kicinski	ed8a507b74	net: modify core data structures for PSP datapath support Add pointers to psp data structures to core networking structs, and an SKB extension to carry the PSP information from the drivers to the socket layer. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Daniel Zahka <daniel.zahka@gmail.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-4-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Jakub Kicinski	00c94ca2b9	psp: base PSP device support Add a netlink family for PSP and allow drivers to register support. The "PSP device" is its own object. This allows us to perform more flexible reference counting / lifetime control than if PSP information was part of net_device. In the future we should also be able to "delegate" PSP access to software devices, such as *vlan, veth or netkit more easily. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250917000954.859376-3-daniel.zahka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 12:32:06 +02:00
Eric Dumazet	6471658dc6	udp: use skb_attempt_defer_free() Move skb freeing from udp recvmsg() path to the cpu which allocated/received it, as TCP did in linux-5.17. This increases max thoughput by 20% to 30%, depending on number of BH producers. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250916160951.541279-11-edumazet@google.com Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 10:17:10 +02:00
Eric Dumazet	3cd04c8f4a	udp: make busylock per socket While having all spinlocks packed into an array was a space saver, this also caused NUMA imbalance and hash collisions. UDPv6 socket size becomes 1600 after this patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250916160951.541279-10-edumazet@google.com Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 10:17:10 +02:00
Eric Dumazet	9db27c8062	udp: add udp_drops_inc() helper Generic sk_drops_inc() reads sk->sk_drop_counters. We know the precise location for UDP sockets. Move sk_drop_counters out of sock_read_rxtx so that sock_write_rxtx starts at a cache line boundary. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250916160951.541279-9-edumazet@google.com Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-18 10:17:10 +02:00

... 5 6 7 8 9 ...

82231 Commits