linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 06:41:39 -04:00

Author	SHA1	Message	Date
Xiang Mei	f81f4e79b1	bonding: remove unused bond_is_first_slave and bond_is_last_slave macros Since commit `2884bf72fb` ("net: bonding: fix use-after-free in bond_xmit_broadcast()"), bond_is_last_slave() was only used in bond_xmit_broadcast(). After the recent fix replaced that usage with a simple index comparison, bond_is_last_slave() has no remaining callers. bond_is_first_slave() likewise has no callers. Remove both unused macros. Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260404220412.444753-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 19:07:08 -07:00
Jakub Kicinski	bd5c24e400	docs: netdev: improve wording of reviewer guidance Reword the reviewer guidance based on behavior we see on the list. Steer folks: - towards sending tags - away from process issues. Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260406175334.3153451-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 19:03:00 -07:00
Jakub Kicinski	1795654f00	Merge tag 'nf-next-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter: updates for net-next 1) Fix ancient sparse warnings in nf conntrack nat modules, from Sun Jian. 2) Fix typo in enum description, from Jelle van der Waa. 3) remove redundant refetch of netns pointer in nf_conntrack_sip. 4) add a deprecation warning for dccp match. We can extend the deadline later if needed, but plan atm is to remove the feature. 5) remove nf_conntrack_h323 debug code that can read out-of-bounds with malformed messages. This code was commented out, but better remove this. 6+7) add more netlink policy validations in netfilter. This could theoretically cause issues when a client sends e.g. unsupported feature flags that were previously ignored, so we may have to relax some changes. For now, try to be stricter and reject upfront. 8+9) minor code cleanup in nft_set_pipapo (an nftables set backend). 10) Add nftables matching support for double-tagged vlan and pppoe frames, from Pablo Neira Ayuso. 11) Fix up indentation of debug messages in nf_conntrack_h323 conntrack helper, from David Laight. 12) Add a helper to iterate to next flow action and bail out if the maximum number of actions is reached, also from Pablo. 
* tag 'nf-next-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables_offload: add nft_flow_action_entry_next() and use it netfilter: nf_conntrack_h323: Correct indentation when H323_TRACE defined netfilter: nft_meta: add double-tagged vlan and pppoe support netfilter: nft_set_pipapo_avx2: remove redundant loop in lookup_slow netfilter: nft_set_pipapo: increment data in one step netfilter: nf_tables: add netlink policy based cap on registers netfilter: add more netlink-based policy range checks netfilter: nf_conntrack_h323: remove unreliable debug code in decode_octstr netfilter: add deprecation warning for dccp support netfilter: nf_conntrack_sip: remove net variable shadowing netfilter: nf_tables: Fix typo in enum description netfilter: use function typedefs for __rcu NAT helper hook pointers ==================== Link: https://patch.msgid.link/20260408060419.25258-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:58:08 -07:00
Jakub Kicinski	ea0f90d1ed	Merge tag 'ipsec-next-2026-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2026-04-08 1) Update outdated comment in xfrm_dst_check(). From kexinsun. 2) Drop support for HMAC-RIPEMD-160 from IPsec. From Eric Biggers. * tag 'ipsec-next-2026-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: Drop support for HMAC-RIPEMD-160 xfrm: update outdated comment ==================== Link: https://patch.msgid.link/20260408094258.148555-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:51:54 -07:00
Pablo Neira Ayuso	c6f8557758	netfilter: nf_tables_offload: add nft_flow_action_entry_next() and use it Add a new helper function to retrieve the next action entry in flow rule, check if the maximum number of actions is reached, bail out in such case. Replace existing opencoded iteration on the action array by this helper function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:31 +02:00
David Laight	f33fad8dbf	netfilter: nf_conntrack_h323: Correct indentation when H323_TRACE defined The trace lines are indented using PRINT("%*.s", xx, " "). Userspace will treat this as "%*.0s" and will output no characters when 'xx' is zero, the kernel treats it as "%*s" and will output a single ' ' - which is probably what is intended. Change all the formats to "%*s" removing the default precision. This gives a single space indent when level is zero. Signed-off-by: David Laight <david.laight.linux@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:31 +02:00
Pablo Neira Ayuso	3785091c6c	netfilter: nft_meta: add double-tagged vlan and pppoe support Currently: add rule netdev x y ip saddr 1.1.1.1 works with neither double-tagged vlan nor pppoe packets. This is because the network and transport header offsets are not pointing to the IP and transport protocol headers in the stack. This patch expands NFT_META_PROTOCOL and NFT_META_L4PROTO to parse double-tagged vlan and pppoe packets so matching network and transport header fields becomes possible with the existing userspace generated bytecode. Note that this parser only supports double-tagged vlan which is composed of vlan offload + vlan header in the skb payload area for simplicity. NFT_META_PROTOCOL is used by bridge and netdev family as an implicit dependency in the bytecode to match on network header fields. Similarly, there is also NFT_META_L4PROTO, which is also used as an implicit dependency when matching on the transport protocol header fields. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:31 +02:00
Florian Westphal	a3f1e6a19a	netfilter: nft_set_pipapo_avx2: remove redundant loop in lookup_slow nft_pipapo_avx2_lookup_slow will never be used in reality, because the common sizes are handled by avx2 optimized versions. However, nft_pipapo_avx2_lookup_slow loops over the data just like the avx2 functions. However, _slow doesn't need to do that. As-is, first loop sets all the right result bits and the next iterations boil down to 'x = x & x'. Remove the loop. Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:31 +02:00
Florian Westphal	04e1ca21a5	netfilter: nft_set_pipapo: increment data in one step Since commit `e807b13cb3` ("nft_set_pipapo: Generalise group size for buckets") there is no longer a need to increment the data pointer in two steps. Switch to a single invocation of NFT_PIPAPO_GROUPS_PADDED_SIZE() helper, like the avx2 implementation. [ Stefano: Improve commit message ] Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:31 +02:00
Florian Westphal	8e57338c36	netfilter: nf_tables: add netlink policy based cap on registers Should have no effect in practice; all of these use the nft_parse_register_load/store apis which is mandatory anyway due to the need to further validate the register load/store, e.g. that the size argument doesn't result in out-of-bounds load/store. OTOH this is a simple method to reject obviously wrong input at earlier stage. Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:31 +02:00
Florian Westphal	66b75e6bbe	netfilter: add more netlink-based policy range checks These spots either already check the attribute range manually before use or the consuming functions tolerate unexpected values. Nevertheless, add more range checks via netlink policy so we gain more users and avoid possible re-use in other places that might not have the required manual checks. This also improves error reporting: netlink core can generate extack errors. Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:30 +02:00
Florian Westphal	390a57dd61	netfilter: nf_conntrack_h323: remove unreliable debug code in decode_octstr The debug code (not enabled in any build) reads up to 6 octets of the input buffer, but does so without bounds checks. Zap this. Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:27 +02:00
Florian Westphal	606bd17ef0	netfilter: add deprecation warning for dccp support Add a deprecation warning for the xt_dccp match and the nft exthdr code. Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:27 +02:00
Florian Westphal	7970d6aaf7	netfilter: nf_conntrack_sip: remove net variable shadowing net is already set, derived from nf_conn. I don't see how the device could be living in a different netns than the conntrack entry. Remove the extra variable and re-use existing one. Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:27 +02:00
Jelle van der Waa	1f290c497c	netfilter: nf_tables: Fix typo in enum description Fix the spelling of "options". Signed-off-by: Jelle van der Waa <jelle@vdwaa.nl> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:26 +02:00
Sun Jian	6e6f2b9b33	netfilter: use function typedefs for __rcu NAT helper hook pointers After commit `07919126ec` ("netfilter: annotate NAT helper hook pointers with __rcu"), sparse can warn about type/address-space mismatches when RCU-dereferencing NAT helper hook function pointers. The hooks are __rcu-annotated and accessed via rcu_dereference(), but the combination of complex function pointer declarators and the WRITE_ONCE() machinery used by RCU_INIT_POINTER()/rcu_assign_pointer() can confuse sparse and trigger false positives. Introduce typedefs for the NAT helper function types, so __rcu applies to a simple "fn_t __rcu *" pointer form. Also replace local typeof(hook) variables with "fn_t *" to avoid propagating __rcu address space into temporaries. No functional change intended. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603022359.3dGE9fwI-lkp@intel.com/ Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 07:51:26 +02:00
Jakub Kicinski	b3e69fc319	Merge branch 'net-pull-gso-packet-headers-in-core-stack' Eric Dumazet says: ==================== net: pull gso packet headers in core stack Most ndo_start_xmit() methods expect headers of gso packets to be already in skb->head. net/core/tso.c users are particularly at risk, because tso_build_hdr() does a memcpy(hdr, skb->data, hdr_len); qdisc_pkt_len_segs_init() already does a dissection of gso packets. Use pskb_may_pull() instead of skb_header_pointer() to make sure drivers do not have to reimplement this. First patch is a small cleanup to ease second patch review. ==================== Link: https://patch.msgid.link/20260403221540.3297753-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 19:02:18 -07:00
Eric Dumazet	7fb4c19670	net: pull headers in qdisc_pkt_len_segs_init() Most ndo_start_xmit() methods expect headers of gso packets to be already in skb->head. net/core/tso.c users are particularly at risk, because tso_build_hdr() does a memcpy(hdr, skb->data, hdr_len); qdisc_pkt_len_segs_init() already does a dissection of gso packets. Use pskb_may_pull() instead of skb_header_pointer() to make sure drivers do not have to reimplement this. Some malicious packets could be fed, detect them so that we can drop them sooner with a new SKB_DROP_REASON_SKB_BAD_GSO drop_reason. Fixes: `e876f208af` ("net: Add a software TSO helper API") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260403221540.3297753-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 19:02:13 -07:00
Eric Dumazet	30e02ec3b4	net: qdisc_pkt_len_segs_init() cleanup Reduce indentation level by returning early if the transport header was not set. Add an unlikely() clause as this is not the common case. No functional change. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260403221540.3297753-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 19:02:13 -07:00
Jakub Kicinski	e65d8b6f30	selftests: drv-net: adjust to socat changes socat v1.8.1.0 now defaults to shut-null, it sends an extra 0-length UDP packet when sender disconnects. This breaks our tests which expect the exact packet sequence. Add shut-none which was the old default where necessary. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260404230103.2719103-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 18:54:03 -07:00
Fernando Fernandez Mancera	2ce8a41113	net: hsr: emit notification for PRP slave2 changed hw addr on port deletion On PRP protocol, when deleting the port the MAC address change notification was missing. In addition to that, make sure to only perform the MAC address change on slave2 deletion and PRP protocol as the operation isn't necessary for HSR nor slave1. Note that the eth_hw_addr_set() is correct on PRP context as the slaves are either in promiscuous mode or forward offload enabled. Reported-by: Luka Gejak <luka.gejak@linux.dev> Closes: https://lore.kernel.org/netdev/DHFCZEM93FTT.1RWFBIE32K7OT@linux.dev/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Felix Maurer <fmaurer@redhat.com> Link: https://patch.msgid.link/20260403123928.4249-2-fmancera@suse.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 17:06:16 +02:00
Paolo Abeni	97a8355b6a	Merge branch 'net-mlx5e-xdp-add-support-for-multi-packet-per-page' Tariq Toukan says: ==================== net/mlx5e: XDP, Add support for multi-packet per page This series removes the limitation of having one packet per page in XDP mode. This has the following implications: - XDP in Striding RQ mode can now be used on 64K page systems. - XDP in Legacy RQ mode was using a single packet per page which on 64K page systems is quite inefficient. The improvement can be observed with an XDP_DROP test when running in Legacy RQ mode on a ARM Neoverse-N1 system with a 64K page size: +-----------------------------------------------+ \| MTU \| baseline \| this change \| improvement \| \|------+------------+-------------+-------------\| \| 1500 \| 15.55 Mpps \| 18.99 Mpps \| 22.0 % \| \| 9000 \| 15.53 Mpps \| 18.24 Mpps \| 17.5 % \| +-----------------------------------------------+ After lifting this limitation, the series switches to using fragments for the side page in non-linear mode. This small improvement is at most visible for XDP_DROP tests with small 64B packets and a large enough MTU for Striding RQ to be in non-linear mode: +----------------------------------------------------------------------+ \| System \| MTU \| baseline \| this change \| improvement \| \|----------------------+------+------------+-------------+-------------\| \| 4K page x86_64 [1] \| 9000 \| 26.30 Mpps \| 30.45 Mpps \| 15.80 % \| \| 64K page aarch64 [2] \| 9000 \| 15.27 Mpps \| 20.10 Mpps \| 31.62 % \| +----------------------------------------------------------------------+ This series does not cover the xsk (AF_XDP) paths for 64K page systems. [1] https://lore.kernel.org/all/20260324024235.929875-1-kuba@kernel.org/ ==================== Link: https://patch.msgid.link/20260403090927.139042-1-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 13:34:08 +02:00
Dragos Tatulea	25b8c9b6d7	net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode Currently in XDP multi-buffer mode for striding rq a whole page is allocated for the linear part of the XDP buffer. This is wasteful, especially on systems with larger page sizes. This change splits the page into fixed sized fragments. The page is replenished when the maximum number of allowed fragments is reached. When a fragment is not used, it will be simply recycled on next packet. This is great for XDP_DROP as the fragment can be recycled for the next packet. In the most extreme case (XDP_DROP everything), there will be 0 fragments used => only one linear page allocation for the lifetime of the XDP program. The previous page_pool size increase was too conservative (doubling the size) and now there are much fewer allocations (1/8 for a 4K page). So drop the page_pool size extension altogether when the linear side page is used. This small improvement is at most visible for XDP_DROP tests with small 64B packets and a large enough MTU for Striding RQ to be in non-linear mode: +----------------------------------------------------------------------+ \| System \| MTU \| baseline \| this change \| improvement \| \|----------------------+------+------------+-------------+-------------\| \| 4K page x86_64 [1] \| 9000 \| 26.30 Mpps \| 30.45 Mpps \| 15.80 % \| \| 64K page aarch64 [2] \| 9000 \| 15.27 Mpps \| 20.10 Mpps \| 31.62 % \| +----------------------------------------------------------------------+ [1] Intel Xeon Platinum 8580 [2] ARM Neoverse-N1 Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260403090927.139042-6-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 13:34:04 +02:00
Dragos Tatulea	ebd4ad29cc	net/mlx5e: XDP, Use a single linear page per rq Currently in striding rq there is one mlx5e_frag_page member per WQE for the linear page. This linear page is used only in XDP multi-buffer mode. This is wasteful because only one linear page is needed per rq: the page gets refreshed on every packet, regardless of WQE. Furthermore, it is not needed in other modes (non-XDP, XDP single-buffer). This change moves the linear page into its own structure (struct mlx5_mpw_linear_info) and allocates it only when necessary. A special structure is created because an upcoming patch will extend this structure to support fragmentation of the linear page. This patch has no functional changes. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260403090927.139042-5-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 13:34:04 +02:00
Dragos Tatulea	2dfaa02387	net/mlx5e: XDP, Remove stride size limitation Currently XDP mode always uses PAGE_SIZE strides. This limitation existed because page fragment counting was not implemented when XDP was added. Furthermore, due to this limitation there were other issues as well on system with larger pages (e.g. 64K): - XDP for Striding RQ was effectively disabled on such systems. - Legacy RQ allows the configuration but uses a fixed scheme of one XDP buffer per page which is inefficient. As fragment counting was added during the driver conversion to page_pool and the support for XDP multi-buffer, it is now possible to remove this stride size limitation. This patch does just that. Now it is possible to use XDP on systems with higher page sizes (e.g. 64K): - For Striding RQ, loading the program is no longer blocked. Although a 64K page can fit any packet, MTUs that result in stride > 8K will still make the RQ in non-linear mode. That's because the HW doesn't support a higher than 8K stride. - For Legacy RQ, the stride size was PAGE_SIZE which was very inefficient. Now the stride size will be calculated relative to MTU. Legacy RQ will always be in linear mode for larger system pages. This can be observed with an XDP_DROP test [1] when running in Legacy RQ mode on a ARM Neoverse-N1 system with a 64K page size: +-----------------------------------------------+ \| MTU \| baseline \| this change \| improvement \| \|------+------------+-------------+-------------\| \| 1500 \| 15.55 Mpps \| 18.99 Mpps \| 22.0 % \| \| 9000 \| 15.53 Mpps \| 18.24 Mpps \| 17.5 % \| +-----------------------------------------------+ There are performance benefits for Striding RQ mode as well: - Striding RQ non-linear mode now uses 256B strides, just like non-XDP mode. - Striding RQ linear mode can now fit a number of XDP buffers per page that is relative to the MTU size. That means that on 4K page systems and a small enough MTU, 2 XDP buffers can fit in one page. 
The above benefits for Striding RQ can be observed with an XDP_DROP test [1] when running on a 4K page x86_64 system (Intel Xeon Platinum 8580): +-----------------------------------------------+ \| MTU \| baseline \| this change \| improvement \| \|------+------------+-------------+-------------\| \| 1000 \| 28.36 Mpps \| 33.98 Mpps \| 19.82 % \| \| 9000 \| 20.76 Mpps \| 26.30 Mpps \| 26.70 % \| +-----------------------------------------------+ [1] Test description: - xdp-bench with XDP_DROP - RX: single queue - TX: sends 64B packets to saturate CPU on RX side Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260403090927.139042-4-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 13:34:04 +02:00
Dragos Tatulea	833e72645a	net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX When calculating the dma address of the linear part of an XDP frame, the formula assumes that there is a single XDP buffer per page. Extend the formula to allow multiple XDP buffers per page by calculating the data offset in the page. This is a preparation for the upcoming removal of a single XDP buffer per page limitation when the formula will no longer be correct. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260403090927.139042-3-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 13:34:04 +02:00
Dragos Tatulea	1047e14b44	net/mlx5e: XSK, Increase size for chunk_size param When 64K pages are used, chunk_size can take the 64K value which doesn't fit in u16. This results in overflows that are detected in mlx5e_mpwrq_log_wqe_sz(). Increase the type to u32 to fix this. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260403090927.139042-2-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 13:34:04 +02:00
Qingfang Deng	dfecb0c5af	selftests: net: add tests for PPP Add ping and iperf3 tests for ppp_async.c and pppoe.c. Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev> Link: https://patch.msgid.link/20260403034908.30017-1-qingfang.deng@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 12:08:46 +02:00
Eric Biggers	05d42dc8ab	xfrm: Drop support for HMAC-RIPEMD-160 Drop support for HMAC-RIPEMD-160 from IPsec to reduce the UAPI surface and simplify future maintenance. It's almost certainly unused. RIPEMD-160 received some attention in the early 2000s when SHA-* weren't quite as well established. But it never received much adoption outside of certain niches such as Bitcoin. It's actually unclear that Linux + IPsec + HMAC-RIPEMD-160 has ever been used, even historically. When support for it was added in 2003, it was done so in a "cleanup" commit without any justification [1]. It didn't actually work until someone happened to fix it 5 years later [2]. That person didn't use or test it either [3]. Finally, also note that "hmac(rmd160)" is by far the slowest of the algorithms in aalg_list[]. Of course, today IPsec is usually used with an AEAD, such as AES-GCM. But even for IPsec users still using a dedicated auth algorithm, they almost certainly aren't using, and shouldn't use, HMAC-RIPEMD-160. Thus, let's just drop support for it. Note: no kconfig update is needed, since CRYPTO_RMD160 wasn't actually being selected anyway. References: [1] linux-history commit d462985fc1941a47 ("[IPSEC]: Clean up key manager algorithm handling.") [2] linux commit `a13366c632` ("xfrm: xfrm_algo: correct usage of RIPEMD-160") [3] https://lore.kernel.org/all/1212340578-15574-1-git-send-email-rueegsegger@swiss-it.ch Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2026-04-07 10:47:58 +02:00
Jakub Kicinski	c149d90e26	Merge branch 'mptcp-support-msg_eor-and-small-cleanups' Matthieu Baerts says: ==================== mptcp: support MSG_EOR and small cleanups This series contains various unrelated patches: - Patches 1 & 2: support MSG_EOR instead of ignoring it. - Patch 3: avoid duplicated code in TCP and MPTCP by using a new helper. - Patch 4: adapt test to reproduce bug and increase code coverage. ==================== Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-0-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:14:31 -07:00
Matthieu Baerts (NGI0)	c4a5cb2f00	selftests: mptcp: join: recreate signal endp with same ID In this "delete re-add signal" MPTCP Join subtest, the endpoint linked to the initial subflow is removed, but re-added once with a different ID. It appears that there was an issue when reusing the same ID, recently fixed by commit `d191101dee` ("mptcp: pm: in-kernel: always set ID as avail when rm endp"). The test now reuses the same ID the first time, but continues to use another one (88) the second time. This should then cover more cases. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/615 Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-5-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:14:30 -07:00
Geliang Tang	eb477fdd68	tcp: add recv_should_stop helper Factor out a new helper tcp_recv_should_stop() from tcp_recvmsg_locked() and tcp_splice_read() to check whether to stop receiving. And use this helper in mptcp_recvmsg() and mptcp_splice_read() to reduce redundant code. Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-3-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:14:27 -07:00
Gang Yan	7fb2f5f964	mptcp: preserve MSG_EOR semantics in sendmsg path Extend MPTCP's sendmsg handling to recognize and honor the MSG_EOR flag, which marks the end of a record for application-level message boundaries. Data fragments tagged with MSG_EOR are explicitly marked in the mptcp_data_frag structure and skb context to prevent unintended coalescing with subsequent data chunks. This ensures the intent of applications using MSG_EOR is preserved across MPTCP subflows, maintaining consistent message segmentation behavior. Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-2-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:14:26 -07:00
Gang Yan	00d46be3c3	mptcp: reduce 'overhead' from u16 to u8 The 'overhead' in struct mptcp_data_frag can safely use u8, as it represents 'alignment + sizeof(mptcp_data_frag)'. With a maximum alignment of 7('ALIGN(1, sizeof(long)) - 1'), the overhead is at most 47, well below U8_MAX and validated with BUILD_BUG_ON(). This patch also adds a field named 'unused' for further extensions. Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-1-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:14:26 -07:00
Arnd Bergmann	ede3136e56	dpaa2: avoid linking objects into multiple modules Each object file contains information about which module it gets linked into, so linking the same file into multiple modules now causes a warning: scripts/Makefile.build:254: drivers/net/ethernet/freescale/dpaa2/Makefile: dpaa2-mac.o is added to multiple modules: fsl-dpaa2-eth fsl-dpaa2-switch scripts/Makefile.build:254: drivers/net/ethernet/freescale/dpaa2/Makefile: dpmac.o is added to multiple modules: fsl-dpaa2-eth fsl-dpaa2-switch Change the way that dpaa2 is built by moving the two common files into a separate module with exported symbols instead. Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260402184726.3746487-3-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:03:49 -07:00
Arnd Bergmann	df75bd552a	net: ethernet: ti-cpsw: fix linking built-in code to modules There are six variants of the cpsw driver, sharing various parts of the code: davinci-emac, cpsw, cpsw-switchdev, netcp, netcp_ethss and am65-cpsw-nuss. I noticed that this means some files can be linked into more than one loadable module, or even part of vmlinux but also linked into a loadable module, both of which mess up assumptions of the build system, and causes warnings: scripts/Makefile.build:279: cpsw_ale.o is added to multiple modules: ti-am65-cpsw-nuss ti_cpsw ti_cpsw_new scripts/Makefile.build:279: cpsw_priv.o is added to multiple modules: ti_cpsw ti_cpsw_new scripts/Makefile.build:279: cpsw_sl.o is added to multiple modules: ti-am65-cpsw-nuss ti_cpsw ti_cpsw_new scripts/Makefile.build:279: cpsw_ethtool.o is added to multiple modules: ti_cpsw ti_cpsw_new scripts/Makefile.build:279: davinci_cpdma.o is added to multiple modules: ti_cpsw ti_cpsw_new ti_davinci_emac Change this back to having separate modules for each portion that can be linked standalone, exporting symbols as needed: - ti-cpsw-common.ko now contains both cpsw-common.o and davinci_cpdma.o as they are always used together - ti-cpsw-priv.ko contains cpsw_priv.o, cpsw_sl.o and cpsw_ethtool.o, which are the core of the cpsw and cpsw-new drivers. - ti-cpsw-sl.ko contains the cpsw-sl.o object and is used on ti-am65-cpsw-nuss.ko in addition to the two other cpsw variants. - ti-cpsw-ale.o is the one standalone module that is used by all except davinci_emac. Each of these will be built-in if any of its users are built-in, otherwise it's a loadable module if there is at least one module using it. I did not bring back the separate Kconfig symbols for this, but just handle it using Makefile logic. Note: ideally this is something that Kbuild complains about, but usually we just notice when something using THIS_MODULE misbehaves in a way that a user notices. 
Fixes: `99f6297182` ("net: ethernet: ti: cpsw: drop TI_DAVINCI_CPDMA config option") Link: https://lore.kernel.org/lkml/20240417084400.3034104-1-arnd@kernel.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260402184726.3746487-2-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:03:49 -07:00
Arnd Bergmann	961f3c5356	net: ethernet: ti-cpsw: rename soft_reset() function While looking at the global symbols shared between the cpsw drivers, I noticed that soft_reset() is the only one that is missing a proper namespace prefix, and will pollute the kernel namespace, so rename it to be consistent with the other symbols. Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260402184726.3746487-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:03:46 -07:00
Jakub Kicinski	e6b7e1a10c	eth: remove the driver for acenic / tigon1&2 The entire git history for this driver looks like tree-wide and automated cleanups. There's even more coming now with AI, so let's try to delete it instead. Acked-by: Jes Sorensen <jes@trained-monkey.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260403220501.2263835-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:52:27 -07:00
Kevin Hao	c321b5676d	net: macb: Use netif_napi_add_tx() instead of netif_napi_add() for TX NAPI The TX NAPI should be registered via netif_napi_add_tx() to avoid unnecessarily polluting the napi_hash table. Signed-off-by: Kevin Hao <haokexin@gmail.com> Link: https://patch.msgid.link/20260403-macb-napi-tx-v1-1-08126a60c65e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:51:57 -07:00
Jakub Kicinski	646dbda284	Merge branch 'nfc-support-for-five-qualcomm-sdm845-phones' David Heidelberg says: ==================== NFC support for five Qualcomm SDM845 phones - OnePlus 6 / 6T - Pixel 3 / 3 XL - SHIFT 6MQ Verified with NFC card using neard: systemctl enable --now neard nfctool --device nfc0 -1 nfctool -d nfc0 -p gdbus introspect --system --dest org.neard --object-path /org/neard/nfc0/tag0/record0 or use gNFC: https://gitlab.gnome.org/dh/gnfc/ successfully detecting and reading a tag. ==================== Link: https://patch.msgid.link/20260403-oneplus-nfc-v3-0-fbdce57d63c1@ixit.cz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:50:51 -07:00
David Heidelberg	e72058a4be	dt-bindings: nfc: nxp,nci: Document PN557 compatible The PN557 uses the same hardware as the PN553 but ships with firmware compliant with NCI 2.0. Document PN557 as a compatible device. Signed-off-by: David Heidelberg <david@ixit.cz> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20260403-oneplus-nfc-v3-1-fbdce57d63c1@ixit.cz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:50:46 -07:00
Yue Haibing	2f60df9e61	ip6_tunnel: use generic for_each_ip_tunnel_rcu macro Remove the locally defined for_each_ip6_tunnel_rcu macro and use the generic for_each_ip_tunnel_rcu from linux/if_tunnel.h instead. This eliminates code duplication and ensures consistency across the kernel tunnel implementations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260403084619.4107978-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:41:03 -07:00
Jason Xing	8a4e3ab61d	net: advance skb_defer_disable_key check in napi_consume_skb When net.core.skb_defer_max is adjusted to zero, napi_consume_skb() shouldn't go into that deeper in skb_attempt_defer_free() because it adds an additional pair of local_bh_enable/disable() which is evidently not needed. Advancing the check of the static key saves more cycles and benefits non defer case. Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://patch.msgid.link/20260402034114.65766-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:32:04 -07:00
Jakub Kicinski	1ef05ed263	Merge branch 'net-dsa-mxl862xx-add-support-for-bridge-offloading' Daniel Golle says: ==================== net: dsa: mxl862xx: add support for bridge offloading As a next step to complete the mxl862xx DSA driver, add support for offloading forwarding between bridged ports to the switch hardware. This works pretty much without any big surprises, apart from two subtleties: * per-port control over flooding behavior has to be implemented by (ab)using a 0-rate QoS meter as stopper in lack of any better option. * STP state transition unconditionally enables learning on a port even if it was previously explicitly disabled (a firmware bug) Note that as the driver is still lacking all VLAN features (which are going to be added next), at this point some of the bridge_vlan_aware.sh tests are failing after applying this series. This is expected and cannot be avoided without implementing port_vlan_filtering + port_vlan_add/del. And adding both bridge and VLAN offloading at the same time would be too much for anyone to review, so VLAN support is going to be submitted in a follow-up series immediately after this series has been accepted. All other relevant selftests (including bridge_vlan_unaware.sh) are still passing. Inspired by the comments received from Paolo Abeni as reply to v5 the driver now no longer caches bridge port membership in the driver, but instead imports an existing helper from yt921x.c to dsa.h in order to allow the driver to easily iterate over bridge members. The mapping between DSA bridge num and firmware bridge ID is done using a simple fixed-size array in mxl862xx_priv. ==================== Link: https://patch.msgid.link/cover.1775049897.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:30:35 -07:00
Daniel Golle	340bdf9846	net: dsa: mxl862xx: implement bridge offloading Implement joining and leaving bridges as well as add, delete and dump operations on isolated FDBs, port MDB membership management, and setting a port's STP state. The switch supports a maximum of 63 bridges, however, up to 12 may be used as "single-port bridges" to isolate standalone ports. Allowing up to 48 bridges to be offloaded seems more than enough on that hardware, hence that is set as max_num_bridges. A total of 128 bridge ports are supported in the bridge portmap, and virtual bridge ports have to be used eg. for link-aggregation, hence potentially exceeding the number of hardware ports. The firmware-assigned bridge identifier (FID) for each offloaded bridge is stored in an array used to map DSA bridge num to firmware bridge ID, avoiding the need for a driver-private bridge tracking structure. Bridge member portmaps are rebuilt on join/leave using dsa_switch_for_each_bridge_member(). As there are now more users of the BRIDGEPORT_CONFIG_SET API and the state of each port is cached locally, introduce a helper function mxl862xx_set_bridge_port(struct dsa_switch *ds, int port) which applies the cached per-port state to hardware. For standalone user ports (dp->bridge == NULL), it additionally resets the port to single-port bridge state: CPU-only portmap, learning and flooding disabled. The CPU port path sets its state explicitly before calling this helper and is therefore not affected by the reset. Note that MASK_VLAN_BASED_MAC_LEARNING is intentionally absent from the firmware write mask. After mxl862xx_reset(), the firmware initialises all VLAN-based MAC learning fields to 0 (disabled), so SVL is the active mode by default without having to set it explicitly. Note that there is no convenient way to control flooding on per-port level, so the driver is using a 0-rate QoS meter setup as a stopper in lack of any better option. 
In order to be perfect the firmware-enforced minimum bucket size is bypassed by directly writing 0s to the relevant registers -- without that at least one 64-byte packet could still pass before the meter would change from 'yellow' into 'red' state. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/dd079180e2098e5f9626fcd149b9bad9a1b5a1b2.1775049897.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:30:33 -07:00
Daniel Golle	4250ff1640	dsa: tag_mxl862xx: set dsa_default_offload_fwd_mark() The MxL862xx offloads bridge forwarding in hardware, so set dsa_default_offload_fwd_mark() to avoid duplicate forwarding of packets of (eg. flooded) frames arriving at the CPU port. Link-local frames are directly trapped to the CPU port only, so don't set dsa_default_offload_fwd_mark() on those. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/e1161c90894ddc519c57dc0224b3a0f6bfa1d2d6.1775049897.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:30:33 -07:00
Daniel Golle	f259e08494	net: dsa: add bridge member iteration macro Drivers that offload bridges need to iterate over the ports that are members of a given bridge, for example to rebuild per-port forwarding bitmaps when membership changes. Currently drivers typically open-code this by combining dsa_switch_for_each_user_port() with a dsa_port_offloads_bridge_dev() check, or cache bridge membership within the driver. Add dsa_switch_for_each_bridge_member() macro to express this pattern directly, and use it for the existing dsa_bridge_ports() inline helper. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/e7136aaa26773f39e805a00fe4ecf13cd2b83fc0.1775049897.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:30:33 -07:00
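The pattern described in the commit above (a user-port iterator combined with a bridge membership check, used to rebuild a port bitmap) can be sketched in userspace C. This is only an illustrative analog, not the kernel code: `toy_port`, `toy_switch`, `toy_for_each_user_port()`, `toy_for_each_bridge_member()` and `toy_bridge_ports()` are hypothetical names, and the integer `bridge_id` stands in for the kernel's bridge device comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's port and switch structures. */
struct toy_port {
	unsigned int index;	/* port number, assumed to be below 32 */
	bool is_user;		/* false for CPU or DSA ports */
	int bridge_id;		/* 0 means standalone (no bridge) */
};

struct toy_switch {
	struct toy_port *ports;
	size_t num_ports;
};

/* Visit only user ports of the switch. */
#define toy_for_each_user_port(p, sw) \
	for ((p) = (sw)->ports; (p) < (sw)->ports + (sw)->num_ports; (p)++) \
		if ((p)->is_user)

/* Narrow the user-port iteration to members of one bridge. */
#define toy_for_each_bridge_member(p, sw, br) \
	toy_for_each_user_port(p, sw) \
		if ((p)->bridge_id == (br))

/* Build a bitmap with one bit set per user port that belongs to bridge br. */
static uint32_t toy_bridge_ports(const struct toy_switch *sw, int br)
{
	struct toy_port *p;
	uint32_t mask = 0;

	toy_for_each_bridge_member(p, sw, br)
		mask |= UINT32_C(1) << p->index;

	return mask;
}
```

Expressing the filter as a macro keeps the membership test in one place, so a caller that rebuilds forwarding bitmaps on join or leave does not need to cache membership itself.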
Daniel Golle	b0a79590d1	net: dsa: move dsa_bridge_ports() helper to dsa.h The yt921x driver contains a helper to create a bitmap of ports which are members of a bridge. Move the helper as static inline function into dsa.h, so other driver can make use of it as well. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/4f8bbfce3e4e3a02064fc4dc366263136c6e0383.1775049897.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:30:33 -07:00
Laurence Rowe	98f28d8d6e	vsock: avoid timeout for non-blocking accept() with empty backlog A common pattern in epoll network servers is to eagerly accept all pending connections from the non-blocking listening socket after epoll_wait indicates the socket is ready by calling accept in a loop until EAGAIN is returned indicating that the backlog is empty. Scheduling a timeout for a non-blocking accept with an empty backlog meant AF_VSOCK sockets used by epoll network servers incurred hundreds of microseconds of additional latency per accept loop compared to AF_INET or AF_UNIX sockets. Signed-off-by: Laurence Rowe <laurencerowe@gmail.com> Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20260402204918.130395-1-laurencerowe@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:29:01 -07:00
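The accept loop that the commit above refers to can be sketched as follows. This is a minimal userspace illustration, not code from the patch: `set_nonblocking()`, `open_unix_listener()` and `drain_backlog()` are hypothetical helper names, and an AF_UNIX socket is used as a stand-in because AF_VSOCK needs a hypervisor transport. The loop itself does not depend on the socket family.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create a non-blocking AF_UNIX listening socket bound to path (stand-in for AF_VSOCK). */
static int open_unix_listener(const char *path)
{
	struct sockaddr_un addr;
	int fd = socket(AF_UNIX, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
	unlink(path);	/* remove a stale socket file, if any */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 16) < 0 || set_nonblocking(fd) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Accept every pending connection until the backlog is empty.
 * Returns the number of connections accepted, or -1 on an unexpected error.
 * Accepted sockets are closed immediately to keep the sketch short.
 */
static int drain_backlog(int listen_fd)
{
	int accepted = 0;

	for (;;) {
		int conn = accept(listen_fd, NULL, NULL);

		if (conn >= 0) {
			close(conn);
			accepted++;
			continue;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return accepted;	/* backlog is empty */
		return -1;
	}
}
```

In an epoll server, `drain_backlog()` would run each time `epoll_wait` reports the listening socket as readable. The final `accept` call that returns EAGAIN is the one that hits an empty backlog on every pass, which is the case the commit describes.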
Daniel Zahka	c8eee00c0f	psp: add missing device stats to get-stats reply attributes Commit `f05d26198c` ("psp: add stats from psp spec to driver facing api") added device statistics (rx-packets, rx-bytes, rx-auth-fail, rx-error, rx-bad, tx-packets, tx-bytes, tx-error) to the stats attribute-set but did not add them to the get-stats operation reply attributes. The kernel reports these attributes in the reply, so list them in the spec to match. Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260403-psp-yaml-fix-v1-1-dacee0663903@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:12:34 -07:00

1430747 Commits