linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 12:31:52 -04:00

Author	SHA1	Message	Date
Anderson Nascimento	d666540d21	rxrpc: Fix key reference count leak from call->key When creating a client call in rxrpc_alloc_client_call(), the code obtains a reference to the key. This is never cleaned up and gets leaked when the call is destroyed. Fix this by freeing call->key in rxrpc_destroy_call(). Before the patch, it shows the key reference counter elevated: $ cat /proc/keys | grep afs@54321 1bffe9cd I--Q--i 8053480 4169w 3b010000 1000 1000 rxrpc afs@54321: ka $ After the patch, the invalidated key is removed when the code exits: $ cat /proc/keys | grep afs@54321 $ Fixes: `f3441d4125` ("rxrpc: Copy client call parameters into rxrpc_call earlier") Signed-off-by: Anderson Nascimento <anderson@allelesecurity.com> Co-developed-by: David Howells <dhowells@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-9-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:32 -07:00
Alok Tiwari	65b3ffe097	rxrpc: Fix rack timer warning to report unexpected mode rxrpc_rack_timer_expired() clears call->rack_timer_mode to OFF before the switch. The default case warning therefore always prints OFF and doesn't identify the unexpected timer mode. Log the saved mode value instead so the warning reports the actual unexpected rack timer mode. Fixes: `7c48266593` ("rxrpc: Implement RACK/TLP to deal with transmission stalls [RFC8985]") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-8-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:32 -07:00
Alok Tiwari	b33f5741bb	rxrpc: Fix use of wrong skb when comparing queued RESP challenge serial In rxrpc_post_response(), the code should be comparing the challenge serial number from the cached response before deciding to switch to a newer response, but looks at the newer packet private data instead, rendering the comparison always false. Fix this by switching to look at the older packet. Fix further[1] to substitute the new packet in place of the old one if newer and also to release whichever we don't use. Fixes: `5800b1cf3f` ("rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSE") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://sashiko.dev/#/patchset/20260319150150.4189381-1-dhowells%40redhat.com [1] Link: https://patch.msgid.link/20260408121252.2249051-7-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:32 -07:00
Oleh Konko	d179a868dd	rxrpc: Fix RxGK token loading to check bounds rxrpc_preparse_xdr_yfs_rxgk() reads the raw key length and ticket length from the XDR token as u32 values and passes each through round_up(x, 4) before using the rounded value for validation and allocation. When the raw length is >= 0xfffffffd, round_up() wraps to 0, so the bounds check and kzalloc both use 0 while the subsequent memcpy still copies the original ~4 GiB value, producing a heap buffer overflow reachable from an unprivileged add_key() call. Fix this by: (1) Rejecting raw key lengths above AFSTOKEN_GK_KEY_MAX and raw ticket lengths above AFSTOKEN_GK_TOKEN_MAX before rounding, consistent with the caps that the RxKAD path already enforces via AFSTOKEN_RK_TIX_MAX. (2) Sizing the flexible-array allocation from the validated raw key length via struct_size_t() instead of the rounded value. (3) Caching the raw lengths so that the later field assignments and memcpy calls do not re-read from the token, eliminating a class of TOCTOU re-parse. The control path (valid token with lengths within bounds) is unaffected. Fixes: `0ca100ff4d` ("rxrpc: Add YFS RxGK (GSSAPI) security class") Signed-off-by: Oleh Konko <security@1seal.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-6-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:32 -07:00
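The wraparound described in the RxGK bounds fix above can be reproduced in userspace. This is only a sketch, not the kernel code: `round_up_u32()` mimics the kernel's `round_up()` on a 32-bit value, while `EXAMPLE_LEN_MAX` and `validate_then_round()` are hypothetical names standing in for the real caps and parser.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap for illustration; the real limits are defined in the kernel. */
#define EXAMPLE_LEN_MAX 1024u

/* Userspace stand-in for round_up(x, align) on a u32; align must be a power of two. */
static uint32_t round_up_u32(uint32_t x, uint32_t align)
{
	assert(align != 0 && (align & (align - 1u)) == 0);
	return (uint32_t)(((x - 1u) | (align - 1u)) + 1u);
}

/*
 * Check the raw length against the cap first, then round.
 * Returns 0 and stores the padded length on success, -1 if rejected.
 */
static int validate_then_round(uint32_t raw_len, uint32_t *padded_len)
{
	if (raw_len > EXAMPLE_LEN_MAX)
		return -1;
	*padded_len = round_up_u32(raw_len, 4);
	return 0;
}
```

With a raw length of 0xfffffffd, `round_up_u32()` wraps to 0, so any check made on the rounded value passes trivially; checking the raw value first rejects it before rounding can hide the problem.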
David Howells	146d4ab94c	rxrpc: Fix call removal to use RCU safe deletion Fix rxrpc call removal from the rxnet->calls list to use list_del_rcu() rather than list_del_init() to prevent stuffing up reading /proc/net/rxrpc/calls from potentially getting into an infinite loop. This, however, means that list_empty() no longer works on an entry that's been deleted from the list, making it harder to detect prior deletion. Fix this by: Firstly, make rxrpc_destroy_all_calls() only dump the first ten calls that are unexpectedly still on the list. Limiting the number of steps means there's no need to call cond_resched() or to remove calls from the list here, thereby eliminating the need for rxrpc_put_call() to check for that. rxrpc_put_call() can then be fixed to unconditionally delete the call from the list as it is the only place that the deletion occurs. Fixes: `2baec2c3f8` ("rxrpc: Support network namespacing") Closes: https://sashiko.dev/#/patchset/20260319150150.4189381-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:32 -07:00
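Why `list_del_init()` can trap a lockless reader while `list_del_rcu()` does not can be shown with a small userspace model. The sketch below assumes a simplified `struct node` in place of `struct list_head`; `del_init()`, `del_keep_next()` and `steps_to_head()` are illustrative names, and no real RCU synchronisation is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list node, modelled on struct list_head. */
struct node {
	struct node *next;
	struct node *prev;
};

static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct node *entry, struct node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Like list_del_init(): unlink, then point the entry at itself. */
static void del_init(struct node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry;
	entry->prev = entry;
}

/* Like list_del_rcu(): unlink but keep ->next so a reader can move on. */
static void del_keep_next(struct node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->prev = NULL; /* the kernel stores a poison value here instead */
}

/*
 * Walk forward from a position a reader already holds until the list head
 * is reached. Returns the number of steps, or -1 after max_steps without
 * reaching the head (a stand-in for "would loop forever").
 */
static int steps_to_head(const struct node *pos, const struct node *head,
			 int max_steps)
{
	int steps = 0;

	assert(max_steps >= 0);
	while (pos != head) {
		if (steps++ >= max_steps)
			return -1;
		pos = pos->next;
	}
	return steps;
}
```

A reader standing on an entry removed with `del_init()` keeps following `next` back to the same entry and never reaches the head. The cost of `del_keep_next()` is that the entry no longer looks empty afterwards, which is the `list_empty()` limitation the commit message mentions.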
David Howells	6a59d84b4f	rxrpc: Fix anonymous key handling In rxrpc_new_client_call_for_sendmsg(), a key with no payload is meant to be substituted for a NULL key pointer, but the variable this is done with is subsequently not used. Fix this by using "key" rather than "rx->key" when filling in the connection parameters. Note that this only affects direct use of AF_RXRPC; the kAFS filesystem doesn't use sendmsg() directly and so bypasses the issue. Further, AF_RXRPC passes a NULL key in if no key is set, so using an anonymous key in that manner works. Since this hasn't been noticed to this point, it might be better just to remove the "key" variable and the code that sets it - and, arguably, rxrpc_init_client_call_security() would be a better place to handle it. Fixes: `19ffa01c9c` ("rxrpc: Use structs to hold connection params and protocol info") Closes: https://sashiko.dev/#/patchset/20260319150150.4189381-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:31 -07:00
David Howells	b555912b9b	rxrpc: Fix key parsing memleak In rxrpc_preparse_xdr_yfs_rxgk(), the memory attached to token->rxgk can be leaked in a few error paths after it's allocated. Fix this by freeing it in the "reject_token:" case. Fixes: `0ca100ff4d` ("rxrpc: Add YFS RxGK (GSSAPI) security class") Closes: https://sashiko.dev/#/patchset/20260319150150.4189381-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:31 -07:00
David Howells	bdbfead6d3	rxrpc: Fix key quota calculation for multitoken keys In the rxrpc key preparsing, every token extracted sets the proposed quota value, but for multitoken keys, this will overwrite the previous proposed quota, losing it. Fix this by adding to the proposed quota instead. Fixes: `8a7a3eb4dd` ("KEYS: RxRPC: Use key preparsing") Closes: https://sashiko.dev/#/patchset/20260319150150.4189381-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260408121252.2249051-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:44:31 -07:00
Felix Gu	c09ea768bd	net: mdio: realtek-rtl9300: use scoped device_for_each_child_node loop Switch to device_for_each_child_node_scoped() to auto-release fwnode references on early exit. Fixes: `24e31e4747` ("net: mdio: Add RTL9300 MDIO driver") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260405-rtl9300-v1-1-08e4499cf944@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:42:08 -07:00
Jakub Kicinski	f821664dde	Merge branch 'seg6-fix-dst_cache-sharing-in-seg6-lwtunnel' Andrea Mayer says: ==================== seg6: fix dst_cache sharing in seg6 lwtunnel The seg6 lwtunnel encap uses a single per-route dst_cache shared between seg6_input_core() and seg6_output_core(). These two paths can perform the post-encap SID lookup in different routing contexts (e.g., ip rules matching on the ingress interface, or VRF table separation). Whichever path runs first populates the cache, and the other reuses it blindly, bypassing its own lookup. Patch 1 fixes this by splitting the cache into cache_input and cache_output. Patch 2 adds a selftest that validates the isolation. ==================== Link: https://patch.msgid.link/20260404004405.4057-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 20:21:00 -07:00
Andrea Mayer	32dfd742f0	selftests: seg6: add test for dst_cache isolation in seg6 lwtunnel Add a selftest that verifies the dst_cache in seg6 lwtunnel is not shared between the input (forwarding) and output (locally generated) paths. The test creates three namespaces (ns_src, ns_router, ns_dst) connected in a line. An SRv6 encap route on ns_router encapsulates traffic destined to cafe::1 with SID fc00::100. The SID is reachable only for forwarded traffic (from ns_src) via an ip rule matching the ingress interface (iif veth-r0 lookup 100), and blackholed in the main table. The test verifies that: 1. A packet generated locally on ns_router does not reach ns_dst with an empty cache, since the SID is blackholed; 2. A forwarded packet from ns_src populates the input cache from table 100 and reaches ns_dst; 3. A packet generated locally on ns_router still does not reach ns_dst after the input cache is populated, confirming the output path does not reuse the input cache entry. Both the forwarded and local packets are pinned to the same CPU with taskset, since dst_cache is per-cpu. Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260404004405.4057-3-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 20:20:56 -07:00
Andrea Mayer	c3812651b5	seg6: separate dst_cache for input and output paths in seg6 lwtunnel The seg6 lwtunnel uses a single dst_cache per encap route, shared between seg6_input_core() and seg6_output_core(). These two paths can perform the post-encap SID lookup in different routing contexts (e.g., ip rules matching on the ingress interface, or VRF table separation). Whichever path runs first populates the cache, and the other reuses it blindly, bypassing its own lookup. Fix this by splitting the cache into cache_input and cache_output, so each path maintains its own cached dst independently. Fixes: `6c8702c60b` ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Cc: stable@vger.kernel.org Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260404004405.4057-2-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 20:20:56 -07:00
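The effect of sharing one cache between two paths that resolve routes differently can be modelled in a few lines. This is a toy sketch under stated assumptions, not the seg6 code: the route ids, `full_lookup()`, and the `shared_state` and `split_state` structs are hypothetical; only the `cache_input` and `cache_output` names come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

enum path { PATH_INPUT, PATH_OUTPUT };

/* Hypothetical route ids standing in for dst entries. */
enum { ROUTE_NONE = 0, ROUTE_VIA_TABLE_100 = 1, ROUTE_BLACKHOLE = 2 };

struct cache {
	int route; /* ROUTE_NONE means the cache is empty */
};

/* Toy policy lookup: the result depends on which path is asking. */
static int full_lookup(enum path p)
{
	return p == PATH_INPUT ? ROUTE_VIA_TABLE_100 : ROUTE_BLACKHOLE;
}

/* Use the cached route if present, otherwise do the full lookup and cache it. */
static int cached_lookup(struct cache *c, enum path p)
{
	assert(c != NULL);
	if (c->route == ROUTE_NONE)
		c->route = full_lookup(p);
	return c->route;
}

/* Before the fix: one cache for both paths. */
struct shared_state {
	struct cache cache;
};

/* After the fix: one cache per path. */
struct split_state {
	struct cache cache_input;
	struct cache cache_output;
};

static int shared_route(struct shared_state *s, enum path p)
{
	return cached_lookup(&s->cache, p);
}

static int split_route(struct split_state *s, enum path p)
{
	return cached_lookup(p == PATH_INPUT ? &s->cache_input
					     : &s->cache_output, p);
}
```

In the shared model, whichever path runs first decides the answer for both. In the split model, the output path still resolves to its own route after the input cache has been populated.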
Daniel Golle	efaa71faf2	selftests: net: bridge_vlan_mcast: wait for h1 before querier check The querier-interval test adds h1 (currently a slave of the VRF created by simple_if_init) to a temporary bridge br1 acting as an outside IGMP querier. The kernel VRF driver (drivers/net/vrf.c) calls cycle_netdev() on every slave add and remove, toggling the interface admin-down then up. Phylink takes the PHY down during the admin-down half of that cycle. Since h1 and swp1 are cable-connected, swp1 also loses its link and may need several seconds to re-negotiate. Use setup_wait_dev $h1 0 which waits for h1 to return to UP state, so the test can rely on the link being back up at this point. Fixes: `4d8610ee8b` ("selftests: net: bridge: add vlan mcast_querier_interval tests") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://patch.msgid.link/c830f130860fd2efae08bfb9e5b25fd028e58ce5.1775424423.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 20:16:16 -07:00
Jakub Kicinski	944b3b734c	net: avoid nul-deref trying to bind mp to incapable device Sashiko points out that we use qops in __net_mp_open_rxq() but never validate they are null. This was introduced when check was moved from netdev_rx_queue_restart(). Look at ops directly instead of the locking config. qops imply netdev_need_ops_lock(). We used netdev_need_ops_lock() initially to signify that the real_num_rx_queues check below is safe without rtnl_lock, but I'm not sure if this is actually clear to most people, anyway. Fixes: `da7772a2b4` ("net: move mp->rx_page_size validation to __net_mp_open_rxq()") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20260404001938.2425670-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 18:57:56 -07:00
Johan Alvarado	f2777d5cb5	net: stmmac: dwmac-motorcomm: fix eFUSE MAC address read failure This patch fixes an issue where reading the MAC address from the eFUSE fails due to a race condition. The root cause was identified by comparing the driver's behavior with a custom U-Boot port. In U-Boot, the MAC address was read successfully every time because the driver was loaded later in the boot process, giving the hardware ample time to initialize. In Linux, reading the eFUSE immediately returns all zeros, resulting in a fallback to a random MAC address. Hardware cold-boot testing revealed that the eFUSE controller requires a short settling time to load its internal data. Adding a 2000-5000us delay after the reset ensures the hardware is fully ready, allowing the native MAC address to be read consistently. Fixes: `02ff155ea2` ("net: stmmac: Add glue driver for Motorcomm YT6801 ethernet controller") Reported-by: Georg Gottleuber <ggo@tuxedocomputers.com> Closes: https://lore.kernel.org/24cfefff-1233-4745-8c47-812b502d5d19@tuxedocomputers.com Signed-off-by: Johan Alvarado <contact@c127.dev> Reviewed-by: Yao Zi <me@ziyao.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/fc5992a4-9532-49c3-8ec1-c2f8c5b84ca1@smtp-relay.sendinblue.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 18:21:00 -07:00
John Pavlick	95aca8602e	net: sfp: add quirks for Hisense and HSGQ GPON ONT SFP modules Several GPON ONT SFP sticks based on Realtek RTL960x report 1000BASE-LX at 1300MBd in their EEPROM but can operate at 2500base-X. On hosts capable of 2500base-X (e.g. Banana Pi R3 / MT7986), the kernel negotiates only 1G because it trusts the incorrect EEPROM data. Add quirks for: - Hisense-Leox LXT-010S-H - Hisense ZNID-GPON-2311NA - HSGQ HSGQ-XPON-Stick Each quirk advertises 2500base-X and ignores TX_FAULT during the module's ~40s Linux boot time. Tested on Banana Pi R3 (MT7986) with OpenWrt 25.12.1, confirmed 2.5Gbps link and full throughput with flow offloading. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Suggested-by: Marcin Nita <marcin.nita@leolabs.pl> Signed-off-by: John Pavlick <jspavlick@posteo.net> Link: https://patch.msgid.link/20260406132321.72563-1-jspavlick@posteo.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 18:13:51 -07:00
Muhammad Alifa Ramdhan	a9b8b18364	net/tls: fix use-after-free in -EBUSY error path of tls_do_encryption The -EBUSY handling in tls_do_encryption(), introduced by commit `8590541473` ("net: tls: handle backlogging of crypto requests"), has a use-after-free due to double cleanup of encrypt_pending and the scatterlist entry. When crypto_aead_encrypt() returns -EBUSY, the request is enqueued to the cryptd backlog and the async callback tls_encrypt_done() will be invoked upon completion. That callback unconditionally restores the scatterlist entry (sge->offset, sge->length) and decrements ctx->encrypt_pending. However, if tls_encrypt_async_wait() returns an error, the synchronous error path in tls_do_encryption() performs the same cleanup again, double-decrementing encrypt_pending and double-restoring the scatterlist. The double-decrement corrupts the encrypt_pending sentinel (initialized to 1), making tls_encrypt_async_wait() permanently skip the wait for pending async callbacks. A subsequent sendmsg can then free the tls_rec via bpf_exec_tx_verdict() while a cryptd callback is still pending, resulting in a use-after-free when the callback fires on the freed record. Fix this by skipping the synchronous cleanup when the -EBUSY async wait returns an error, since the callback has already handled encrypt_pending and sge restoration. Fixes: `8590541473` ("net: tls: handle backlogging of crypto requests") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Alifa Ramdhan <ramdhan@starlabs.sg> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260403013617.2838875-1-ramdhan@starlabs.sg Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 14:53:42 +02:00
Michael Guralnik	a9d4f4f6e6	net/mlx5: Update the list of the PCI supported devices Add the upcoming ConnectX-10 NVLink-C2C device ID to the table of supported PCI device IDs. Cc: stable@vger.kernel.org Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260403091756.139583-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:17:42 -07:00
Jiayuan Chen	0f42e3f4fe	net: skb: fix cross-cache free of KFENCE-allocated skb head SKB_SMALL_HEAD_CACHE_SIZE is intentionally set to a non-power-of-2 value (e.g. 704 on x86_64) to avoid collisions with generic kmalloc bucket sizes. This ensures that skb_kfree_head() can reliably use skb_end_offset to distinguish skb heads allocated from skb_small_head_cache vs. generic kmalloc caches. However, when KFENCE is enabled, kfence_ksize() returns the exact requested allocation size instead of the slab bucket size. If a caller (e.g. bpf_test_init) allocates skb head data via kzalloc() and the requested size happens to equal SKB_SMALL_HEAD_CACHE_SIZE, then slab_build_skb() -> ksize() returns that exact value. After subtracting skb_shared_info overhead, skb_end_offset ends up matching SKB_SMALL_HEAD_HEADROOM, causing skb_kfree_head() to incorrectly free the object to skb_small_head_cache instead of back to the original kmalloc cache, resulting in a slab cross-cache free: kmem_cache_free(skbuff_small_head): Wrong slab cache. Expected skbuff_small_head but got kmalloc-1k Fix this by always calling kfree(head) in skb_kfree_head(). This keeps the free path generic and avoids allocator-specific misclassification for KFENCE objects. Fixes: `bf9f1baa27` ("net: add dedicated kmem_cache for typical/small skb->head") Reported-by: Antonius <antonius@bluedragonsec.com> Closes: https://lore.kernel.org/netdev/CAK8a0jxC5L5N7hq-DT2_NhUyjBxrPocoiDazzsBk4TGgT1r4-A@mail.gmail.com/ Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260403014517.142550-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:46:53 -07:00
Stefano Garzarella	24ad7ff668	vsock/test: fix send_buf()/recv_buf() EINTR handling When send() or recv() returns -1 with errno == EINTR, the code skips the break but still adds the return value to nwritten/nread, making it decrease by 1. This leads to wrong buffer offsets and wrong bytes count. Fix it by explicitly continuing the loop on EINTR, so the return value is only added when it is positive. Fixes: `a8ed71a27e` ("vsock/test: add recv_buf() utility function") Fixes: `12329bd51f` ("vsock/test: add send_buf() utility function") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20260403093251.30662-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:46:03 -07:00
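The EINTR handling described above can be sketched as a generic send loop. This is not the vsock test utility itself: `send_all()` takes a callback so the retry logic can be exercised without a socket, and `send_fn`, `flaky_sink` and `flaky_send()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Callback with send()-like semantics: bytes written, or -1 with errno set. */
typedef ssize_t (*send_fn)(void *ctx, const void *buf, size_t len);

/*
 * Send until len bytes are written. On EINTR, retry without touching the
 * byte count; on any other error or a zero return, stop.
 * Returns the number of bytes actually sent.
 */
static size_t send_all(send_fn fn, void *ctx, const void *buf, size_t len)
{
	size_t nwritten = 0;

	while (nwritten < len) {
		ssize_t ret = fn(ctx, (const char *)buf + nwritten,
				 len - nwritten);

		if (ret < 0 && errno == EINTR)
			continue; /* interrupted: -1 must not be added */
		if (ret <= 0)
			break;
		nwritten += (size_t)ret;
	}
	return nwritten;
}

/* Hypothetical test double: fails once with EINTR, then takes 2 bytes per call. */
struct flaky_sink {
	char data[64];
	size_t len;
	int calls;
};

static ssize_t flaky_send(void *ctx, const void *buf, size_t len)
{
	struct flaky_sink *sink = ctx;

	if (sink->calls++ == 0) {
		errno = EINTR;
		return -1;
	}
	if (len > 2)
		len = 2;
	assert(sink->len + len <= sizeof(sink->data));
	memcpy(sink->data + sink->len, buf, len);
	sink->len += len;
	return (ssize_t)len;
}
```

The explicit `continue` is the point of the fix: without it, the -1 returned on EINTR is added to the running count, which shifts the buffer offset back by one byte.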
Jakub Kicinski	270c0637b9	Merge branch 'xsk-tailroom-reservation-and-mtu-validation' Maciej Fijalkowski says: ==================== xsk: tailroom reservation and MTU validation here we fix a long-standing issue regarding multi-buffer scenario in ZC mode - we have not been providing space at the end of the buffer where multi-buffer XDP works on skb_shared_info. This has been brought to our attention via [0]. Unaligned mode does not get any specific treatment, it is user's responsibility to properly handle XSK addresses in queues. With adjustments included here in this set against xskxceiver I have been able to pass the full test suite on ice. [0]: https://community.intel.com/t5/Ethernet-Products/X710-XDP-Packet-Corruption-Issue-DRV-MODE-Zero-Copy-Multi-Buffer/m-p/1724208 ==================== Link: https://patch.msgid.link/20260402154958.562179-1-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:54 -07:00
Maciej Fijalkowski	62838e363e	selftests: bpf: adjust rx_dropped xskxceiver's test to respect tailroom Since we have changed how big user defined headroom in umem can be, change the logic in testapp_stats_rx_dropped() so we pass updated headroom validation in xdp_umem_reg() and still drop half of frames. Test works on non-mbuf setup so __xsk_pool_get_rx_frame_size() that is called on xsk_rcv_check() will not account skb_shared_info size. Taking the tailroom size into account in test being fixed is needed as xdp_umem_reg() defaults to respect it. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-9-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:52 -07:00
Maciej Fijalkowski	16546954e1	selftests: bpf: have a separate variable for drop test Currently two different XDP programs share a static variable for different purposes (picking where to redirect on shared umem test & whether to drop a packet). This can be a problem when running full test suite - idx can be written by shared umem test and this value can cause a false behavior within XDP drop half test. Introduce a dedicated variable for drop half test so that these two don't step on each other's toes. There is no real need for using __sync_fetch_and_add here as XSK tests are executed on single CPU. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-8-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:52 -07:00
Maciej Fijalkowski	3197c51ce2	selftests: bpf: fix pkt grow tests Skip tail adjust tests in xskxceiver for SKB mode as it is not very friendly for it. multi-buffer case does not work as xdp_rxq_info that is registered for generic XDP does not report ::frag_size. The non-mbuf path copies packet via skb_pp_cow_data() which only accounts for headroom, leaving us with no tailroom and causing underlying XDP prog to drop packets therefore. For multi-buffer test on other modes, change the amount of bytes we use for growth, assume worst-case scenario and take care of headroom and tailroom. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-7-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
Maciej Fijalkowski	c5866a6be4	selftests: bpf: introduce a common routine for reading procfs Parametrize current way of getting MAX_SKB_FRAGS value from {sys,proc}fs so that it can be re-used to get cache line size of system's CPU. All that just to mimic and compute the size of the kernel's struct skb_shared_info, which xsk and the test suite interpret as tailroom. Introduce two variables to ifobject struct that will carry count of skb frags and tailroom size. Do the reading and computing once, at the beginning of test suite execution in xskxceiver, but for test_progs such way is not possible as in this environment each test sets up and tears down ifobject structs. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-6-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
Maciej Fijalkowski	36ee60b569	xsk: validate MTU against usable frame size on bind AF_XDP bind currently accepts zero-copy pool configurations without verifying that the device MTU fits into the usable frame space provided by the UMEM chunk. This becomes a problem since we started to respect tailroom which is subtracted from chunk_size (along with headroom). 2k chunk size might not provide enough space for standard 1500 MTU, so let us catch such settings at bind time. Furthermore, validate whether underlying HW will be able to satisfy configured MTU wrt XSK's frame size multiplied by supported Rx buffer chain length (that is exposed via net_device::xdp_zc_max_segs). Fixes: `24ea50127e` ("xsk: support mbuf on ZC RX") Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-5-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
Maciej Fijalkowski	93e84fe45b	xsk: fix XDP_UMEM_SG_FLAG issues Currently xp_assign_dev_shared() is missing XDP_USE_SG being propagated to flags so set it in order to preserve mtu check that is supposed to be done only when no multi-buffer setup is in picture. Also, this flag has the same value as XDP_UMEM_TX_SW_CSUM so we could get unexpected SG setups for software Tx checksums. Since csum flag is UAPI, modify value of XDP_UMEM_SG_FLAG. Fixes: `d609f3d228` ("xsk: add multi-buffer support for sockets sharing umem") Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-4-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
Maciej Fijalkowski	1ee1605138	xsk: respect tailroom for ZC setups Multi-buffer XDP stores information about frags in skb_shared_info that sits at the tailroom of a packet. The storage space is reserved via xdp_data_hard_end(): ((xdp)->data_hard_start + (xdp)->frame_sz - \ SKB_DATA_ALIGN(sizeof(struct skb_shared_info))) and then we refer to it via macro below: static inline struct skb_shared_info * xdp_get_shared_info_from_buff(const struct xdp_buff *xdp) { return (struct skb_shared_info *)xdp_data_hard_end(xdp); } Currently we do not respect this tailroom space in multi-buffer AF_XDP ZC scenario. To address this, introduce xsk_pool_get_tailroom() and use it within xsk_pool_get_rx_frame_size() which is used in ZC drivers to configure length of HW Rx buffer. Typically drivers on Rx Hw buffers side work on 128 byte alignment so let us align the value returned by xsk_pool_get_rx_frame_size() in order to avoid addressing this on driver's side. This addresses the fact that idpf uses mentioned function before pool->dev being set so we were at risk that after subtracting tailroom we would not provide 128-byte aligned value to HW. Since xsk_pool_get_rx_frame_size() is actively used in xsk_rcv_check() and __xsk_rcv(), add a variant of this routine that will not include 128 byte alignment and therefore old behavior is preserved. Reviewed-by: Björn Töpel <bjorn@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Fixes: `24ea50127e` ("xsk: support mbuf on ZC RX") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-3-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
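The frame-size arithmetic described above can be worked through in userspace. This sketch is not the kernel implementation: the `ex_` functions are hypothetical, the 256-byte headroom and 320-byte tailroom are assumed example values (the real tailroom is `SKB_DATA_ALIGN(sizeof(struct skb_shared_info))` and depends on the kernel configuration), and rounding down to 128 bytes is an assumption about how the alignment is applied.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed example values; the kernel derives the real ones itself. */
#define EX_XDP_PACKET_HEADROOM 256u /* fixed XDP headroom */
#define EX_TAILROOM            320u /* stand-in for the skb_shared_info area */
#define EX_HW_ALIGN            128u /* Rx buffer alignment from the message */

/* Round x down to a multiple of align; align must be a power of two. */
static uint32_t ex_align_down(uint32_t x, uint32_t align)
{
	assert(align != 0 && (align & (align - 1u)) == 0);
	return x & ~(align - 1u);
}

/*
 * Usable Rx bytes in one chunk, without hardware alignment.
 * Returns 0 if the reserved areas do not fit into the chunk.
 */
static uint32_t ex_rx_frame_size(uint32_t chunk_size, uint32_t user_headroom,
				 int reserve_tailroom)
{
	uint32_t reserved;

	if (user_headroom > chunk_size)
		return 0;
	reserved = EX_XDP_PACKET_HEADROOM + user_headroom;
	if (reserve_tailroom)
		reserved += EX_TAILROOM;
	if (reserved >= chunk_size)
		return 0;
	return chunk_size - reserved;
}

/* Same value rounded down to the hardware alignment, for Rx buffer setup. */
static uint32_t ex_rx_frame_size_hw(uint32_t chunk_size, uint32_t user_headroom,
				    int reserve_tailroom)
{
	return ex_align_down(ex_rx_frame_size(chunk_size, user_headroom,
					      reserve_tailroom), EX_HW_ALIGN);
}
```

Under these assumed values, a 2048-byte chunk with no user headroom leaves 1472 usable bytes once tailroom is reserved, or 1408 after alignment. That is less than a 1500-byte MTU, which matches the concern about 2k chunks raised in the MTU validation commit.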
Maciej Fijalkowski	a315e022a7	xsk: tighten UMEM headroom validation to account for tailroom and min frame The current headroom validation in xdp_umem_reg() could leave us with insufficient space dedicated to even receive minimum-sized ethernet frame. Furthermore if multi-buffer would come to play then skb_shared_info stored at the end of XSK frame would be corrupted. HW typically works with 128-aligned sizes so let us provide this value as bare minimum. Multi-buffer setting is known later in the configuration process so besides accounting for 128 bytes, let us also take care of tailroom space upfront. Reviewed-by: Björn Töpel <bjorn@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Fixes: `99e3a236dd` ("xsk: Add missing check on user supplied headroom size") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-2-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
Jakub Kicinski	1caa871bb0	Merge branch 'net-stmmac-fix-tegra234-mgbe-clock' Jon Hunter says: ==================== net: stmmac: Fix Tegra234 MGBE clock The name of the PTP ref clock for the Tegra234 MGBE ethernet controller does not match the generic name in the stmmac platform driver. Despite this basic ethernet is functional on the Tegra234 platforms that use this driver and as far as I know, we have not tested PTP support with this driver. Hence, the risk of breaking any functionality is low. The previous attempt to fix this in the stmmac platform driver, by supporting the Tegra234 PTP clock name, was rejected [0]. The preference from the netdev maintainers is to fix this in the DT binding for Tegra234. This series fixes this by correcting the device-tree binding to align with the generic name for the PTP clock. I understand that this is breaking the ABI for this device, which we should never do, but this is a last resort for getting this fixed. I am open to any better ideas to fix this. Please note that we still maintain backward compatibility in the driver to allow older device-trees to work, but we don't advertise this via the binding, because I did not see any value in doing so. ==================== Link: https://patch.msgid.link/20260401102941.17466-1-jonathanh@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 16:02:31 -07:00
Jon Hunter	fb22b1fc5b	dt-bindings: net: Fix Tegra234 MGBE PTP clock The PTP clock for the Tegra234 MGBE device is incorrectly named 'ptp-ref' and should be 'ptp_ref'. This is causing the following warning to be observed on Tegra234 platforms that use this device: ERR KERN tegra-mgbe 6800000.ethernet eth0: Invalid PTP clock rate WARNING KERN tegra-mgbe 6800000.ethernet eth0: PTP init failed Although this constitutes an ABI breakage in the binding for this device, PTP support has clearly never worked and so fix this now so we can correct the device-tree for this device. Note that the MGBE driver still supports the legacy 'ptp-ref' clock name and so older/existing device-trees will still work, but given that this is not the correct name, there is no point to advertise this in the binding. Fixes: `189c2e5c76` ("dt-bindings: net: Add Tegra234 MGBE") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20260401102941.17466-3-jonathanh@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 16:02:30 -07:00
Jon Hunter	1345e9f4e3	net: stmmac: Fix PTP ref clock for Tegra234 Since commit `030ce919e1` ("net: stmmac: make sure that ptp_rate is not 0 before configuring timestamping") was added the following error is observed on Tegra234: ERR KERN tegra-mgbe 6800000.ethernet eth0: Invalid PTP clock rate WARNING KERN tegra-mgbe 6800000.ethernet eth0: PTP init failed It turns out that the Tegra234 device-tree binding defines the PTP ref clock name as 'ptp-ref' and not 'ptp_ref' and the above commit now exposes this and that the PTP clock is not configured correctly. In order to update device-tree to use the correct 'ptp_ref' name, update the Tegra MGBE driver to use 'ptp_ref' by default and fallback to using 'ptp-ref' if this clock name is present. Fixes: `d8ca113724` ("net: stmmac: tegra: Add MGBE support") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260401102941.17466-2-jonathanh@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 16:02:21 -07:00
Pengpeng Hou	5c14a19d5b	nfc: s3fwrn5: allocate rx skb before consuming bytes s3fwrn82_uart_read() reports the number of accepted bytes to the serdev core. The current code consumes bytes into recv_skb and may already deliver a complete frame before allocating a fresh receive buffer. If that alloc_skb() fails, the callback returns 0 even though it has already consumed bytes, and it leaves recv_skb as NULL for the next receive callback. That breaks the receive_buf() accounting contract and can also lead to a NULL dereference on the next skb_put_u8(). Allocate the receive skb lazily before consuming the next byte instead. If allocation fails, return the number of bytes already accepted. Fixes: `3f52c2cb7e` ("nfc: s3fwrn5: Support a UART interface") Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260402042148.65236-1-pengpeng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:57:46 -07:00
Chris J Arges	77facb3522	net: increase IP_TUNNEL_RECURSION_LIMIT to 5 In configurations with multiple tunnel layers and MPLS lwtunnel routing, a single tunnel hop can increment the counter beyond this limit. This causes packets to be dropped with the "Dead loop on virtual device" message even when a routing loop doesn't exist. Increase IP_TUNNEL_RECURSION_LIMIT from 4 to 5 to handle this use-case. Fixes: `6f1a9140ec` ("net: add xmit recursion limit to tunnel xmit functions") Link: https://lore.kernel.org/netdev/88deb91b-ef1b-403c-8eeb-0f971f27e34f@redhat.com/ Signed-off-by: Chris J Arges <carges@cloudflare.com> Link: https://patch.msgid.link/20260402222401.3408368-1-carges@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:52:10 -07:00
Yiqi Sun	fde29fd934	ipv4: icmp: fix null-ptr-deref in icmp_build_probe() ipv6_stub->ipv6_dev_find() may return ERR_PTR(-EAFNOSUPPORT) when the IPv6 stack is not active (CONFIG_IPV6=m and not loaded), and passing this error pointer to dev_hold() will cause a kernel crash with null-ptr-deref. Instead, silently discard the request. RFC 8335 does not appear to define a specific response for the case where an IPv6 interface identifier is syntactically valid but the implementation cannot perform the lookup at runtime, and silently dropping the request may be safer than misreporting "No Such Interface". Fixes: `d329ea5bd8` ("icmp: add response to RFC 8335 PROBE messages") Signed-off-by: Yiqi Sun <sunyiqixm@gmail.com> Link: https://patch.msgid.link/20260402070419.2291578-1-sunyiqixm@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:46:17 -07:00
Fernando Fernandez Mancera	14cf0cd353	ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop() When querying a nexthop object via RTM_GETNEXTHOP, the kernel currently allocates a fixed-size skb using NLMSG_GOODSIZE. While sufficient for single nexthops and small Equal-Cost Multi-Path groups, this fixed allocation fails for large nexthop groups like 512 nexthops. This results in the following warning splat: WARNING: net/ipv4/nexthop.c:3395 at rtm_get_nexthop+0x176/0x1c0, CPU#20: rep/4608 [...] RIP: 0010:rtm_get_nexthop (net/ipv4/nexthop.c:3395) [...] Call Trace: <TASK> rtnetlink_rcv_msg (net/core/rtnetlink.c:6989) netlink_rcv_skb (net/netlink/af_netlink.c:2550) netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344) netlink_sendmsg (net/netlink/af_netlink.c:1894) ____sys_sendmsg (net/socket.c:721 net/socket.c:736 net/socket.c:2585) ___sys_sendmsg (net/socket.c:2641) __sys_sendmsg (net/socket.c:2671) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) </TASK> Fix this by allocating the size dynamically using nh_nlmsg_size() and using nlmsg_new(), this is consistent with nexthop_notify() behavior. In addition, adjust nh_nlmsg_size_grp() so it calculates the size needed based on flags passed. While at it, also add the size of NHA_FDB for nexthop group size calculation as it was missing too. 
This cannot be reproduced via iproute2 as the group size is currently limited and the command fails as follows: addattr_l ERROR: message exceeded bound of 1048 Fixes: `430a049190` ("nexthop: Add support for nexthop groups") Reported-by: Yiming Qian <yimingqian591@gmail.com> Closes: https://lore.kernel.org/netdev/CAL_bE8Li2h4KO+AQFXW4S6Yb_u5X4oSKnkywW+LPFjuErhqELA@mail.gmail.com/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260402072613.25262-2-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:34:27 -07:00
Fernando Fernandez Mancera	06aaf04ca8	ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump Currently NHA_HW_STATS_ENABLE is included twice every time a dump of nexthop group is performed with NHA_OP_FLAG_DUMP_STATS. As all the stats querying were moved to nla_put_nh_group_stats(), leave only that instance of the attribute querying. Fixes: `5072ae00ae` ("net: nexthop: Expose nexthop group HW stats to user space") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260402072613.25262-1-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:34:27 -07:00
Pengpeng Hou	b76254c55d	net: qualcomm: qca_uart: report the consumed byte on RX skb allocation failure qca_tty_receive() consumes each input byte before checking whether a completed frame needs a fresh receive skb. When the current byte completes a frame, the driver delivers that frame and then allocates a new skb for the next one. If that allocation fails, the current code returns i even though data[i] has already been consumed and may already have completed the delivered frame. Since serdev interprets the return value as the number of accepted bytes, this under-reports progress by one byte and can replay the final byte of the completed frame into a fresh parser state on the next call. Return i + 1 in that failure path so the accepted-byte count matches the actual receive-state progress. Fixes: `dfc768fbe6` ("net: qualcomm: add QCA7000 UART driver") Cc: stable@vger.kernel.org Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Reviewed-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260402071207.4036-1-pengpeng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:32:56 -07:00
Oleh Konko	48a5fe3877	tipc: fix bc_ackers underflow on duplicate GRP_ACK_MSG The GRP_ACK_MSG handler in tipc_group_proto_rcv() currently decrements bc_ackers on every inbound group ACK, even when the same member has already acknowledged the current broadcast round. Because bc_ackers is a u16, a duplicate ACK received after the last legitimate ACK wraps the counter to 65535. Once wrapped, tipc_group_bc_cong() keeps reporting congestion and later group broadcasts on the affected socket stay blocked until the group is recreated. Fix this by ignoring duplicate or stale ACKs before touching bc_acked or bc_ackers. This makes repeated GRP_ACK_MSG handling idempotent and prevents the underflow path. Fixes: `2f487712b8` ("tipc: guarantee that group broadcast doesn't bypass group unicast") Cc: stable@vger.kernel.org Signed-off-by: Oleh Konko <security@1seal.org> Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/41a4833f368641218e444fdcff822039.security@1seal.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:31:17 -07:00
Nikolaos Gkarlis	7b735ef812	rtnetlink: add missing netlink_ns_capable() check for peer netns rtnl_newlink() lacks a CAP_NET_ADMIN capability check on the peer network namespace when creating paired devices (veth, vxcan, netkit). This allows an unprivileged user with a user namespace to create interfaces in arbitrary network namespaces, including init_net. Add a netlink_ns_capable() check for CAP_NET_ADMIN in the peer namespace before allowing device creation to proceed. Fixes: `81adee47df` ("net: Support specifying the network namespace upon device creation.") Signed-off-by: Nikolaos Gkarlis <nickgarlis@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260402181432.4126920-1-nickgarlis@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 15:07:18 -07:00
Zijing Yin	1979645e18	bridge: guard local VLAN-0 FDB helpers against NULL vlan group When CONFIG_BRIDGE_VLAN_FILTERING is not set, br_vlan_group() and nbp_vlan_group() return NULL (br_private.h stub definitions). The BR_BOOLOPT_FDB_LOCAL_VLAN_0 toggle code is compiled unconditionally and reaches br_fdb_delete_locals_per_vlan_port() and br_fdb_insert_locals_per_vlan_port(), where the NULL vlan group pointer is dereferenced via list_for_each_entry(v, &vg->vlan_list, vlist). The observed crash is in the delete path, triggered when creating a bridge with IFLA_BR_MULTI_BOOLOPT containing BR_BOOLOPT_FDB_LOCAL_VLAN_0 via RTM_NEWLINK. The insert helper has the same bug pattern. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000056: 0000 [#1] KASAN NOPTI KASAN: null-ptr-deref in range [0x00000000000002b0-0x00000000000002b7] RIP: 0010:br_fdb_delete_locals_per_vlan+0x2b9/0x310 Call Trace: br_fdb_toggle_local_vlan_0+0x452/0x4c0 br_toggle_fdb_local_vlan_0+0x31/0x80 net/bridge/br.c:276 br_boolopt_toggle net/bridge/br.c:313 br_boolopt_multi_toggle net/bridge/br.c:364 br_changelink net/bridge/br_netlink.c:1542 br_dev_newlink net/bridge/br_netlink.c:1575 Add NULL checks for the vlan group pointer in both helpers, returning early when there are no VLANs to iterate. This matches the existing pattern used by other bridge FDB functions such as br_fdb_add() and br_fdb_delete(). Fixes: `21446c06b4` ("net: bridge: Introduce UAPI for BR_BOOLOPT_FDB_LOCAL_VLAN_0") Signed-off-by: Zijing Yin <yzjaurora@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402140153.3925663-1-yzjaurora@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 14:45:51 -07:00
Eric Dumazet	4e65a8b8da	ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data() We need to check __in6_dev_get() for possible NULL value, as suggested by Yiming Qian. Also add skb_dst_dev_rcu() instead of skb_dst_dev(), and two missing READ_ONCE(). Note that @dev can't be NULL. Fixes: `9ee11f0fff` ("ipv6: ioam: Data plane support for Pre-allocated Trace") Reported-by: Yiming Qian <yimingqian591@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260402101732.1188059-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 14:44:43 -07:00
Lorenzo Bianconi	285fa6b1e0	net: airoha: Fix memory leak in airoha_qdma_rx_process() If an error occurs on the subsequent buffers belonging to the non-linear part of the skb (e.g. due to an error in the payload length reported by the NIC or if we consumed all the available fragments for the skb), the page_pool fragment will not be linked to the skb so it will not return to the pool in the airoha_qdma_rx_process() error path. Fix the memory leak partially reverting commit 'd6d2b0e1538d ("net: airoha: Fix page recycling in airoha_qdma_rx_process()")' and always running page_pool_put_full_page routine in the airoha_qdma_rx_process() error path. Fixes: `d6d2b0e153` ("net: airoha: Fix page recycling in airoha_qdma_rx_process()") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260402-airoha_qdma_rx_process-mem-leak-fix-v1-1-b5706f402d3c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 14:42:51 -07:00
Eric Dumazet	b120e4432f	net: lapbether: handle NETDEV_PRE_TYPE_CHANGE lapbeth_data_transmit() expects the underlying device type to be ARPHRD_ETHER. Returning NOTIFY_BAD from lapbeth_device_event() makes sure bonding driver can not break this expectation. Fixes: `872254dd6b` ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Reported-by: syzbot+d8c285748fa7292580a9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/69cd22a1.050a0220.70c3a.0002.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260402103519.1201565-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 14:40:36 -07:00
Arnd Bergmann	e16a0d3677	net: fec: make FIXED_PHY dependency unconditional When CONFIG_FIXED_PHY is in a loadable module, the fec driver cannot be built-in any more: x86_64-linux-ld: vmlinux.o: in function `fec_enet_mii_probe': fec_main.c:(.text+0xc4f367): undefined reference to `fixed_phy_unregister' x86_64-linux-ld: vmlinux.o: in function `fec_enet_close': fec_main.c:(.text+0xc59591): undefined reference to `fixed_phy_unregister' x86_64-linux-ld: vmlinux.o: in function `fec_enet_mii_probe.cold': Select the fixed phy support on all targets to make this build correctly, not just on coldfire. Note that essentially the stub helpers in include/linux/phy_fixed.h cannot be used correctly because of this build time dependency, and we could just remove them to hit the build failure more often when a driver uses them without the 'select FIXED_PHY'. Fixes: `dc86b621e1` ("net: fec: register a fixed phy using fixed_phy_register_100fd if needed") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260402141048.2713445-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 14:39:37 -07:00
Ruide Cao	c842743d07	net: sched: act_csum: validate nested VLAN headers tcf_csum_act() walks nested VLAN headers directly from skb->data when an skb still carries in-payload VLAN tags. The current code reads vlan->h_vlan_encapsulated_proto and then pulls VLAN_HLEN bytes without first ensuring that the full VLAN header is present in the linear area. If only part of an inner VLAN header is linearized, accessing h_vlan_encapsulated_proto reads past the linear area, and the following skb_pull(VLAN_HLEN) may violate skb invariants. Fix this by requiring pskb_may_pull(skb, VLAN_HLEN) before accessing and pulling each nested VLAN header. If the header still is not fully available, drop the packet through the existing error path. Fixes: `2ecba2d1e4` ("net: sched: act_csum: Fix csum calc for tagged packets") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Tested-by: Ren Wei <enjou1224z@gmail.com> Signed-off-by: Ruide Cao <caoruide123@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/22df2fcb49f410203eafa5d97963dd36089f4ecf.1774892775.git.caoruide123@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-03 14:34:56 -07:00
Tyllis Xu	51f4e090b9	net: stmmac: fix integer underflow in chain mode The jumbo_frm() chain-mode implementation unconditionally computes len = nopaged_len - bmax; where nopaged_len = skb_headlen(skb) (linear bytes only) and bmax is BUF_SIZE_8KiB or BUF_SIZE_2KiB. However, the caller stmmac_xmit() decides to invoke jumbo_frm() based on skb->len (total length including page fragments): is_jumbo = stmmac_is_jumbo_frm(priv, skb->len, enh_desc); When a packet has a small linear portion (nopaged_len <= bmax) but a large total length due to page fragments (skb->len > bmax), the subtraction wraps as an unsigned integer, producing a huge len value (~0xFFFFxxxx). This causes the while (len != 0) loop to execute hundreds of thousands of iterations, passing skb->data + bmax * i pointers far beyond the skb buffer to dma_map_single(). On IOMMU-less SoCs (the typical deployment for stmmac), this maps arbitrary kernel memory to the DMA engine, constituting a kernel memory disclosure and potential memory corruption from hardware. Fix this by introducing a buf_len local variable clamped to min(nopaged_len, bmax). Computing len = nopaged_len - buf_len is then always safe: it is zero when the linear portion fits within a single descriptor, causing the while (len != 0) loop to be skipped naturally, and the fragment loop in stmmac_xmit() handles page fragments afterward. Fixes: `286a837217` ("stmmac: add CHAINED descriptor mode support (V4)") Cc: stable@vger.kernel.org Signed-off-by: Tyllis Xu <LivelyCarpet87@gmail.com> Link: https://patch.msgid.link/20260401044708.1386919-1-LivelyCarpet87@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-02 18:28:10 -07:00
David Carlier	6dede39676	net: altera-tse: fix skb leak on DMA mapping error in tse_start_xmit() When dma_map_single() fails in tse_start_xmit(), the function returns NETDEV_TX_OK without freeing the skb. Since NETDEV_TX_OK tells the stack the packet was consumed, the skb is never freed, leaking memory on every DMA mapping failure. Add dev_kfree_skb_any() before returning to properly free the skb. Fixes: `bbd2190ce9` ("Altera TSE: Add main and header file for Altera Ethernet Driver") Cc: stable@vger.kernel.org Signed-off-by: David Carlier <devnexen@gmail.com> Link: https://patch.msgid.link/20260401211218.279185-1-devnexen@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-02 18:25:23 -07:00
Allison Henderson	e9c9f084cd	MAINTAINERS: Update email for Allison Henderson Switch active email address to kernel.org alias Signed-off-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/20260402005833.38376-1-achender@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-02 18:06:03 -07:00
Qingfang Deng	2fd68c7ea2	MAINTAINERS: orphan PPP over Ethernet driver We haven't seen activities from Michal Ostrowski for quite a long time. The last commit from him is `fb64bb560e` ("PPPoE: Fix flush/close races."), which was in 2009. Email to mostrows@earthlink.net also bounces. Signed-off-by: Qingfang Deng <dqfext@gmail.com> Link: https://patch.msgid.link/20260401022842.15082-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-02 18:03:47 -07:00
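
The unsigned wrap described in commit 51f4e090b9 ("net: stmmac: fix integer underflow in chain mode") can be illustrated with a small standalone sketch. This is not the driver code: the helper names `remaining_unclamped()` and `remaining_clamped()` are hypothetical, and only the variable names `nopaged_len`, `bmax` and `buf_len` come from the commit message. It assumes both lengths are plain `unsigned int` values.

```c
#include <assert.h>

/*
 * Hypothetical helpers for illustration only; they mirror the arithmetic
 * described in the commit message, not the actual stmmac source.
 */

/* Pre-fix arithmetic: wraps to a huge value when nopaged_len < bmax. */
static unsigned int remaining_unclamped(unsigned int nopaged_len,
					unsigned int bmax)
{
	return nopaged_len - bmax;
}

/*
 * Post-fix arithmetic: the first buffer length is clamped to
 * min(nopaged_len, bmax), so the subtraction can never wrap.
 */
static unsigned int remaining_clamped(unsigned int nopaged_len,
				      unsigned int bmax)
{
	unsigned int buf_len = nopaged_len < bmax ? nopaged_len : bmax;

	assert(buf_len <= nopaged_len);
	return nopaged_len - buf_len;
}
```

With a 100-byte linear area and an 8192-byte `bmax`, the unclamped form yields a value far larger than `bmax`, while the clamped form yields 0, which is the case where the commit message says the `while (len != 0)` loop is skipped.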

1429534 Commits