linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-06 06:43:13 -04:00

Author	SHA1	Message	Date
Kees Cook	189f164e57	Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-22 08:26:33 -08:00
Linus Torvalds	32a92f8c89	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 20:03:00 -08:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/$alloc_objs(.*$, GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	7a70c15bd1	kmalloc_obj: Clean up after treewide replacements Coccinelle doesn't handle re-indenting line escapes. Fix the 2 places where these got misaligned. Remove 2 now-redundant type casts, found with: $ git grep -P 'struct (\S+).\)\sk\S+alloc_(objs?\|flex)\(struct \1' Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:52 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Linus Torvalds	8bf22c33e7	Merge tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Netfilter. Current release - new code bugs: - net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT - eth: mlx5e: XSK, Fix unintended ICOSQ change - phy_port: correctly recompute the port's linkmodes - vsock: prevent child netns mode switch from local to global - couple of kconfig fixes for new symbols Previous releases - regressions: - nfc: nci: fix false-positive parameter validation for packet data - net: do not delay zero-copy skbs in skb_attempt_defer_free() Previous releases - always broken: - mctp: ensure our nlmsg responses to user space are zero-initialised - ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data() - fixes for ICMP rate limiting Misc: - intel: fix PCI device ID conflict between i40e and ipw2200" * tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits) net: nfc: nci: Fix parameter validation for packet data net/mlx5e: Use unsigned for mlx5e_get_max_num_channels net/mlx5e: Fix deadlocks between devlink and netdev instance locks net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event net/mlx5: Fix misidentification of write combining CQE during poll loop net/mlx5e: Fix misidentification of ASO CQE during poll loop net/mlx5: Fix multiport device check over light SFs bonding: alb: fix UAF in rlb_arp_recv during bond up/down bnge: fix reserving resources from FW eth: fbnic: Advertise supported XDP features. rds: tcp: fix uninit-value in __inet_bind net/rds: Fix NULL pointer dereference in rds_tcp_accept_one octeontx2-af: Fix default entries mcam entry action net/mlx5e: XSK, Fix unintended ICOSQ change ipv6: icmp: icmpv6_xrlim_allow() optimization if net.ipv6.icmp.ratelimit is zero ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow() inet: move icmp_global_{credit,stamp} to a separate cache line icmp: prevent possible overflow in icmp_global_allow() selftests/net: packetdrill: add ipv4-mapped-ipv6 tests ...	2026-02-19 10:39:08 -08:00
Cosmin Ratiu	57a94d4b22	net/mlx5e: Use unsigned for mlx5e_get_max_num_channels The max number of channels is always an unsigned int, use the correct type to fix compilation errors done with strict type checking, e.g.: error: call to ‘__compiletime_assert_1110’ declared with attribute error: min(mlx5e_get_devlink_param_num_doorbells(mdev), mlx5e_get_max_num_channels(mdev)) signedness error Fixes: `74a8dadac1` ("net/mlx5e: Preparations for supporting larger number of channels") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-19 09:30:19 -08:00
Cosmin Ratiu	83ac0304a2	net/mlx5e: Fix deadlocks between devlink and netdev instance locks In the mentioned "Fixes" commit, various work tasks triggering devlink health reporter recovery were switched to use netdev_trylock to protect against concurrent tear down of the channels being recovered. But this had the side effect of introducing potential deadlocks because of incorrect lock ordering. The correct lock order is described by the init flow: probe_one -> mlx5_init_one (acquires devlink lock) -> mlx5_init_one_devl_locked -> mlx5_register_device -> mlx5_rescan_drivers_locked -...-> mlx5e_probe -> _mlx5e_probe -> register_netdev (acquires rtnl lock) -> register_netdevice (acquires netdev lock) => devlink lock -> rtnl lock -> netdev lock. But in the current recovery flow, the order is wrong: mlx5e_tx_err_cqe_work (acquires netdev lock) -> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report -> devlink_health_report (acquires devlink lock => boom!) -> devlink_health_reporter_recover -> mlx5e_tx_reporter_recover -> mlx5e_tx_reporter_recover_from_ctx -> mlx5e_tx_reporter_err_cqe_recover The same pattern exists in: mlx5e_reporter_rx_timeout mlx5e_reporter_tx_ptpsq_unhealthy mlx5e_reporter_tx_timeout Fix these by moving the netdev_trylock calls from the work handlers lower in the call stack, in the respective recovery functions, where they are actually necessary. Fixes: `8f7b00307b` ("net/mlx5e: Convert mlx5 netdevs to instance locking") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-19 09:30:16 -08:00
Gal Pressman	9854b243ce	net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event The macsec_aso_set_arm_event function calls mlx5_aso_poll_cq once without a retry loop. If the CQE is not immediately available after posting the WQE, the function fails unnecessarily. Use read_poll_timeout() to poll 3-10 usecs for CQE, consistent with other ASO polling code paths in the driver. Fixes: `739cfa3451` ("net/mlx5: Make ASO poll CQ usable in atomic context") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-19 09:30:14 -08:00
Gal Pressman	d451994ebc	net/mlx5: Fix misidentification of write combining CQE during poll loop The write combining completion poll loop uses usleep_range() which can sleep much longer than requested due to scheduler latency. Under load, we witnessed a 20ms+ delay until the process was rescheduled, causing the jiffies based timeout to expire while the thread is sleeping. The original do-while loop structure (poll, sleep, check timeout) would exit without a final poll when waking after timeout, missing a CQE that arrived during sleep. Instead of the open-coded while loop, use the kernel's poll_timeout_us() which always performs an additional check after the sleep expiration, and is less error-prone. Note: poll_timeout_us() doesn't accept a sleep range, by passing 10 sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs. Fixes: `d98995b4bf` ("net/mlx5: Reimplement write combining test") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-19 09:30:11 -08:00
Gal Pressman	ae3cb71e6c	net/mlx5e: Fix misidentification of ASO CQE during poll loop The ASO completion poll loop uses usleep_range() which can sleep much longer than requested due to scheduler latency. Under load, we witnessed a 20ms+ delay until the process was rescheduled, causing the jiffies based timeout to expire while the thread is sleeping. The original do-while loop structure (poll, sleep, check timeout) would exit without a final poll when waking after timeout, missing a CQE that arrived during sleep. Instead of the open-coded while loop, use the kernel's read_poll_timeout() which always performs an additional check after the sleep expiration, and is less error-prone. Note: read_poll_timeout() doesn't accept a sleep range, by passing 10 sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs. Fixes: `739cfa3451` ("net/mlx5: Make ASO poll CQ usable in atomic context") Fixes: `7e3fce82d9` ("net/mlx5e: Overcome slow response for first macsec ASO WQE") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-19 09:30:08 -08:00
Hangbin Liu	e6834a4c47	bonding: alb: fix UAF in rlb_arp_recv during bond up/down The ALB RX path may access rx_hashtbl concurrently with bond teardown. During rapid bond up/down cycles, rlb_deinitialize() frees rx_hashtbl while RX handlers are still running, leading to a null pointer dereference detected by KASAN. However, the root cause is that rlb_arp_recv() can still be accessed after setting recv_probe to NULL, which is actually a use-after-free (UAF) issue. That is the reason for using the referenced commit in the Fixes tag. [ 214.174138] Oops: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] SMP KASAN PTI [ 214.186478] KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef] [ 214.194933] CPU: 30 UID: 0 PID: 2375 Comm: ping Kdump: loaded Not tainted 6.19.0-rc8+ #2 PREEMPT(voluntary) [ 214.205907] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.14.0 01/14/2022 [ 214.214357] RIP: 0010:rlb_arp_recv+0x505/0xab0 [bonding] [ 214.220320] Code: 0f 85 2b 05 00 00 48 b8 00 00 00 00 00 fc ff df 40 0f b6 ed 48 c1 e5 06 49 03 ad 78 01 00 00 48 8d 7d 28 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 12 05 00 00 80 7d 28 00 0f 84 8c 00 [ 214.241280] RSP: 0018:ffffc900073d8870 EFLAGS: 00010206 [ 214.247116] RAX: dffffc0000000000 RBX: ffff888168556822 RCX: ffff88816855681e [ 214.255082] RDX: 000000000000001d RSI: dffffc0000000000 RDI: 00000000000000e8 [ 214.263048] RBP: 00000000000000c0 R08: 0000000000000002 R09: ffffed11192021c8 [ 214.271013] R10: ffff8888c9010e43 R11: 0000000000000001 R12: 1ffff92000e7b119 [ 214.278978] R13: ffff8888c9010e00 R14: ffff888168556822 R15: ffff888168556810 [ 214.286943] FS: 00007f85d2d9cb80(0000) GS:ffff88886ccb3000(0000) knlGS:0000000000000000 [ 214.295966] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 214.302380] CR2: 00007f0d047b5e34 CR3: 00000008a1c2e002 CR4: 00000000001726f0 [ 214.310347] Call Trace: [ 214.313070] <IRQ> [ 214.315318] ? __pfx_rlb_arp_recv+0x10/0x10 [bonding] [ 214.320975] bond_handle_frame+0x166/0xb60 [bonding] [ 214.326537] ? __pfx_bond_handle_frame+0x10/0x10 [bonding] [ 214.332680] __netif_receive_skb_core.constprop.0+0x576/0x2710 [ 214.339199] ? __pfx_arp_process+0x10/0x10 [ 214.343775] ? sched_balance_find_src_group+0x98/0x630 [ 214.349513] ? __pfx___netif_receive_skb_core.constprop.0+0x10/0x10 [ 214.356513] ? arp_rcv+0x307/0x690 [ 214.360311] ? __pfx_arp_rcv+0x10/0x10 [ 214.364499] ? __lock_acquire+0x58c/0xbd0 [ 214.368975] __netif_receive_skb_one_core+0xae/0x1b0 [ 214.374518] ? __pfx___netif_receive_skb_one_core+0x10/0x10 [ 214.380743] ? lock_acquire+0x10b/0x140 [ 214.385026] process_backlog+0x3f1/0x13a0 [ 214.389502] ? process_backlog+0x3aa/0x13a0 [ 214.394174] __napi_poll.constprop.0+0x9f/0x370 [ 214.399233] net_rx_action+0x8c1/0xe60 [ 214.403423] ? __pfx_net_rx_action+0x10/0x10 [ 214.408193] ? lock_acquire.part.0+0xbd/0x260 [ 214.413058] ? sched_clock_cpu+0x6c/0x540 [ 214.417540] ? mark_held_locks+0x40/0x70 [ 214.421920] handle_softirqs+0x1fd/0x860 [ 214.426302] ? __pfx_handle_softirqs+0x10/0x10 [ 214.431264] ? __neigh_event_send+0x2d6/0xf50 [ 214.436131] do_softirq+0xb1/0xf0 [ 214.439830] </IRQ> The issue is reproducible by repeatedly running ip link set bond0 up/down while receiving ARP messages, where rlb_arp_recv() can race with rlb_deinitialize() and dereference a freed rx_hashtbl entry. Fix this by setting recv_probe to NULL and then calling synchronize_net() to wait for any concurrent RX processing to finish. This ensures that no RX handler can access rx_hashtbl after it is freed in bond_alb_deinitialize(). Reported-by: Liang Li <liali@redhat.com> Fixes: `3aba891dde` ("bonding: move processing of recv handlers into handle_frame()") Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260218060919.101574-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-19 09:30:01 -08:00
Vikas Gupta	604530085b	bnge: fix reserving resources from FW HWRM_FUNC_CFG is used to reserve resources, whereas HWRM_FUNC_QCFG is intended for querying resource information from the firmware. Since __bnge_hwrm_reserve_pf_rings() reserves resources for a specific PF, the command type should be HWRM_FUNC_CFG. Fixes: `627c67f038` ("bng_en: Add resource management support") Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260218052755.4097468-1-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-19 09:29:59 -08:00
Dimitri Daskalakis	e977fcb3a3	eth: fbnic: Advertise supported XDP features. Drivers are supposed to advertise the XDP features they support. This was missed while adding XDP support. Before: $ ynl --family netdev --dump dev-get ... {'ifindex': 3, 'xdp-features': set(), 'xdp-rx-metadata-features': set(), 'xsk-features': set()}, ... After: $ ynl --family netdev --dump dev-get ... {'ifindex': 3, 'xdp-features': {'basic', 'rx-sg'}, 'xdp-rx-metadata-features': set(), 'xsk-features': set()}, ... Fixes: `168deb7b31` ("eth: fbnic: Add support for XDP_TX action") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260218030620.3329608-1-dimitri.daskalakis1@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-19 18:03:04 +01:00
Hariprasad Kelam	45be47bf5d	octeontx2-af: Fix default entries mcam entry action As per design, AF should update the default MCAM action only when mcam_index is -1. A bug in the previous patch caused default entries to be changed even when the request was not for them. Fixes: `570ba37898` ("octeontx2-af: Update RSS algorithm index") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260216090338.1318976-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-18 17:11:10 -08:00
Tariq Toukan	0da1dba726	net/mlx5e: XSK, Fix unintended ICOSQ change XSK wakeup must use the async ICOSQ (with proper locking), as it is not guaranteed to run on the same CPU as the channel. The commit that converted the NAPI trigger path to use the sync ICOSQ incorrectly applied the same change to XSK, causing XSK wakeups to use the sync ICOSQ as well. Revert XSK flows to use the async ICOSQ. XDP program attach/detach triggers channel reopen, while XSK pool enable/disable can happen on-the-fly via NDOs without reopening channels. As a result, xsk_pool state cannot be reliably used at mlx5e_open_channel() time to decide whether an async ICOSQ is needed. Update the async_icosq_needed logic to depend on the presence of an XDP program rather than the xsk_pool, ensuring the async ICOSQ is available when XSK wakeups are enabled. This fixes multiple issues: 1. Illegal synchronize_rcu() in an RCU read- side critical section via mlx5e_xsk_wakeup() -> mlx5e_trigger_napi_icosq() -> synchronize_net(). The stack holds RCU read-lock in xsk_poll(). 2. Hitting a NULL pointer dereference in mlx5e_xsk_wakeup(): [] BUG: kernel NULL pointer dereference, address: 0000000000000240 [] #PF: supervisor read access in kernel mode [] #PF: error_code(0x0000) - not-present page [] PGD 0 P4D 0 [] Oops: Oops: 0000 [#1] SMP [] CPU: 0 UID: 0 PID: 2255 Comm: qemu-system-x86 Not tainted 6.19.0-rc5+ #229 PREEMPT(none) [] Hardware name: [...] [] RIP: 0010:mlx5e_xsk_wakeup+0x53/0x90 [mlx5_core] Reported-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://lore.kernel.org/all/20260123223916.361295-1-daniel@iogearbox.net/ Fixes: `56aca3e0f7` ("net/mlx5e: Use regular ICOSQ for triggering NAPI") Tested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Alice Mikityanska <alice.kernel@fastmail.im> Link: https://patch.msgid.link/20260217074525.1761454-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-18 16:48:28 -08:00
Dimitri Daskalakis	ccd8e87748	eth: fbnic: Add validation for MTU changes Increasing the MTU beyond the HDS threshold causes the hardware to fragment packets across multiple buffers. If a single-buffer XDP program is attached, the driver will drop all multi-frag frames. While we can't prevent a remote sender from sending non-TCP packets larger than the MTU, this will prevent users from inadvertently breaking new TCP streams. Traditionally, drivers supported XDP with MTU less than 4Kb (packet per page). Fbnic currently prevents attaching XDP when MTU is too high. But it does not prevent increasing MTU after XDP is attached. Fixes: `1b0a3950db` ("eth: fbnic: Add XDP pass, drop, abort support") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2026-02-18 06:09:31 +00:00
Thomas Fourier	ffe68c3766	net: ethernet: ec_bhf: Fix dma_free_coherent() dma handle dma_free_coherent() in error path takes priv->rx_buf.alloc_len as the dma handle. This would lead to improper unmapping of the buffer. Change the dma handle to priv->rx_buf.alloc_phys. Fixes: `6af55ff52b` ("Driver for Beckhoff CX5020 EtherCAT master module.") Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://patch.msgid.link/20260213164340.77272-2-fourier.thomas@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-17 17:16:55 -08:00
Eric Dumazet	e3f000f0de	macvlan: observe an RCU grace period in macvlan_common_newlink() error path valis reported that a race condition still happens after my prior patch. macvlan_common_newlink() might have made @dev visible before detecting an error, and its caller will directly call free_netdev(dev). We must respect an RCU period, either in macvlan or the core networking stack. After adding a temporary mdelay(1000) in macvlan_forward_source_one() to open the race window, valis repro was: ip link add p1 type veth peer p2 ip link set address 00:00:00:00:00:20 dev p1 ip link set up dev p1 ip link set up dev p2 ip link add mv0 link p2 type macvlan mode source (ip link add invalid% link p2 type macvlan mode source macaddr add 00:00:00:00:00:20 &) ; sleep 0.5 ; ping -c1 -I p1 1.2.3.4 PING 1.2.3.4 (1.2.3.4): 56 data bytes RTNETLINK answers: Invalid argument BUG: KASAN: slab-use-after-free in macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) Read of size 8 at addr ffff888016bb89c0 by task e/175 CPU: 1 UID: 1000 PID: 175 Comm: e Not tainted 6.19.0-rc8+ #33 NONE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:123) print_report (mm/kasan/report.c:379 mm/kasan/report.c:482) ? macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) kasan_report (mm/kasan/report.c:597) ? macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) ? tasklet_init (kernel/softirq.c:983) macvlan_handle_frame (drivers/net/macvlan.c:501) Allocated by task 169: kasan_save_stack (mm/kasan/common.c:58) kasan_save_track (./arch/x86/include/asm/current.h:25 mm/kasan/common.c:70 mm/kasan/common.c:79) __kasan_kmalloc (mm/kasan/common.c:419) __kvmalloc_node_noprof (./include/linux/kasan.h:263 mm/slub.c:5657 mm/slub.c:7140) alloc_netdev_mqs (net/core/dev.c:12012) rtnl_create_link (net/core/rtnetlink.c:3648) rtnl_newlink (net/core/rtnetlink.c:3830 net/core/rtnetlink.c:3957 net/core/rtnetlink.c:4072) rtnetlink_rcv_msg (net/core/rtnetlink.c:6958) netlink_rcv_skb (net/netlink/af_netlink.c:2550) netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344) netlink_sendmsg (net/netlink/af_netlink.c:1894) __sys_sendto (net/socket.c:727 net/socket.c:742 net/socket.c:2206) __x64_sys_sendto (net/socket.c:2209) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131) Freed by task 169: kasan_save_stack (mm/kasan/common.c:58) kasan_save_track (./arch/x86/include/asm/current.h:25 mm/kasan/common.c:70 mm/kasan/common.c:79) kasan_save_free_info (mm/kasan/generic.c:587) __kasan_slab_free (mm/kasan/common.c:287) kfree (mm/slub.c:6674 mm/slub.c:6882) rtnl_newlink (net/core/rtnetlink.c:3845 net/core/rtnetlink.c:3957 net/core/rtnetlink.c:4072) rtnetlink_rcv_msg (net/core/rtnetlink.c:6958) netlink_rcv_skb (net/netlink/af_netlink.c:2550) netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344) netlink_sendmsg (net/netlink/af_netlink.c:1894) __sys_sendto (net/socket.c:727 net/socket.c:742 net/socket.c:2206) __x64_sys_sendto (net/socket.c:2209) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131) Fixes: `f8db6475a8` ("macvlan: fix error recovery in macvlan_common_newlink()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: valis <sec@valis.email> Link: https://patch.msgid.link/20260213142557.3059043-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-17 17:14:17 -08:00
Arnd Bergmann	458c95de5c	net: dsa: MxL862xx: don't force-enable MAXLINEAR_GPHY The newly added dsa driver attempts to enable the corresponding PHY driver, but that one has additional dependencies that may not be available: WARNING: unmet direct dependencies detected for MAXLINEAR_GPHY Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && (HWMON [=m] \|\| HWMON [=m]=n [=n]) Selected by [y]: - NET_DSA_MXL862 [=y] && NETDEVICES [=y] && NET_DSA [=y] aarch64-linux-ld: drivers/net/phy/mxl-gpy.o: in function `gpy_probe': mxl-gpy.c:(.text.gpy_probe+0x13c): undefined reference to `devm_hwmon_device_register_with_info' aarch64-linux-ld: drivers/net/phy/mxl-gpy.o: in function `gpy_hwmon_read': mxl-gpy.c:(.text.gpy_hwmon_read+0x48): undefined reference to `polynomial_calc' There is actually no compile-time dependency, as DSA correctly uses the PHY abstractions. Remove the 'select' statement to reduce the complexity. Fixes: `23794bec1c` ("net: dsa: add basic initial driver for MxL862xx switches") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/20260216105522.2382373-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-17 17:10:28 -08:00
Linus Torvalds	505d195b0f	Merge tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc/IIO driver updates from Greg KH: "Here is the big set of char/misc/iio and other smaller driver subsystem changes for 7.0-rc1. Lots of little things in here, including: - Loads of iio driver changes and updates and additions - gpib driver updates - interconnect driver updates - i3c driver updates - hwtracing (coresight and intel) driver updates - deletion of the obsolete mwave driver - binder driver updates (rust and c versions) - mhi driver updates (causing a merge conflict, see below) - mei driver updates - fsi driver updates - eeprom driver updates - lots of other small char and misc driver updates and cleanups All of these have been in linux-next for a while, with no reported issues" * tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (297 commits) mux: mmio: fix regmap leak on probe failure rust_binder: return p from rust_binder_transaction_target_node() drivers: android: binder: Update ARef imports from sync::aref rust_binder: fix needless borrow in context.rs iio: magn: mmc5633: Fix Kconfig for combination of I3C as module and driver builtin iio: sca3000: Fix a resource leak in sca3000_probe() iio: proximity: rfd77402: Add interrupt handling support iio: proximity: rfd77402: Document device private data structure iio: proximity: rfd77402: Use devm-managed mutex initialization iio: proximity: rfd77402: Use kernel helper for result polling iio: proximity: rfd77402: Align polling timeout with datasheet iio: cros_ec: Allow enabling/disabling calibration mode iio: frequency: ad9523: correct kernel-doc bad line warning iio: buffer: buffer_impl.h: fix kernel-doc warnings iio: gyro: itg3200: Fix unchecked return value in read_raw MAINTAINERS: add entry for ADE9000 driver iio: accel: sca3000: remove unused last_timestamp field iio: accel: adxl372: remove unused int2_bitmask field iio: adc: ad7766: Use iio_trigger_generic_data_rdy_poll() iio: magnetometer: Remove IRQF_ONESHOT ...	2026-02-17 09:11:04 -08:00
Arnd Bergmann	636fd32d40	printk: add CONFIG_PRINTK dependency for netconsole The 'select PRINTK_EXECUTION_CTX' line now causes a harmless warning when NETCONSOLE_DYNAMIC is enabled but PRINTK is not: WARNING: unmet direct dependencies detected for PRINTK_EXECUTION_CTX Depends on [n]: PRINTK [=n] Selected by [y]: - NETCONSOLE_DYNAMIC [=y] && NETDEVICES [=y] && NET_CORE [=y] && NETCONSOLE [=y] && SYSFS [=y] && CONFIGFS_FS [=y] && (NETCONSOLE [=y]!=y [=y] \|\| CONFIGFS_FS [=y]!=m [=m]) In that configuration, the netconsole driver is useless anyway, so avoid this with an added dependency that prevents CONFIG_NETCONSOLE to be enabled without CONFIG_PRINTK. Fixes: `60325c27d3` ("printk: Add execution context (task name/CPU) to printk_info") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260213074431.1729627-1-arnd@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 13:04:20 +01:00
Ethan Nelson-Moore	c7d9be66b7	net: arcnet: com20020-pci: fix support for 2.5Mbit cards Commit `8c14f9c703` ("ARCNET: add com20020 PCI IDs with metadata") converted the com20020-pci driver to use a card info structure instead of a single flag mask in driver_data. However, it failed to take into account that in the original code, driver_data of 0 indicates a card with no special flags, not a card that should not have any card info structure. This introduced a null pointer dereference when cards with no flags were probed. Commit `bd6f1fd5d3` ("net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()") then papered over this issue by rejecting cards with no driver_data instead of resolving the problem at its source. Fix the original issue by introducing a new card info structure for 2.5Mbit cards that does not set any flags and using it if no driver_data is present. Fixes: `8c14f9c703` ("ARCNET: add com20020 PCI IDs with metadata") Fixes: `bd6f1fd5d3` ("net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()") Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Link: https://patch.msgid.link/20260213045510.32368-1-enelsonmoore@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 12:14:21 +01:00
Ziyi Guo	6d1dc80143	xen-netback: reject zero-queue configuration from guest A malicious or buggy Xen guest can write "0" to the xenbus key "multi-queue-num-queues". The connect() function in the backend only validates the upper bound (requested_num_queues > xenvif_max_queues) but not zero, allowing requested_num_queues=0 to reach vzalloc(array_size(0, sizeof(struct xenvif_queue))), which triggers WARN_ON_ONCE(!size) in __vmalloc_node_range(). On systems with panic_on_warn=1, this allows a guest-to-host denial of service. The Xen network interface specification requires the queue count to be "greater than zero". Add a zero check to match the validation already present in xen-blkback, which has included this guard since its multi-queue support was added. Fixes: `8d3d53b3e4` ("xen-netback: Add support for multiple queues") Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://patch.msgid.link/20260212224040.86674-1-n7l8m4@u.northwestern.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 11:57:57 +01:00
Ziyi Guo	9e7021d2ae	net: usb: catc: enable basic endpoint checking catc_probe() fills three URBs with hardcoded endpoint pipes without verifying the endpoint descriptors: - usb_sndbulkpipe(usbdev, 1) and usb_rcvbulkpipe(usbdev, 1) for TX/RX - usb_rcvintpipe(usbdev, 2) for interrupt status A malformed USB device can present these endpoints with transfer types that differ from what the driver assumes. Add a catc_usb_ep enum for endpoint numbers, replacing magic constants throughout. Add usb_check_bulk_endpoints() and usb_check_int_endpoints() calls after usb_set_interface() to verify endpoint types before use, rejecting devices with mismatched descriptors at probe time. Similar to - commit `90b7f29617` ("net: usb: rtl8150: enable basic endpoint checking") which fixed the issue in rtl8150. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Link: https://patch.msgid.link/20260212214154.3609844-1-n7l8m4@u.northwestern.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 11:53:36 +01:00
Antonio Quartulli	94560267d6	ovpn: tcp - don't deref NULL sk_socket member after tcp_close() When deleting a peer in case of keepalive expiration, the peer is removed from the OpenVPN hashtable and is temporary inserted in a "release list" for further processing. This happens in: ovpn_peer_keepalive_work() unlock_ovpn(release_list) This processing includes detaching from the socket being used to talk to this peer, by restoring its original proto and socket ops/callbacks. In case of TCP it may happen that, while the peer is sitting in the release list, userspace decides to close the socket. This will result in a concurrent execution of: tcp_close(sk) __tcp_close(sk) sock_orphan(sk) sk_set_socket(sk, NULL) The last function call will set sk->sk_socket to NULL. When the releasing routine is resumed, ovpn_tcp_socket_detach() will attempt to dereference sk->sk_socket to restore its original ops member. This operation will crash due to sk->sk_socket being NULL. Fix this race condition by testing-and-accessing sk->sk_socket atomically under sk->sk_callback_lock. Link: https://lore.kernel.org/netdev/176996279620.3109699.15382994681575380467@eldamar.lan/ Link: https://github.com/OpenVPN/ovpn-net-next/issues/29 Signed-off-by: Antonio Quartulli <antonio@openvpn.net> Fixes: `11851cbd60` ("ovpn: implement TCP transport") Link: https://patch.msgid.link/20260212213130.11497-1-antonio@openvpn.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 11:40:36 +01:00
Paolo Abeni	e74ff27fc6	Merge tag 'ovpn-net-20260212' of https://github.com/OpenVPN/ovpn-net-next Antonio Quartulli says: ==================== This batch includes the following fixes: * set sk_user_data before installing callbacks, to avoid dropping early packets * fix use-after-free in ovpn_net_xmit when accessing shared skbs that got released * fix TX bytes stats by adding-up every positively processed GSO segment ==================== Link: https://patch.msgid.link/20260212210340.11260-1-antonio@openvpn.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 11:16:15 +01:00
Bobby Eshleman	0f30a31b55	eth: fbnic: set DMA_HINT_L4 for all flows fbnic always advertises ETHTOOL_TCP_DATA_SPLIT_ENABLED via ethtool .get_ringparam. To enable proper splitting for all flow types, even for IP/Ethernet flows, this patch sets DMA_HINT_L4 unconditionally for all RSS and NFC flow steering rules. According to the spec, L4 falls back to L3 if no valid L4 is found, and L3 falls back to L2 if no L3 is found. This makes sure that the correct header boundary is used regardless of traffic type. This is important for zero-copy use cases where we must ensure that all ZC packets are split correctly. Fixes: `2b30fc01a6` ("eth: fbnic: Add support for HDS configuration") Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260211-fbnic-tcp-hds-fixes-v1-3-55d050e6f606@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 10:40:14 +01:00
Bobby Eshleman	bd254115f3	eth: fbnic: increase FBNIC_HDR_BYTES_MIN from 128 to 256 bytes Increase FBNIC_HDR_BYTES_MIN from 128 to 256 bytes. The previous minimum was too small to guarantee that very long L2+L3+L4 headers always fit within the header buffer. When EN_HDR_SPLIT is disabled and a packet exceeds MAX_HEADER_BYTES, splitting occurs at that byte offset instead of the header boundary, resulting in some of the header landing in the payload page. The increased minimum ensures headers always fit with the MAX_HEADER_BYTES cut off and land in the header page. Fixes: `2b30fc01a6` ("eth: fbnic: Add support for HDS configuration") Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Acked-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20260211-fbnic-tcp-hds-fixes-v1-2-55d050e6f606@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 10:40:14 +01:00
Bobby Eshleman	bbeb3bfbff	eth: fbnic: set FBNIC_QUEUE_RDE_CTL0_EN_HDR_SPLIT on RDE_CTL0 Fix EN_HDR_SPLIT configuration by writing the field to RDE_CTL0 instead of RDE_CTL1. Because drop mode configuration and header splitting enablement both use RDE_CTL0, we consolidate these configurations into the single function fbnic_config_drop_mode. Fixes: `2b30fc01a6` ("eth: fbnic: Add support for HDS configuration") Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Acked-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20260211-fbnic-tcp-hds-fixes-v1-1-55d050e6f606@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-17 10:40:14 +01:00
Chengfeng Ye	ee5492fd88	fbnic: close fw_log race between users and teardown Fixes a theoretical race on fw_log between the teardown path and fw_log write functions. fw_log is written inside fbnic_fw_log_write() and can be reached from the mailbox handler fbnic_fw_msix_intr(), but fw_log is freed before IRQ/MBX teardown during cleanup, resulting in a potential data race of dereferencing a freed/null variable. Possible Interleaving Scenario: CPU0: fbnic_fw_msix_intr() // Entry fbnic_fw_log_write() if (fbnic_fw_log_ready()) // true ... preempt ... CPU1: fbnic_remove() // Entry fbnic_fw_log_free() vfree(log->data_start); log->data_start = NULL; CPU0: continues, walks log->entries or writes to log->data_start The initialization also has an incorrect order problem, as the fw_log is currently allocated after MBX setup during initialization. Fix the problems by adjusting the synchronization order to put initialization in place before the mailbox is enabled, and not cleared until after the mailbox has been disabled. Fixes: `ecc53b1b46` ("eth: fbnic: Enable firmware logging") Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> Link: https://patch.msgid.link/20260211191329.530886-1-dg573847474@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-13 12:30:28 -08:00
Daniel Machon	a49d2a2c37	net: sparx5/lan969x: fix PTP clock max_adj value The max_adj field in ptp_clock_info tells userspace how much the PHC clock frequency can be adjusted. ptp4l reads this and will never request a correction larger than max_adj. On both sparx5 and lan969x the clock offset may never converge because the servo needs a frequency correction larger than the current max_adj of 200000 (200 ppm) allows. The servo rails at the max and the offset stays in the tens of microseconds. The hardware has no inherent max adjustment limit; frequency correction is done by writing a 64-bit clock period increment to CLK_PER_CFG, and the register has plenty of range. The 200000 value was just an overly conservative software limit. The max_adj is shared between sparx5 and lan969x, and the increased value is safe for both. Fix this by increasing max_adj to 10000000 (10000 ppm), giving the servo sufficient headroom. Fixes: `0933bd0404` ("net: sparx5: Add support for ptp clocks") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260212-sparx5-ptp-max-adj-v2-v1-1-06b200e50ce3@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-13 12:25:49 -08:00
Ziyi Guo	026f6513c5	net: mscc: ocelot: add missing lock protection in ocelot_port_xmit_inj() ocelot_port_xmit_inj() calls ocelot_can_inject() and ocelot_port_inject_frame() without holding the injection group lock. Both functions contain lockdep_assert_held() for the injection lock, and the correct caller felix_port_deferred_xmit() properly acquires the lock using ocelot_lock_inj_grp() before calling these functions. Add ocelot_lock_inj_grp()/ocelot_unlock_inj_grp() around the register injection path to fix the missing lock protection. The FDMA path is not affected as it uses its own locking mechanism. Fixes: `c5e12ac3be` ("net: mscc: ocelot: serialize access to the injection/extraction groups") Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260208225602.1339325-4-n7l8m4@u.northwestern.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 19:01:15 -08:00
Ziyi Guo	47f79b20e7	net: mscc: ocelot: split xmit into FDMA and register injection paths Split ocelot_port_xmit() into two separate functions: - ocelot_port_xmit_fdma(): handles the FDMA injection path - ocelot_port_xmit_inj(): handles the register-based injection path The top-level ocelot_port_xmit() now dispatches to the appropriate function based on the ocelot_fdma_enabled static key. This is a pure refactor with no behavioral change. Separating the two code paths makes each one simpler and prepares for adding proper locking to the register injection path without affecting the FDMA path. Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260208225602.1339325-3-n7l8m4@u.northwestern.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 19:01:15 -08:00
Ziyi Guo	29372f07f7	net: mscc: ocelot: extract ocelot_xmit_timestamp() helper Extract the PTP timestamp handling logic from ocelot_port_xmit() into a separate ocelot_xmit_timestamp() helper function. This is a pure refactor with no behavioral change. The helper returns false if the skb was consumed (freed) due to a timestamp request failure, and true if the caller should continue with frame injection. The rew_op value is returned via pointer. This prepares for splitting ocelot_port_xmit() into separate FDMA and register injection paths in a subsequent patch. Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260208225602.1339325-2-n7l8m4@u.northwestern.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 19:01:15 -08:00
Daniel Machon	6c28aa8dfd	net: sparx5/lan969x: fix DWRR cost max to match hardware register width DWRR (Deficit Weighted Round Robin) scheduling distributes bandwidth across traffic classes based on per-queue cost values, where lower cost means higher bandwidth share. The SPX5_DWRR_COST_MAX constant is 63 (6 bits) but the hardware register field HSCH_DWRR_ENTRY_DWRR_COST is GENMASK(24, 20), only 5 bits wide (max 31). This causes sparx5_weight_to_hw_cost() to compute cost values that silently overflow via FIELD_PREP, resulting in incorrect scheduling weights. Set SPX5_DWRR_COST_MAX to 31 to match the hardware register width. Fixes: `211225428d` ("net: microchip: sparx5: add support for offloading ets qdisc") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260210-sparx5-fix-dwrr-cost-max-v1-1-58fbdbc25652@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 18:59:27 -08:00
Jie Zhang	babab1b42e	net: stmmac: fix oops when split header is enabled For GMAC4, when split header is enabled, in some rare cases, the hardware does not fill buf2 of the first descriptor with payload. Thus we cannot assume buf2 is always fully filled if it is not the last descriptor. Otherwise, the length of buf2 of the second descriptor will be calculated wrong and cause an oops: Unable to handle kernel paging request at virtual address ffff00019246bfc0 ... x2 : 0000000000000040 x1 : ffff00019246bfc0 x0 : ffff00009246c000 Call trace: dcache_inval_poc+0x28/0x58 (P) dma_direct_sync_single_for_cpu+0x38/0x6c __dma_sync_single_for_cpu+0x34/0x6c stmmac_napi_poll_rx+0x8f0/0xb60 __napi_poll.constprop.0+0x30/0x144 net_rx_action+0x160/0x274 handle_softirqs+0x1b8/0x1fc ... To fix this, the PL bit-field in RDES3 register is used for all descriptors, whether it is the last descriptor or not. Fixes: `ec222003bd` ("net: stmmac: Prepare to add Split Header support") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jie Zhang <jie.zhang@analog.com> Link: https://patch.msgid.link/20260209225037.589130-1-jie.zhang@analog.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 18:56:00 -08:00
Ethan Nelson-Moore	d03e094473	net: intel: fix PCI device ID conflict between i40e and ipw2200 The ID 8086:104f is matched by both i40e and ipw2200. The same device ID should not be in more than one driver, because in that case, which driver is used is unpredictable. Fix this by taking advantage of the fact that i40e devices use PCI_CLASS_NETWORK_ETHERNET and ipw2200 devices use PCI_CLASS_NETWORK_OTHER to differentiate the devices. Fixes: `2e45d3f467` ("i40e: Add support for X710 B/P & SFP+ cards") Cc: stable@vger.kernel.org Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20260210021235.16315-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 18:42:48 -08:00
Chen Ni	fe6515f595	bng_en: Remove duplicate include Remove duplicate inclusion of <net/netdev_queues.h>. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Link: https://patch.msgid.link/20260211032021.2719742-1-nichen@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 18:40:57 -08:00
Chen Ni	7c375811b5	myri10ge: replace comma with semicolons Commit `fd24173439` ("myri10ge: avoid uninitialized variable use") added commas instead of semicolons Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260212055028.3248491-1-nichen@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 18:09:47 -08:00
Maxime Chevallier	4cebb26ac6	net: phy: phy_port: Correctly recompute the port's linkmodes a PHY-driven phy_port contains a 'supported' field containing the linkmodes available on this port. This is populated based on : - The PHY's reported features - The DT representation of the connector - The PHY's attach_mdi() callback As these different attrbution methods work in conjunction, the helper phy_port_update_supported() recomputes the final 'supported' value based on the populated mediums, linkmodes and pairs. However this recompute wasn't correctly implemented, and added more modes than necessary by or'ing the medium-specific modes to the existing support. Let's fix this and properly filter the modes. Fixes: `589e934d27` ("net: phy: Introduce PHY ports representation") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Link: https://patch.msgid.link/20260205092317.755906-4-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 17:44:55 -08:00
Maxime Chevallier	6248f3dc4e	net: phy: phy_port: Cleanup the of-parsing logic for phy_port We don't need to maintain a mediums bitfield, let's drop it and drop a bogus check for empty mediums, as we already check it above. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Link: https://patch.msgid.link/20260205092317.755906-3-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 17:44:55 -08:00
Maxime Chevallier	0561667105	net: phy: initialize the port support based on the PHY's for OF ports With the phy_port infrastructure came an ethernet-connector binding, allowing to represent the MDI of a PHY in devicetree. This allows specifying the mediums and pairs of a port. Let's initialize the port's supported list based on what the PHY reports, so that we can then filter it with what the connector allows using. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Link: https://patch.msgid.link/20260205092317.755906-2-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-12 17:44:55 -08:00
Linus Torvalds	136114e0ab	Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...	2026-02-12 12:13:01 -08:00
Ralf Lici	b660b13d4c	ovpn: fix VPN TX bytes counting In ovpn_net_xmit, after GSO segmentation and segment processing, the first segment on the list is used to increment VPN TX statistics, which fails to account for any subsequent segments in the chain. Fix this by accumulating the length of every segment that successfully passes skb_share_check into a tx_bytes variable. This ensures the peer statistics accurately reflect the total data volume sent, regardless of whether the original packet was segmented. Fixes: `04ca14955f` ("ovpn: store tunnel and transport statistics") Signed-off-by: Ralf Lici <ralf@mandelbit.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>	2026-02-12 15:28:59 +01:00
Ralf Lici	a5ec7baa44	ovpn: fix possible use-after-free in ovpn_net_xmit When building the skb_list in ovpn_net_xmit, skb_share_check will free the original skb if it is shared. The current implementation continues to use the stale skb pointer for subsequent operations: - peer lookup, - skb_dst_drop (even though all segments produced by skb_gso_segment will have a dst attached), - ovpn_peer_stats_increment_tx. Fix this by moving the peer lookup and skb_dst_drop before segmentation so that the original skb is still valid when used. Return early if all segments fail skb_share_check and the list ends up empty. Also switch ovpn_peer_stats_increment_tx to use skb_list.next; the next patch fixes the stats logic. Fixes: `08857b5ec5` ("ovpn: implement basic TX path (UDP)") Signed-off-by: Ralf Lici <ralf@mandelbit.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>	2026-02-12 15:28:58 +01:00
Ralf Lici	93686c472e	ovpn: set sk_user_data before overriding callbacks During initialization, we override socket callbacks and set sk_user_data to an ovpn_socket instance. Currently, these two operations are decoupled: callbacks are overridden before sk_user_data is set. While existing callbacks perform safety checks for NULL or non-ovpn sk_user_data, this condition causes a "half-formed" state where valid packets arriving during attachment trigger error logs (e.g., "invoked on non ovpn socket"). Set sk_user_data before overriding the callbacks so that it can be accessed safely from them. Since we already check that the socket has no sk_user_data before setting it, this remains safe even if an interrupt accesses the socket after sk_user_data is set but before the callbacks are overridden. This also requires initializing all protocol-specific fields (such as tcp_tx_work and peer links) before calling ovpn_socket_attach, ensuring the ovpn_socket is fully formed before it becomes visible to any callback. Fixes: `f6226ae7a0` ("ovpn: introduce the ovpn_socket object") Signed-off-by: Ralf Lici <ralf@mandelbit.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>	2026-02-12 15:28:56 +01:00
Linus Torvalds	37a93dd5c4	Merge tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core & protocols: - A significant effort all around the stack to guide the compiler to make the right choice when inlining code, to avoid unneeded calls for small helper and stack canary overhead in the fast-path. This generates better and faster code with very small or no text size increases, as in many cases the call generated more code than the actual inlined helper. - Extend AccECN implementation so that is now functionally complete, also allow the user-space enabling it on a per network namespace basis. - Add support for memory providers with large (above 4K) rx buffer. Paired with hw-gro, larger rx buffer sizes reduce the number of buffers traversing the stack, dincreasing single stream CPU usage by up to ~30%. - Do not add HBH header to Big TCP GSO packets. This simplifies the RX path, the TX path and the NIC drivers, and is possible because user-space taps can now interpret correctly such packets without the HBH hint. - Allow IPv6 routes to be configured with a gateway address that is resolved out of a different interface than the one specified, aligning IPv6 to IPv4 behavior. - Multi-queue aware sch_cake. This makes it possible to scale the rate shaper of sch_cake across multiple CPUs, while still enforcing a single global rate on the interface. - Add support for the nbcon (new buffer console) infrastructure to netconsole, enabling lock-free, priority-based console operations that are safer in crash scenarios. - Improve the TCP ipv6 output path to cache the flow information, saving cpu cycles, reducing cache line misses and stack use. - Improve netfilter packet tracker to resolve clashes for most protocols, avoiding unneeded drops on rare occasions. - Add IP6IP6 tunneling acceleration to the flowtable infrastructure. - Reduce tcp socket size by one cache line. - Notify neighbour changes atomically, avoiding inconsistencies between the notification sequence and the actual states sequence. - Add vsock namespace support, allowing complete isolation of vsocks across different network namespaces. - Improve xsk generic performances with cache-alignment-oriented optimizations. - Support netconsole automatic target recovery, allowing netconsole to reestablish targets when underlying low-level interface comes back online. Driver API: - Support for switching the working mode (automatic vs manual) of a DPLL device via netlink. - Introduce PHY ports representation to expose multiple front-facing media ports over a single MAC. - Introduce "rx-polarity" and "tx-polarity" device tree properties, to generalize polarity inversion requirements for differential signaling. - Add helper to create, prepare and enable managed clocks. Device drivers: - Add Huawei hinic3 PF etherner driver. - Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet controller. - Add ethernet driver for MaxLinear MxL862xx switches - Remove parallel-port Ethernet driver. - Convert existing driver timestamp configuration reporting to hwtstamp_get and remove legacy ioctl(). - Convert existing drivers to .get_rx_ring_count(), simplifing the RX ring count retrieval. Also remove the legacy fallback path. - Ethernet high-speed NICs: - Broadcom (bnxt, bng): - bnxt: add FW interface update to support FEC stats histogram and NVRAM defragmentation - bng: add TSO and H/W GRO support - nVidia/Mellanox (mlx5): - improve latency of channel restart operations, reducing the used H/W resources - add TSO support for UDP over GRE over VLAN - add flow counters support for hardware steering (HWS) rules - use a static memory area to store headers for H/W GRO, leading to 12% RX tput improvement - Intel (100G, ice, idpf): - ice: reorganizes layout of Tx and Rx rings for cacheline locality and utilizes __cacheline_group* macros on the new layouts - ice: introduces Synchronous Ethernet (SyncE) support - Meta (fbnic): - adds debugfs for firmware mailbox and tx/rx rings vectors - Ethernet virtual: - geneve: introduce GRO/GSO support for double UDP encapsulation - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - some code refactoring and cleanups - RealTek (r8169): - add support for RTL8127ATF (10G Fiber SFP) - add dash and LTR support - Airoha: - AN8811HB 2.5 Gbps phy support - Freescale (fec): - add XDP zero-copy support - Thunderbolt: - add get link setting support to allow bonding - Renesas: - add support for RZ/G3L GBETH SoC - Ethernet switches: - Maxlinear: - support R(G)MII slow rate configuration - add support for Intel GSW150 - Motorcomm (yt921x): - add DCB/QoS support - TI: - icssm-prueth: support bridging (STP/RSTP) via the switchdev framework - Ethernet PHYs: - Realtek: - enable SGMII and 2500Base-X in-band auto-negotiation - simplify and reunify C22/C45 drivers - Micrel: convert bindings to DT schema - CAN: - move skb headroom content into skb extensions, making CAN metadata access more robust - CAN drivers: - rcar_canfd: - add support for FD-only mode - add support for the RZ/T2H SoC - sja1000: cleanup the CAN state handling - WiFi: - implement EPPKE/802.1X over auth frames support - split up drop reasons better, removing generic RX_DROP - additional FTM capabilities: 6 GHz support, supported number of spatial streams and supported number of LTF repetitions - better mac80211 iterators to enumerate resources - initial UHR (Wi-Fi 8) support for cfg80211/mac80211 - WiFi drivers: - Qualcomm/Atheros: - ath11k: support for Channel Frequency Response measurement - ath12k: a significant driver refactor to support multi-wiphy devices and and pave the way for future device support in the same driver (rather than splitting to ath13k) - ath12k: support for the QCC2072 chipset - Intel: - iwlwifi: partial Neighbor Awareness Networking (NAN) support - iwlwifi: initial support for U-NII-9 and IEEE 802.11bn - RealTek (rtw89): - preparations for RTL8922DE support - Bluetooth: - implement setsockopt(BT_PHY) to set the connection packet type/PHY - set link_policy on incoming ACL connections - Bluetooth drivers: - btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE - btqca: add WCN6855 firmware priority selection feature" * tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1254 commits) bnge/bng_re: Add a new HSI net: macb: Fix tx/rx malfunction after phy link down and up af_unix: Fix memleak of newsk in unix_stream_connect(). net: ti: icssg-prueth: Add optional dependency on HSR net: dsa: add basic initial driver for MxL862xx switches net: mdio: add unlocked mdiodev C45 bus accessors net: dsa: add tag format for MxL862xx switches dt-bindings: net: dsa: add MaxLinear MxL862xx selftests: drivers: net: hw: Modify toeplitz.c to poll for packets octeontx2-pf: Unregister devlink on probe failure net: renesas: rswitch: fix forwarding offload statemachine ionic: Rate limit unknown xcvr type messages tcp: inet6_csk_xmit() optimization tcp: populate inet->cork.fl.u.ip6 in tcp_v6_syn_recv_sock() tcp: populate inet->cork.fl.u.ip6 in tcp_v6_connect() ipv6: inet6_csk_xmit() and inet6_csk_update_pmtu() use inet->cork.fl.u.ip6 ipv6: use inet->cork.fl.u.ip6 and np->final in ip6_datagram_dst_update() ipv6: use np->final in inet6_sk_rebuild_header() ipv6: add daddr/final storage in struct ipv6_pinfo net: stmmac: qcom-ethqos: fix qcom_ethqos_serdes_powerup() ...	2026-02-11 19:31:52 -08:00
Paolo Abeni	83310d6133	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Merge in late fixes in preparation for the net-next PR. Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-11 15:14:35 +01:00

1 2 3 4 5 ...

137794 Commits