linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-02-22 03:00:30 -05:00

Author	SHA1	Message	Date
Eric Dumazet	f10ab9d3a7	tcp: move tcp_rate_skb_sent() to tcp_output.c It is only called from __tcp_transmit_skb() and __tcp_retransmit_skb(). Move it in tcp_output.c and make it static. clang compiler is now able to inline it from __tcp_transmit_skb(). gcc compiler inlines it in the two callers, which is also fine. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20260114165109.1747722-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-17 15:43:16 -08:00
Toke Høiland-Jørgensen	2a85541d95	net/sched: cake: avoid separate allocation of struct cake_sched_config Paolo pointed out that we can avoid separately allocating struct cake_sched_config even in the non-mq case, by embedding it into struct cake_sched_data. This reduces the complexity of the logic that swaps the pointers and frees the old value, at the cost of adding 56 bytes to the latter. Since cake_sched_data is already almost 17k bytes, this seems like a reasonable tradeoff. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Fixes: `bc0ce2bad3` ("net/sched: sch_cake: Factor out config variables into separate struct") Link: https://patch.msgid.link/20260113143157.2581680-1-toke@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-17 15:30:28 -08:00
Eric Dumazet	2db009e4c8	net: minor __alloc_skb() optimization We can directly call __finalize_skb_around() instead of __build_skb_around() because @size is not zero. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260113131017.2310584-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-15 19:52:02 -08:00
Jakub Kicinski	c27022497d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.19-rc6). No conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-15 18:02:48 -08:00
Linus Torvalds	9e995c573b	Merge tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth, can and IPsec. Current release - regressions: - net: add net.core.qdisc_max_burst - can: propagate CAN device capabilities via ml_priv Previous releases - regressions: - dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list() - ipv6: fix use-after-free in inet6_addr_del(). - xfrm: fix inner mode lookup in tunnel mode GSO segmentation - ip_tunnel: spread netdev_lockdep_set_classes() - ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv() - bluetooth: hci_sync: enable PA sync lost event - eth: virtio-net: - fix the deadlock when disabling rx NAPI - fix misalignment bug in struct virtnet_info Previous releases - always broken: - ipv4: ip_gre: make ipgre_header() robust - can: fix SSP_SRC in cases when bit-rate is higher than 1 MBit. - eth: - mlx5e: profile change fix - octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback - macvlan: fix possible UAF in macvlan_forward_source()" * tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits) virtio_net: Fix misalignment bug in struct virtnet_info net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts can: raw: instantly reject disabled CAN frames can: propagate CAN device capabilities via ml_priv Revert "can: raw: instantly reject unsupported CAN frames" net/sched: sch_qfq: do not free existing class in qfq_change_class() selftests: drv-net: fix RPS mask handling for high CPU numbers selftests: drv-net: fix RPS mask handling in toeplitz test ipv6: Fix use-after-free in inet6_addr_del(). dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list() net: hv_netvsc: reject RSS hash key programming without RX indirection table tools: ynl: render event op docs correctly net: add net.core.qdisc_max_burst net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition net: phy: motorcomm: fix duplex setting error for phy leds net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback net/mlx5e: Restore destroying state bit after profile cleanup net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv net/mlx5e: Fix crash on profile change rollback failure ...	2026-01-15 10:11:11 -08:00
Paolo Abeni	851822aec1	Merge tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2026-01-15 this is a pull request of 4 patches for net/main, it super-seeds the "can 2026-01-14" pull request. The dev refcount leak in patch #3 is fixed. The first 3 patches are by Oliver Hartkopp and revert the approach to instantly reject unsupported CAN frames introduced in net-next-for-v6.19 and replace it by placing the needed data into the CAN specific ml_priv. The last patch is by Tetsuo Handa and fixes a J1939 refcount leak for j1939_session in session deactivation upon receiving the second RTS. linux-can-fixes-for-6.19-20260115 * tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts can: raw: instantly reject disabled CAN frames can: propagate CAN device capabilities via ml_priv Revert "can: raw: instantly reject unsupported CAN frames" ==================== Link: https://patch.msgid.link/20260115090603.1124860-1-mkl@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-15 13:13:01 +01:00
Paolo Abeni	5ce234a8fe	Merge tag 'ipsec-2026-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2026-01-14 1) Fix inner mode lookup in tunnel mode GSO segmentation. The protocol was taken from the wrong field. 2) Set ipv4 no_pmtu_disc flag only on output SAs. The insertation of input SAs can fail if no_pmtu_disc is set. Please pull or let me know if there are problems. ipsec-2026-01-14 * tag 'ipsec-2026-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm: set ipv4 no_pmtu_disc flag only on output sa when direction is set xfrm: Fix inner mode lookup in tunnel mode GSO segmentation ==================== Link: https://patch.msgid.link/20260114121817.1106134-1-steffen.klassert@secunet.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-15 12:46:12 +01:00
Eric Dumazet	d4596891e7	net: inline napi_skb_cache_get() clang is inlining it already, gcc (14.2) does not. Small space cost (215 bytes on x86_64) but faster sk_buff allocations. $ scripts/bloat-o-meter -t net/core/skbuff.gcc.before.o net/core/skbuff.gcc.after.o add/remove: 0/1 grow/shrink: 4/1 up/down: 359/-144 (215) Function old new delta __alloc_skb 471 611 +140 napi_build_skb 245 363 +118 napi_alloc_skb 331 416 +85 skb_copy_ubufs 1869 1885 +16 skb_shift 1445 1413 -32 napi_skb_cache_get 112 - -112 Total: Before=59941, After=60156, chg +0.36% Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260112131515.4051589-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-15 12:30:28 +01:00
Jason Xing	a2cb2e23b2	xsk: move cq_cached_prod_lock to avoid touching a cacheline in sending path We (Paolo and I) noticed that in the sending path touching an extra cacheline due to cq_cached_prod_lock will impact the performance. After moving the lock from struct xsk_buff_pool to struct xsk_queue, the performance is increased by ~5% which can be observed by xdpsock. An alternative approach [1] can be using atomic_try_cmpxchg() to have the same effect. But unfortunately I don't have evident performance numbers to prove the atomic approach is better than the current patch. The advantage is to save the contention time among multiple xsks sharing the same pool while the disadvantage is losing good maintenance. The full discussion can be found at the following link. [1]: https://lore.kernel.org/all/20251128134601.54678-1-kerneljasonxing@gmail.com/ Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://patch.msgid.link/20260104012125.44003-3-kerneljasonxing@gmail.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-15 10:07:45 +01:00
Jason Xing	cee715d907	xsk: advance cq/fq check when shared umem is used In the shared umem mode with different queues or devices, either uninitialized cq or fq is not allowed which was previously done in xp_assign_dev_shared(). The patch advances the check at the beginning so that 1) we can avoid a few memory allocation and stuff if cq or fq is NULL, 2) it can be regarded as preparation for the next patch in the series. Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://patch.msgid.link/20260104012125.44003-2-kerneljasonxing@gmail.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-15 10:07:44 +01:00
Tetsuo Handa	1809c82aa0	net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts Since j1939_session_deactivate_activate_next() in j1939_tp_rxtimer() is called only when the timer is enabled, we need to call j1939_session_deactivate_activate_next() if we cancelled the timer. Otherwise, refcount for j1939_session leaks, which will later appear as \| unregister_netdevice: waiting for vcan0 to become free. Usage count = 2. problem. Reported-by: syzbot <syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Fixes: `9d71dd0c70` ("can: add support of SAE J1939 protocol") Link: https://patch.msgid.link/b1212653-8fa1-44e1-be9d-12f950fb3a07@I-love.SAKURA.ne.jp Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-01-15 09:52:39 +01:00
Oliver Hartkopp	faba5860fc	can: raw: instantly reject disabled CAN frames For real CAN interfaces the CAN_CTRLMODE_FD and CAN_CTRLMODE_XL control modes indicate whether an interface can handle those CAN FD/XL frames. In the case a CAN XL interface is configured in CANXL-only mode with disabled error-signalling neither CAN CC nor CAN FD frames can be sent. The checks are now performed on CAN_RAW sockets to give an instant feedback to the user when writing unsupported CAN frames to the interface or when the CAN interface is in read-only mode. Fixes: `1a620a7238` ("can: raw: instantly reject unsupported CAN frames") Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Vincent Mailhol <mailhol@kernel.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20260109144135.8495-4-socketcan@hartkopp.net [mkl: fix dev reference leak] Link: https://lore.kernel.org/all/0636c732-2e71-4633-8005-dfa85e1da445@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-01-15 09:52:04 +01:00
Oliver Hartkopp	4650ff58a1	Revert "can: raw: instantly reject unsupported CAN frames" This reverts commit `1a620a7238` and its follow-up fixes for the introduced dependency issues. commit `1a620a7238` ("can: raw: instantly reject unsupported CAN frames") commit `cb2dc6d286` ("can: Kconfig: select CAN driver infrastructure by default") commit `6abd4577bc` ("can: fix build dependency") commit `5a5aff6338` ("can: fix build dependency") The entire problem was caused by the requirement that a new network layer feature needed to know about the protocol capabilities of the CAN devices. Instead of accessing CAN device internal data structures which caused the dependency problems a better approach has been developed which makes use of CAN specific ml_priv data which is accessible from both sides. Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Vincent Mailhol <mailhol@kernel.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20260109144135.8495-2-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-01-15 09:52:04 +01:00
Linus Torvalds	c537e12dae	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong Dong) - Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa) - Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen) - Check that BPF insn array is not allowed as a map for const strings (Deepanshu Kartikey) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix reference count leak in bpf_prog_test_run_xdp() bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str() selftests/bpf: Update xdp_context_test_run test to check maximum metadata size bpf, test_run: Subtract size of xdp_frame from allowed metadata size riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK	2026-01-13 21:21:13 -08:00
Eric Dumazet	3879cffd9d	net/sched: sch_qfq: do not free existing class in qfq_change_class() Fixes qfq_change_class() error case. cl->qdisc and cl should only be freed if a new class and qdisc were allocated, or we risk various UAF. Fixes: `462dbc9101` ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Reported-by: syzbot+07f3f38f723c335f106d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6965351d.050a0220.eaf7.00c5.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260112175656.17605-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-13 19:36:56 -08:00
Kuniyuki Iwashima	ddf96c393a	ipv6: Fix use-after-free in inet6_addr_del(). syzbot reported use-after-free of inet6_ifaddr in inet6_addr_del(). [0] The cited commit accidentally moved ipv6_del_addr() for mngtmpaddr before reading its ifp->flags for temporary addresses in inet6_addr_del(). Let's move ipv6_del_addr() down to fix the UAF. [0]: BUG: KASAN: slab-use-after-free in inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117 Read of size 4 at addr ffff88807b89c86c by task syz.3.1618/9593 CPU: 0 UID: 0 PID: 9593 Comm: syz.3.1618 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xcd/0x630 mm/kasan/report.c:482 kasan_report+0xe0/0x110 mm/kasan/report.c:595 inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117 addrconf_del_ifaddr+0x11e/0x190 net/ipv6/addrconf.c:3181 inet6_ioctl+0x1e5/0x2b0 net/ipv6/af_inet6.c:582 sock_do_ioctl+0x118/0x280 net/socket.c:1254 sock_ioctl+0x227/0x6b0 net/socket.c:1375 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f164cf8f749 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f164de64038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f164d1e5fa0 RCX: 00007f164cf8f749 RDX: 0000200000000000 RSI: 0000000000008936 RDI: 0000000000000003 RBP: 00007f164d013f91 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f164d1e6038 R14: 00007f164d1e5fa0 R15: 00007ffde15c8288 </TASK> Allocated by task 9593: kasan_save_stack+0x33/0x60 mm/kasan/common.c:56 kasan_save_track+0x14/0x30 mm/kasan/common.c:77 poison_kmalloc_redzone mm/kasan/common.c:397 [inline] __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:414 kmalloc_noprof include/linux/slab.h:957 [inline] kzalloc_noprof include/linux/slab.h:1094 [inline] ipv6_add_addr+0x4e3/0x2010 net/ipv6/addrconf.c:1120 inet6_addr_add+0x256/0x9b0 net/ipv6/addrconf.c:3050 addrconf_add_ifaddr+0x1fc/0x450 net/ipv6/addrconf.c:3160 inet6_ioctl+0x103/0x2b0 net/ipv6/af_inet6.c:580 sock_do_ioctl+0x118/0x280 net/socket.c:1254 sock_ioctl+0x227/0x6b0 net/socket.c:1375 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 6099: kasan_save_stack+0x33/0x60 mm/kasan/common.c:56 kasan_save_track+0x14/0x30 mm/kasan/common.c:77 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:584 poison_slab_object mm/kasan/common.c:252 [inline] __kasan_slab_free+0x5f/0x80 mm/kasan/common.c:284 kasan_slab_free include/linux/kasan.h:234 [inline] slab_free_hook mm/slub.c:2540 [inline] slab_free_freelist_hook mm/slub.c:2569 [inline] slab_free_bulk mm/slub.c:6696 [inline] kmem_cache_free_bulk mm/slub.c:7383 [inline] kmem_cache_free_bulk+0x2bf/0x680 mm/slub.c:7362 kfree_bulk include/linux/slab.h:830 [inline] kvfree_rcu_bulk+0x1b7/0x1e0 mm/slab_common.c:1523 kvfree_rcu_drain_ready mm/slab_common.c:1728 [inline] kfree_rcu_monitor+0x1d0/0x2f0 mm/slab_common.c:1801 process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257 process_scheduled_works kernel/workqueue.c:3340 [inline] worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421 kthread+0x3c5/0x780 kernel/kthread.c:463 ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Fixes: `00b5b7aab9` ("net/ipv6: delete temporary address if mngtmpaddr is removed or unmanaged") Reported-by: syzbot+72e610f4f1a930ca9d8a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/696598e9.050a0220.3be5c5.0009.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260113010538.2019411-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-13 19:09:11 -08:00
Eric Dumazet	9a6f0c4d57	dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list() syzbot was able to crash the kernel in rt6_uncached_list_flush_dev() in an interesting way [1] Crash happens in list_del_init()/INIT_LIST_HEAD() while writing list->prev, while the prior write on list->next went well. static inline void INIT_LIST_HEAD(struct list_head *list) { WRITE_ONCE(list->next, list); // This went well WRITE_ONCE(list->prev, list); // Crash, @list has been freed. } Issue here is that rt6_uncached_list_del() did not attempt to lock ul->lock, as list_empty(&rt->dst.rt_uncached) returned true because the WRITE_ONCE(list->next, list) happened on the other CPU. We might use list_del_init_careful() and list_empty_careful(), or make sure rt6_uncached_list_del() always grabs the spinlock whenever rt->dst.rt_uncached_list has been set. A similar fix is neeed for IPv4. [1] BUG: KASAN: slab-use-after-free in INIT_LIST_HEAD include/linux/list.h:46 [inline] BUG: KASAN: slab-use-after-free in list_del_init include/linux/list.h:296 [inline] BUG: KASAN: slab-use-after-free in rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline] BUG: KASAN: slab-use-after-free in rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020 Write of size 8 at addr ffff8880294cfa78 by task kworker/u8:14/3450 CPU: 0 UID: 0 PID: 3450 Comm: kworker/u8:14 Tainted: G L syzkaller #0 PREEMPT_{RT,(full)} Tainted: [L]=SOFTLOCKUP Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Workqueue: netns cleanup_net Call Trace: <TASK> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 INIT_LIST_HEAD include/linux/list.h:46 [inline] list_del_init include/linux/list.h:296 [inline] rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline] rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020 addrconf_ifdown+0x143/0x18a0 net/ipv6/addrconf.c:3853 addrconf_notify+0x1bc/0x1050 net/ipv6/addrconf.c:-1 notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85 call_netdevice_notifiers_extack net/core/dev.c:2268 [inline] call_netdevice_notifiers net/core/dev.c:2282 [inline] netif_close_many+0x29c/0x410 net/core/dev.c:1785 unregister_netdevice_many_notify+0xb50/0x2330 net/core/dev.c:12353 ops_exit_rtnl_list net/core/net_namespace.c:187 [inline] ops_undo_list+0x3dc/0x990 net/core/net_namespace.c:248 cleanup_net+0x4de/0x7b0 net/core/net_namespace.c:696 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 </TASK> Allocated by task 803: kasan_save_stack mm/kasan/common.c:57 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:78 unpoison_slab_object mm/kasan/common.c:340 [inline] __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366 kasan_slab_alloc include/linux/kasan.h:253 [inline] slab_post_alloc_hook mm/slub.c:4953 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270 dst_alloc+0x105/0x170 net/core/dst.c:89 ip6_dst_alloc net/ipv6/route.c:342 [inline] icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333 mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Freed by task 20: kasan_save_stack mm/kasan/common.c:57 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:78 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584 poison_slab_object mm/kasan/common.c:253 [inline] __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285 kasan_slab_free include/linux/kasan.h:235 [inline] slab_free_hook mm/slub.c:2540 [inline] slab_free mm/slub.c:6670 [inline] kmem_cache_free+0x18f/0x8d0 mm/slub.c:6781 dst_destroy+0x235/0x350 net/core/dst.c:121 rcu_do_batch kernel/rcu/tree.c:2605 [inline] rcu_core kernel/rcu/tree.c:2857 [inline] rcu_cpu_kthread+0xba5/0x1af0 kernel/rcu/tree.c:2945 smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Last potentially related work creation: kasan_save_stack+0x3e/0x60 mm/kasan/common.c:57 kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:556 __call_rcu_common kernel/rcu/tree.c:3119 [inline] call_rcu+0xee/0x890 kernel/rcu/tree.c:3239 refdst_drop include/net/dst.h:266 [inline] skb_dst_drop include/net/dst.h:278 [inline] skb_release_head_state+0x71/0x360 net/core/skbuff.c:1156 skb_release_all net/core/skbuff.c:1180 [inline] __kfree_skb net/core/skbuff.c:1196 [inline] sk_skb_reason_drop+0xe9/0x170 net/core/skbuff.c:1234 kfree_skb_reason include/linux/skbuff.h:1322 [inline] tcf_kfree_skb_list include/net/sch_generic.h:1127 [inline] __dev_xmit_skb net/core/dev.c:4260 [inline] __dev_queue_xmit+0x26aa/0x3210 net/core/dev.c:4785 NF_HOOK_COND include/linux/netfilter.h:307 [inline] ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247 NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318 mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 The buggy address belongs to the object at ffff8880294cfa00 which belongs to the cache ip6_dst_cache of size 232 The buggy address is located 120 bytes inside of freed 232-byte region [ffff8880294cfa00, ffff8880294cfae8) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x294cf memcg:ffff88803536b781 flags: 0x80000000000000(node=0\|zone=1) page_type: f5(slab) raw: 0080000000000000 ffff88802ff1c8c0 ffffea0000bf2bc0 dead000000000006 raw: 0000000000000000 00000000800c000c 00000000f5000000 ffff88803536b781 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52820(GFP_ATOMIC\|__GFP_NOWARN\|__GFP_NORETRY\|__GFP_COMP), pid 9, tgid 9 (kworker/0:0), ts 91119585830, free_ts 91088628818 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x234/0x290 mm/page_alloc.c:1857 prep_new_page mm/page_alloc.c:1865 [inline] get_page_from_freelist+0x28c0/0x2960 mm/page_alloc.c:3915 __alloc_frozen_pages_noprof+0x181/0x370 mm/page_alloc.c:5210 alloc_pages_mpol+0xd1/0x380 mm/mempolicy.c:2486 alloc_slab_page mm/slub.c:3075 [inline] allocate_slab+0x86/0x3b0 mm/slub.c:3248 new_slab mm/slub.c:3302 [inline] ___slab_alloc+0xb10/0x13e0 mm/slub.c:4656 __slab_alloc+0xc6/0x1f0 mm/slub.c:4779 __slab_alloc_node mm/slub.c:4855 [inline] slab_alloc_node mm/slub.c:5251 [inline] kmem_cache_alloc_noprof+0x101/0x6c0 mm/slub.c:5270 dst_alloc+0x105/0x170 net/core/dst.c:89 ip6_dst_alloc net/ipv6/route.c:342 [inline] icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333 mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 page last free pid 5859 tgid 5859 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1406 [inline] __free_frozen_pages+0xfe1/0x1170 mm/page_alloc.c:2943 discard_slab mm/slub.c:3346 [inline] __put_partials+0x149/0x170 mm/slub.c:3886 __slab_free+0x2af/0x330 mm/slub.c:5952 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x97/0x100 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x148/0x160 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x22/0x80 mm/kasan/common.c:350 kasan_slab_alloc include/linux/kasan.h:253 [inline] slab_post_alloc_hook mm/slub.c:4953 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270 getname_flags+0xb8/0x540 fs/namei.c:146 getname include/linux/fs.h:2498 [inline] do_sys_openat2+0xbc/0x200 fs/open.c:1426 do_sys_open fs/open.c:1436 [inline] __do_sys_openat fs/open.c:1452 [inline] __se_sys_openat fs/open.c:1447 [inline] __x64_sys_openat+0x138/0x170 fs/open.c:1447 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94 Fixes: `8d0b94afdc` ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister") Fixes: `78df76a065` ("ipv4: take rt_uncached_lock only if needed") Reported-by: syzbot+179fc225724092b8b2b2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6964cdf2.050a0220.eaf7.009d.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260112103825.3810713-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-13 19:08:18 -08:00
Ido Schimmel	b853b94e84	ipv6: Allow for nexthop device mismatch with "onlink" IPv4 allows for a nexthop device mismatch when the "onlink" keyword is specified: # ip link add name dummy1 up type dummy # ip address add 192.0.2.1/24 dev dummy1 # ip link add name dummy2 up type dummy # ip route add 198.51.100.0/24 nexthop via 192.0.2.2 dev dummy2 Error: Nexthop has invalid gateway. # ip route add 198.51.100.0/24 nexthop via 192.0.2.2 dev dummy2 onlink # echo $? 0 This seems to be consistent with the description of "onlink" in the ip-route man page: "Pretend that the nexthop is directly attached to this link, even if it does not match any interface prefix". On the other hand, IPv6 rejects a nexthop device mismatch, even when "onlink" is specified: # ip link add name dummy1 up type dummy # ip address add 2001:db8:1::1/64 dev dummy1 # ip link add name dummy2 up type dummy # ip route add 2001:db8:10::/64 nexthop via 2001:db8:1::2 dev dummy2 RTNETLINK answers: No route to host # ip route add 2001:db8:10::/64 nexthop via 2001:db8:1::2 dev dummy2 onlink Error: Nexthop has invalid gateway or device mismatch. This is intentional according to commit `fc1e64e109` ("net/ipv6: Add support for onlink flag") which added IPv6 "onlink" support and states that "any unicast gateway is allowed as long as the gateway is not a local address and if it resolves it must match the given device". The condition was later relaxed in commit `4ed591c8ab` ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route") to allow for a nexthop device mismatch if the gateway address is resolved via the default route: # ip link add name dummy1 up type dummy # ip route add ::/0 dev dummy1 # ip link add name dummy2 up type dummy # ip route add 2001:db8:10::/64 nexthop via 2001:db8:1::2 dev dummy2 RTNETLINK answers: No route to host # ip route add 2001:db8:10::/64 nexthop via 2001:db8:1::2 dev dummy2 onlink # echo $? 0 While the decision to forbid a nexthop device mismatch in IPv6 seems to be intentional, it is unclear why it was made. Especially when it differs from IPv4 and seems to go against the intended behavior of "onlink". Therefore, relax the condition further and allow for a nexthop device mismatch when "onlink" is specified: # ip link add name dummy1 up type dummy # ip address add 2001:db8:1::1/64 dev dummy1 # ip link add name dummy2 up type dummy # ip route add 2001:db8:10::/64 nexthop via 2001:db8:1::2 dev dummy2 onlink # echo $? 0 The motivating use case is the fact that FRR would like to be able to configure overlay routes of the following form: # ip route add <host-Z> vrf <VRF> encap ip id <ID> src <VTEP-A> dst <VTEP-Z> via <VTEP-Z> dev vxlan0 onlink Where vxlan0 is in the default VRF in which "VTEP-Z" is reachable via one of the underlay routes (e.g., via swpX). Without this patch, the above only works with IPv4, but not with IPv6. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260111120813.159799-5-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-13 18:57:34 -08:00
Maxime Chevallier	589e934d27	net: phy: Introduce PHY ports representation Ethernet provides a wide variety of layer 1 protocols and standards for data transmission. The front-facing ports of an interface have their own complexity and configurability. Introduce a representation of these front-facing ports. The current code is minimalistic and only support ports controlled by PHY devices, but the plan is to extend that to SFP as well as raw Ethernet MACs that don't use PHY devices. This minimal port representation allows describing the media and number of pairs of a BaseT port. From that information, we can derive the linkmodes usable on the port, which can be used to limit the capabilities of an interface. For now, the port pairs and medium is derived from devicetree, defined by the PHY driver, or populated with default values (as we assume that all PHYs expose at least one port). The typical example is 100M ethernet. 100BaseTX works using only 2 pairs on a Cat 5 cables. However, in the situation where a 10/100/1000 capable PHY is wired to its RJ45 port through 2 pairs only, we have no way of detecting that. The "max-speed" DT property can be used, but a more accurate representation can be used : mdi { connector-0 { media = "BaseT"; pairs = <2>; }; }; From that information, we can derive the max speed reachable on the port. Another benefit of having that is to avoid vendor-specific DT properties (micrel,fiber-mode or ti,fiber-mode). This basic representation is meant to be expanded, by the introduction of port ops, userspace listing of ports, and support for multi-port devices. Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260108080041.553250-4-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-13 18:52:34 -08:00
Maxime Chevallier	3f25ff7409	net: ethtool: Introduce ETHTOOL_LINK_MEDIUM_* values In an effort to have a better representation of Ethernet ports, introduce enumeration values representing the various ethernet Mediums. This is part of the 802.3 naming convention, for example : 1000 Base T 4 \| \| \| \| \| \| \| \_ pairs (4) \| \| \___ Medium (T == Twisted Copper Pairs) \| \_______ Baseband transmission \____________ Speed Other example : 10000 Base K X 4 \| \| \_ lanes (4) \| \___ encoding (BaseX is 8b/10b while BaseR is 66b/64b) \_____ Medium (K is backplane ethernet) In the case of representing a physical port, only the medium and number of pairs should be relevant. One exception would be 1000BaseX, which is currently also used as a medium in what appears to be any of 1000BaseSX, 1000BaseCX, 1000BaseLX, 1000BaseEX, 1000BaseBX10 and some other. This was reflected in the mediums associated with the 1000BaseX linkmode. These mediums are set in the net/ethtool/common.c lookup table that maintains a list of all linkmodes with their number of pairs, medium, encoding, speed and duplex. One notable exception to this is 100BaseT Ethernet. It emcompasses 100BaseTX, which is a 2-pairs protocol but also 100BaseT4, that will also work on 4-pairs cables. As we don't make a disctinction between these, the lookup table contains 2 sets of pair numbers, indicating the min number of pairs for a protocol to work and the "nominal" number of pairs as well. Another set of exceptions are linkmodes such 100000baseLR4_ER4, where the same link mode seems to represent 100GBaseLR4 and 100GBaseER4. The macro __DEFINE_LINK_MODE_PARAMS_MEDIUMS is here used to populate the .mediums bitfield with all appropriate mediums. Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260108080041.553250-3-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-13 18:52:34 -08:00
Allison Henderson	4716af3897	net/rds: Give each connection path its own workqueue RDS was written to require ordered workqueues for "cp->cp_wq": Work is executed in the order scheduled, one item at a time. If these workqueues are shared across connections, then work executed on behalf of one connection blocks work scheduled for a different and unrelated connection. Luckily we don't need to share these workqueues. While it obviously makes sense to limit the number of workers (processes) that ought to be allocated on a system, a workqueue that doesn't have a rescue worker attached, has a tiny footprint compared to the connection as a whole: A workqueue costs ~900 bytes, including the workqueue_struct, pool_workqueue, workqueue_attrs, wq_node_nr_active and the node_nr_active flex array. Each connection can have up to 8 (RDS_MPATH_WORKERS) paths for a worst case of ~7 KBytes per connection. While an RDS/IB connection totals only ~5 MBytes. So we're getting a signficant performance gain (90% of connections fail over under 3 seconds vs. 40%) for a less than 0.02% overhead. RDS doesn't even benefit from the additional rescue workers: of all the reasons that RDS blocks workers, allocation under memory pressue is the least of our concerns. And even if RDS was stalling due to the memory-reclaim process, the work executed by the rescue workers are highly unlikely to free up any memory. If anything, they might try to allocate even more. By giving each connection path its own workqueues, we allow RDS to better utilize the unbound workers that the system has available. Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Link: https://patch.msgid.link/20260109224843.128076-3-achender@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 12:27:03 +01:00
Allison Henderson	d327e2e74a	net/rds: Add per cp work queue This patch adds a per connection workqueue which can be initialized and used independently of the globally shared rds_wq. This patch is the first in a series that aims to address tcp ack timeouts during the tcp socket shutdown sequence. This initial refactoring lays the ground work needed to alleviate queue congestion during heavy reads and writes. The independently managed queues will allow shutdowns and reconnects respond more quickly before the peer(s) timeout waiting for the proper acks. Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Link: https://patch.msgid.link/20260109224843.128076-2-achender@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 12:27:03 +01:00
Jonas Köppeler	1bddd758ba	net/sched: sch_cake: share shaper state across sub-instances of cake_mq This commit adds shared shaper state across the cake instances beneath a cake_mq qdisc. It works by periodically tracking the number of active instances, and scaling the configured rate by the number of active queues. The scan is lockless and simply reads the qlen and the last_active state variable of each of the instances configured beneath the parent cake_mq instance. Locking is not required since the values are only updated by the owning instance, and eventual consistency is sufficient for the purpose of estimating the number of active queues. The interval for scanning the number of active queues is set to 200 us. We found this to be a good tradeoff between overhead and response time. For a detailed analysis of this aspect see the Netdevconf talk: https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-5-8d613fece5d8@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 11:54:29 +01:00
Toke Høiland-Jørgensen	87826c0183	net/sched: sch_cake: Share config across cake_mq sub-qdiscs This adds support for configuring the cake_mq instance directly, sharing the config across the cake sub-qdiscs. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-4-8d613fece5d8@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 11:54:29 +01:00
Toke Høiland-Jørgensen	ebc65a873e	net/sched: sch_cake: Add cake_mq qdisc for using cake on mq devices Add a cake_mq qdisc which installs cake instances on each hardware queue on a multi-queue device. This is just a copy of sch_mq that installs cake instead of the default qdisc on each queue. Subsequent commits will add sharing of the config between cake instances, as well as a multi-queue aware shaper algorithm. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-3-8d613fece5d8@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 11:54:29 +01:00
Toke Høiland-Jørgensen	bc0ce2bad3	net/sched: sch_cake: Factor out config variables into separate struct Factor out all the user-configurable variables into a separate struct and embed it into struct cake_sched_data. This is done in preparation for sharing the configuration across multiple instances of cake in an mq setup. No functional change is intended with this patch. Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-2-8d613fece5d8@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 11:54:29 +01:00
Toke Høiland-Jørgensen	8b27fd66f5	net/sched: Export mq functions for reuse To enable the cake_mq qdisc to reuse code from the mq qdisc, export a bunch of functions from sch_mq. Split common functionality out from some functions so it can be composed with other code, and export other functions wholesale. To discourage wanton reuse, put the symbols into a new NET_SCHED_INTERNAL namespace, and a sch_priv.h header file. No functional change intended. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-1-8d613fece5d8@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 11:54:29 +01:00
Eric Dumazet	ffe4ccd359	net: add net.core.qdisc_max_burst In blamed commit, I added a check against the temporary queue built in __dev_xmit_skb(). Idea was to drop packets early, before any spinlock was acquired. if (unlikely(defer_count > READ_ONCE(q->limit))) { kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP); return NET_XMIT_DROP; } It turned out that HTB Qdisc has a zero q->limit. HTB limits packets on a per-class basis. Some of our tests became flaky. Add a new sysctl : net.core.qdisc_max_burst to control how many packets can be stored in the temporary lockless queue. Also add a new QDISC_BURST_DROP drop reason to better diagnose future issues. Thanks Neal ! Fixes: `100dfa74ca` ("net: dev_queue_xmit() llist adoption") Reported-and-bisected-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20260107104159.3669285-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 10:12:11 +01:00
Eric Dumazet	0391ab577c	net: add skbuff_clear() helper clang is unable to inline the memset() calls in net/core/skbuff.c when initializing allocated sk_buff. memset(skb, 0, offsetof(struct sk_buff, tail)); This is unfortunate, because: 1) calling external memset_orig() helper adds a call/ret and typical setup cost. 2) offsetof(struct sk_buff, tail) == 0xb8 = 0x80 + 0x38 On x86_64, memset_orig() performs two 64 bytes clear, then has to loop 7 times to clear the final 56 bytes. skbuff_clear() makes sure the minimal and optimal code is generated. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260109203836.1667441-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-12 19:30:57 -08:00
Thorsten Blum	e405b3c9d4	net: ipconfig: Remove outdated comment and indent code block The comment has been around ever since commit `1da177e4c3` ("Linux-2.6.12-rc2") and can be removed. Remove it and indent the code block accordingly. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20260109121128.170020-2-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-12 19:11:35 -08:00
Bobby Eshleman	5c024716f5	net: devmem: convert binding refcount to percpu_ref Convert net_devmem_dmabuf_binding refcount from refcount_t to percpu_ref to optimize common-case reference counting on the hot path. The typical devmem workflow involves binding a dmabuf to a queue (acquiring the initial reference on binding->ref), followed by high-volume traffic where every skb fragment acquires a reference. Eventually traffic stops and the unbind operation releases the initial reference. Additionally, the high traffic hot path is often multi-core. This access pattern is ideal for percpu_ref as the first and last reference during bind/unbind normally book-ends activity in the hot path. __net_devmem_dmabuf_binding_free becomes the percpu_ref callback invoked when the last reference is dropped. kperf test: - 4MB message sizes - 60s of workload each run - 5 runs - 4 flows Throughput: Before: 45.31 GB/s (+/- 3.17 GB/s) After: 48.67 GB/s (+/- 0.01 GB/s) Picking throughput-matched kperf runs (both before and after matched at ~48 GB/s) for apples-to-apples comparison: Summary (averaged across 4 workers): TX worker CPU idle %: Before: 34.44% After: 87.13% RX worker CPU idle %: Before: 5.38% After: 9.73% kperf before: client: == Source client: Tx 98.100 Gbps (735764807680 bytes in 60001149 usec) client: Tx102.798 Gbps (770996961280 bytes in 60001149 usec) client: Tx101.534 Gbps (761517834240 bytes in 60001149 usec) client: Tx 82.794 Gbps (620966707200 bytes in 60001149 usec) client: net CPU 56: usr: 0.01% sys: 0.12% idle:17.06% iow: 0.00% irq: 9.89% sirq:72.91% client: app CPU 60: usr: 0.08% sys:63.30% idle:36.24% iow: 0.00% irq: 0.30% sirq: 0.06% client: net CPU 57: usr: 0.03% sys: 0.08% idle:75.68% iow: 0.00% irq: 2.96% sirq:21.23% client: app CPU 61: usr: 0.06% sys:67.67% idle:31.94% iow: 0.00% irq: 0.28% sirq: 0.03% client: net CPU 58: usr: 0.01% sys: 0.06% idle:76.87% iow: 0.00% irq: 2.84% sirq:20.19% client: app CPU 62: usr: 0.06% sys:69.78% idle:29.79% iow: 0.00% irq: 0.30% sirq: 0.05% client: net CPU 59: usr: 0.06% sys: 0.16% idle:74.97% iow: 0.00% irq: 3.76% sirq:21.03% client: app CPU 63: usr: 0.06% sys:59.82% idle:39.80% iow: 0.00% irq: 0.25% sirq: 0.05% client: == Target client: Rx 98.092 Gbps (735764807680 bytes in 60006084 usec) client: Rx102.785 Gbps (770962161664 bytes in 60006084 usec) client: Rx101.523 Gbps (761499566080 bytes in 60006084 usec) client: Rx 82.783 Gbps (620933136384 bytes in 60006084 usec) client: net CPU 2: usr: 0.00% sys: 0.01% idle:24.51% iow: 0.00% irq: 1.67% sirq:73.79% client: app CPU 6: usr: 1.51% sys:96.43% idle: 1.13% iow: 0.00% irq: 0.36% sirq: 0.55% client: net CPU 1: usr: 0.00% sys: 0.01% idle:25.18% iow: 0.00% irq: 1.99% sirq:72.80% client: app CPU 5: usr: 2.21% sys:94.54% idle: 2.54% iow: 0.00% irq: 0.38% sirq: 0.30% client: net CPU 3: usr: 0.00% sys: 0.01% idle:26.34% iow: 0.00% irq: 2.12% sirq:71.51% client: app CPU 7: usr: 2.22% sys:94.28% idle: 2.52% iow: 0.00% irq: 0.59% sirq: 0.37% client: net CPU 0: usr: 0.00% sys: 0.03% idle: 0.00% iow: 0.00% irq:10.44% sirq:89.51% client: app CPU 4: usr: 2.39% sys:81.46% idle:15.33% iow: 0.00% irq: 0.50% sirq: 0.30% kperf after: client: == Source client: Tx 99.257 Gbps (744447016960 bytes in 60001303 usec) client: Tx101.013 Gbps (757617131520 bytes in 60001303 usec) client: Tx 88.179 Gbps (661357854720 bytes in 60001303 usec) client: Tx101.002 Gbps (757533245440 bytes in 60001303 usec) client: net CPU 56: usr: 0.00% sys: 0.01% idle: 6.22% iow: 0.00% irq: 8.68% sirq:85.06% client: app CPU 60: usr: 0.08% sys:12.56% idle:87.21% iow: 0.00% irq: 0.08% sirq: 0.05% client: net CPU 57: usr: 0.00% sys: 0.05% idle:69.53% iow: 0.00% irq: 2.02% sirq:28.38% client: app CPU 61: usr: 0.11% sys:13.40% idle:86.36% iow: 0.00% irq: 0.08% sirq: 0.03% client: net CPU 58: usr: 0.00% sys: 0.03% idle:70.04% iow: 0.00% irq: 3.38% sirq:26.53% client: app CPU 62: usr: 0.10% sys:11.46% idle:88.31% iow: 0.00% irq: 0.08% sirq: 0.03% client: net CPU 59: usr: 0.01% sys: 0.06% idle:71.18% iow: 0.00% irq: 1.97% sirq:26.75% client: app CPU 63: usr: 0.10% sys:13.10% idle:86.64% iow: 0.00% irq: 0.10% sirq: 0.05% client: == Target client: Rx 99.250 Gbps (744415182848 bytes in 60003297 usec) client: Rx101.006 Gbps (757589737472 bytes in 60003297 usec) client: Rx 88.171 Gbps (661319475200 bytes in 60003297 usec) client: Rx100.996 Gbps (757514792960 bytes in 60003297 usec) client: net CPU 2: usr: 0.00% sys: 0.01% idle:28.02% iow: 0.00% irq: 1.95% sirq:70.00% client: app CPU 6: usr: 2.03% sys:87.20% idle:10.04% iow: 0.00% irq: 0.37% sirq: 0.33% client: net CPU 3: usr: 0.00% sys: 0.00% idle:27.63% iow: 0.00% irq: 1.90% sirq:70.45% client: app CPU 7: usr: 1.78% sys:89.70% idle: 7.79% iow: 0.00% irq: 0.37% sirq: 0.34% client: net CPU 0: usr: 0.00% sys: 0.01% idle: 0.00% iow: 0.00% irq: 9.96% sirq:90.01% client: app CPU 4: usr: 2.33% sys:83.51% idle:13.24% iow: 0.00% irq: 0.64% sirq: 0.26% client: net CPU 1: usr: 0.00% sys: 0.01% idle:27.60% iow: 0.00% irq: 1.94% sirq:70.43% client: app CPU 5: usr: 1.88% sys:89.61% idle: 7.86% iow: 0.00% irq: 0.35% sirq: 0.27% Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260107-upstream-precpu-ref-v2-v2-1-a709f098b3dc@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-12 18:00:12 -08:00
Jakub Kicinski	669aa3e3fa	Merge tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== First set of changes for the current -next cycle, of note: - ath12k gets an overhaul to support multi-wiphy device wiphy and pave the way for future device support in the same driver (rather than splitting to ath13k) - mac80211 gets some better iteration macros * tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (120 commits) wifi: mac80211: remove width argument from ieee80211_parse_bitrates wifi: mac80211_hwsim: remove NAN by default wifi: mac80211: improve station iteration ergonomics wifi: mac80211: improve interface iteration ergonomics wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel wifi: mac80211: unexport ieee80211_get_bssid() wl1251: Replace strncpy with strscpy in wl1251_acx_fw_version wifi: iwlegacy: 3945-rs: remove redundant pointer check in il3945_rs_tx_status() and il3945_rs_get_rate() wifi: mac80211: don't send an unused argument to ieee80211_check_combinations wifi: libertas: fix WARNING in usb_tx_block wifi: mwifiex: Allocate dev name earlier for interface workqueue name wifi: wlcore: sdio: Use pm_ptr instead of #ifdef CONFIG_PM wifi: cfg80211: Fix use_for flag update on BSS refresh wifi: brcmfmac: rename function that frees vif wifi: brcmfmac: fix/add kernel-doc comments wifi: mac80211: Update csa_finalize to use link_id wifi: cfg80211: add cfg80211_stop_link() for per-link teardown wifi: ath12k: Skip DP peer creation for scan vdev wifi: ath12k: move firmware stats request outside of atomic context wifi: ath12k: add the missing RCU lock in ath12k_dp_tx_free_txbuf() ... ==================== Link: https://patch.msgid.link/20260112185836.378736-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-12 17:02:02 -08:00
Tetsuo Handa	ec69daabe4	bpf: Fix reference count leak in bpf_prog_test_run_xdp() syzbot is reporting unregister_netdevice: waiting for sit0 to become free. Usage count = 2 problem. A debug printk() patch found that a refcount is obtained at xdp_convert_md_to_buff() from bpf_prog_test_run_xdp(). According to commit `ec94670fcb` ("bpf: Support specifying ingress via xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md(). Therefore, we can consider that the error handling path introduced by commit `1c19499825` ("bpf: introduce frags support to bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md(). Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Fixes: `1c19499825` ("bpf: introduce frags support to bpf_prog_test_run_xdp()") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/af090e53-9d9b-4412-8acb-957733b3975c@I-love.SAKURA.ne.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 16:37:40 -08:00
Miri Korenblit	46e7ced3ef	wifi: mac80211: remove width argument from ieee80211_parse_bitrates The width parameter in ieee80211_parse_bitrates() is unused. Remove it. While at it, use the already fetched sband pointer as an argument instead of dereferencing it once again. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260108143257.d13dbbda93f0.Ie70b24af583e3812883b4004ce227e7af1646855@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-01-12 19:48:41 +01:00
Johannes Berg	f813117f20	wifi: mac80211: improve station iteration ergonomics Right now, the only way to iterate stations is to declare an iterator function, possibly data structure to use, and pass all that to the iteration helper function. This is annoying, and there's really no inherent need for it. Add a new for_each_station() macro that does the iteration in a more ergonomic way. To avoid even more exported functions, do the old ieee80211_iterate_stations_mtx() as an inline using the new way, which may also let the compiler optimise it a bit more, e.g. via inlining the iterator function. Link: https://patch.msgid.link/20260108143431.d2b641f6f6af.I4470024f7404446052564b15bcf8b3f1ada33655@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-01-12 19:48:17 +01:00
Johannes Berg	6b3bafa2bd	wifi: mac80211: improve interface iteration ergonomics Right now, the only way to iterate interfaces is to declare an iterator function, possibly data structure to use, and pass all that to the iteration helper function. This is annoying, and there's really no inherent need for it, except it was easier to implement with the iflist mutex, but that's not used much now. Add a new for_each_interface() macro that does the iteration in a more ergonomic way. To avoid even more exported functions, do the old ieee80211_iterate_active_interfaces_mtx() as an inline using the new way, which may also let the compiler optimise it a bit more, e.g. via inlining the iterator function. Also provide for_each_active_interface() for the common case of just iterating active interfaces. Link: https://patch.msgid.link/20260108143431.f2581e0c381a.Ie387227504c975c109c125b3c57f0bb3fdab2835@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-01-12 19:48:17 +01:00
Lachlan Hodges	e1cbdf78f6	wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel When sending a channel ensure we include the IEEE80211_CHAN_S1G_NO_PRIMARY flag. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20260109081439.3168-1-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-01-12 19:47:35 +01:00
Johannes Berg	391234eb48	wifi: mac80211: unexport ieee80211_get_bssid() This is only used within mac80211, and not even declared in a public header file. Don't export it. Link: https://patch.msgid.link/20260109095029.2b4d2fe53fc9.I9f5fa5c84cd42f749be0b87cc61dac8631c4c6d0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-01-12 19:47:25 +01:00
Miri Korenblit	33821a2b20	wifi: mac80211: don't send an unused argument to ieee80211_check_combinations When ieee80211_check_combinations is called with NULL as the chandef, the chanmode argument is not relevant. Send a don't care (0) instead. Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260111192411.9aa743647b43.I407b3d878d94464ce01e25f16c6e2b687bcd8b5a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-01-12 19:42:39 +01:00
Jakub Kicinski	c8a49a2f91	Merge tag 'for-net-2026-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_sync: enable PA Sync Lost event * tag 'for-net-2026-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci_sync: enable PA Sync Lost event ==================== Link: https://patch.msgid.link/20260109211949.236218-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-10 14:32:01 -08:00
Eric Dumazet	e67c577d89	ipv4: ip_gre: make ipgre_header() robust Analog to commit `db5b4e39c4` ("ip6_gre: make ip6gre_header() robust") Over the years, syzbot found many ways to crash the kernel in ipgre_header() [1]. This involves team or bonding drivers ability to dynamically change their dev->needed_headroom and/or dev->hard_header_len In this particular crash mld_newpack() allocated an skb with a too small reserve/headroom, and by the time mld_sendpack() was called, syzbot managed to attach an ipgre device. [1] skbuff: skb_under_panic: text:ffffffff89ea3cb7 len:2030915468 put:2030915372 head:ffff888058b43000 data:ffff887fdfa6e194 tail:0x120 end:0x6c0 dev:team0 kernel BUG at net/core/skbuff.c:213 ! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 1322 Comm: kworker/1:9 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Workqueue: mld mld_ifc_work RIP: 0010:skb_panic+0x157/0x160 net/core/skbuff.c:213 Call Trace: <TASK> skb_under_panic net/core/skbuff.c:223 [inline] skb_push+0xc3/0xe0 net/core/skbuff.c:2641 ipgre_header+0x67/0x290 net/ipv4/ip_gre.c:897 dev_hard_header include/linux/netdevice.h:3436 [inline] neigh_connected_output+0x286/0x460 net/core/neighbour.c:1618 NF_HOOK_COND include/linux/netfilter.h:307 [inline] ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247 NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318 mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855 mld_send_cr net/ipv6/mcast.c:2154 [inline] mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693 process_one_work kernel/workqueue.c:3257 [inline] process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421 kthread+0x711/0x8a0 kernel/kthread.c:463 ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 Fixes: `c544193214` ("GRE: Refactor GRE tunneling code.") Reported-by: syzbot+7c134e1c3aa3283790b9@syzkaller.appspotmail.com Closes: https://www.spinics.net/lists/netdev/msg1147302.html Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260108190214.1667040-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-10 12:06:22 -08:00
Eric Dumazet	eb74c19fe1	net: update netdev_lock_{type,name} Add missing entries in netdev_lock_type[] and netdev_lock_name[] : CAN, MCTP, RAWIP, CAIF, IP6GRE, 6LOWPAN, NETLINK, VSOCKMON, IEEE802154_MONITOR. Also add a WARN_ONCE() in netdev_lock_pos() to help future bug hunting next time a protocol is added without updating these arrays. Fixes: `1a33e10e4a` ("net: partially revert dynamic lockdep key changes") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260108093244.830280-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-09 18:14:32 -08:00
Eric Dumazet	81c734dae2	ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv() Blamed commit did not take care of VLAN encapsulations as spotted by syzbot [1]. Use skb_vlan_inet_prepare() instead of pskb_inet_may_pull(). [1] BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline] BUG: KMSAN: uninit-value in INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline] BUG: KMSAN: uninit-value in IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321 __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline] INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline] IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321 ip6ip6_dscp_ecn_decapsulate+0x16f/0x1b0 net/ipv6/ip6_tunnel.c:729 __ip6_tnl_rcv+0xed9/0x1b50 net/ipv6/ip6_tunnel.c:860 ip6_tnl_rcv+0xc3/0x100 net/ipv6/ip6_tunnel.c:903 gre_rcv+0x1529/0x1b90 net/ipv6/ip6_gre.c:-1 ip6_protocol_deliver_rcu+0x1c89/0x2c60 net/ipv6/ip6_input.c:438 ip6_input_finish+0x1f4/0x4a0 net/ipv6/ip6_input.c:489 NF_HOOK include/linux/netfilter.h:318 [inline] ip6_input+0x9c/0x330 net/ipv6/ip6_input.c:500 ip6_mc_input+0x7ca/0xc10 net/ipv6/ip6_input.c:590 dst_input include/net/dst.h:474 [inline] ip6_rcv_finish+0x958/0x990 net/ipv6/ip6_input.c:79 NF_HOOK include/linux/netfilter.h:318 [inline] ipv6_rcv+0xf1/0x3c0 net/ipv6/ip6_input.c:311 __netif_receive_skb_one_core net/core/dev.c:6139 [inline] __netif_receive_skb+0x1df/0xac0 net/core/dev.c:6252 netif_receive_skb_internal net/core/dev.c:6338 [inline] netif_receive_skb+0x57/0x630 net/core/dev.c:6397 tun_rx_batched+0x1df/0x980 drivers/net/tun.c:1485 tun_get_user+0x5c0e/0x6c60 drivers/net/tun.c:1953 tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999 new_sync_write fs/read_write.c:593 [inline] vfs_write+0xbe2/0x15d0 fs/read_write.c:686 ksys_write fs/read_write.c:738 [inline] __do_sys_write fs/read_write.c:749 [inline] __se_sys_write fs/read_write.c:746 [inline] __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746 x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was created at: slab_post_alloc_hook mm/slub.c:4960 [inline] slab_alloc_node mm/slub.c:5263 [inline] kmem_cache_alloc_node_noprof+0x9e7/0x17a0 mm/slub.c:5315 kmalloc_reserve+0x13c/0x4b0 net/core/skbuff.c:586 __alloc_skb+0x805/0x1040 net/core/skbuff.c:690 alloc_skb include/linux/skbuff.h:1383 [inline] alloc_skb_with_frags+0xc5/0xa60 net/core/skbuff.c:6712 sock_alloc_send_pskb+0xacc/0xc60 net/core/sock.c:2995 tun_alloc_skb drivers/net/tun.c:1461 [inline] tun_get_user+0x1142/0x6c60 drivers/net/tun.c:1794 tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999 new_sync_write fs/read_write.c:593 [inline] vfs_write+0xbe2/0x15d0 fs/read_write.c:686 ksys_write fs/read_write.c:738 [inline] __do_sys_write fs/read_write.c:749 [inline] __se_sys_write fs/read_write.c:746 [inline] __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746 x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f CPU: 0 UID: 0 PID: 6465 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(none) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 Fixes: `8d975c15c0` ("ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()") Reported-by: syzbot+d4dda070f833dc5dc89a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695e88b2.050a0220.1c677c.036d.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260107163109.4188620-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-09 18:14:08 -08:00
Eric Dumazet	b25a0b4a21	net: bridge: annotate data-races around fdb->{updated,used} fdb->updated and fdb->used are read and written locklessly. Add READ_ONCE()/WRITE_ONCE() annotations. Fixes: `31cbc39b63` ("net: bridge: add option to allow activity notifications for any fdb entries") Reported-by: syzbot+bfab43087ad57222ce96@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695e3d74.050a0220.1c677c.035f.GAE@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260108093806.834459-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-09 17:36:21 -08:00
Linus Torvalds	4621c338d3	Merge tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "A bunch of libceph fixes split evenly between memory safety and implementation correctness issues (all marked for stable) and a change in maintainers for CephFS: Slava and Alex have formally taken over Xiubo's role" * tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client: libceph: make calc_target() set t->paused, not just clear it libceph: reset sparse-read state in osd_fault() libceph: return the handler error from mon_handle_auth_done() libceph: make free_choose_arg_map() resilient to partial allocation ceph: update co-maintainers list in MAINTAINERS libceph: replace overzealous BUG_ON in osdmap_apply_incremental() libceph: prevent potential out-of-bounds reads in handle_auth_done()	2026-01-09 15:05:19 -10:00
Yang Li	ab749bfe6a	Bluetooth: hci_sync: enable PA Sync Lost event Enable the PA Sync Lost event mask to ensure PA sync loss is properly reported and handled. Fixes: `485e0626e5` ("Bluetooth: hci_event: Fix not handling PA Sync Lost event") Signed-off-by: Yang Li <yang.li@amlogic.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-01-09 16:03:57 -05:00
Eric Dumazet	872ac785e7	ipv4: ip_tunnel: spread netdev_lockdep_set_classes() Inspired by yet another syzbot report. IPv6 tunnels call netdev_lockdep_set_classes() for each tunnel type, while IPv4 currently centralizes netdev_lockdep_set_classes() call from ip_tunnel_init(). Make ip_tunnel_init() a macro, so that we have different lockdep classes per tunnel type. Fixes: `0bef512012` ("net: add netdev_lockdep_set_classes() to virtual drivers") Reported-by: syzbot+1240b33467289f5ab50b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/695d439f.050a0220.1c677c.0347.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260106172426.1760721-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-08 18:02:35 -08:00
Jakub Kicinski	59ba823e68	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.19-rc5). No conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-08 11:38:33 -08:00
Eric Dumazet	c92510f5e3	arp: do not assume dev_hard_header() does not change skb->head arp_create() is the only dev_hard_header() caller making assumption about skb->head being unchanged. A recent commit broke this assumption. Initialize @arp pointer after dev_hard_header() call. Fixes: `db5b4e39c4` ("ip6_gre: make ip6gre_header() robust") Reported-by: syzbot+58b44a770a1585795351@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260107212250.384552-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-08 09:04:24 -08:00
Jakub Kicinski	804809ae40	Merge tag 'wireless-2026-01-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Couple of fixes: - mac80211: - long-standing injection bug due to chanctx rework - more recent interface iteration issue - collect statistics before removing stations - hwsim: - fix NAN frequency typo (potential NULL ptr deref) - fix locking of radio lock (needs softirqs disabled) - wext: - ancient issue with compat and events copying some uninitialized stack data to userspace * tag 'wireless-2026-01-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mac80211: collect station statistics earlier when disconnect wifi: mac80211: restore non-chanctx injection behaviour wifi: mac80211_hwsim: disable BHs for hwsim_radio_lock wifi: mac80211: don't iterate not running interfaces wifi: mac80211_hwsim: fix typo in frequency notification wifi: avoid kernel-infoleak from struct iw_point ==================== Link: https://patch.msgid.link/20260108140141.139687-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-08 08:49:24 -08:00

1 2 3 4 5 ...

82775 Commits