linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-14 10:02:33 -04:00

Author	SHA1	Message	Date
Ryan Wanner	dce32ece3b	net: cadence: macb: Expose REFCLK as a device tree property The RMII and RGMII can both support internal or external provided REFCLKs 50MHz and 125MHz respectively. Since this is dependent on the board that the SoC is on this needs to be set via the device tree. This property flag is checked in the MACB DT node so the REFCLK cap is configured the correct way for the RMII or RGMII is configured on the board. Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Link: https://patch.msgid.link/7f9b65896d6b7b48275bc527b72a16347f8ce10a.1752510727.git.Ryan.Wanner@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 17:28:37 -07:00
Ryan Wanner	1b7531c094	dt-bindings: net: cdns,macb: Add external REFCLK property REFCLK can be provided by an external source so this should be exposed by a DT property. The REFCLK is used for RMII and in some SoCs that use this driver the RGMII 125MHz clk can also be provided by an external source. Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/d558467c4d5b27fb3135ffdead800b14cd9c6c0a.1752510727.git.Ryan.Wanner@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 17:28:36 -07:00
Jakub Kicinski	27b0286d00	Merge branch 'selftest-net-add-selftest-for-netpoll' Breno Leitao says: ==================== selftest: net: Add selftest for netpoll I am submitting a new selftest for the netpoll subsystem specifically targeting the case where the RX is polling in the TX path, which is a case that we don't have any test in the tree today. This is done when netpoll_poll_dev() called, and this test creates a scenario when that is probably. The test does the following: 1) Configuring a single RX/TX queue to increase contention on the interface. 2) Generating background traffic to saturate the network, mimicking real-world congestion. 3) Sending netconsole messages to trigger netpoll polling and monitor its behavior. 4) Using dynamic netconsole targets via configfs, with the ability to delete and recreate targets during the test. 5) Running bpftrace in parallel to verify that netpoll_poll_dev() is called when expected. If it is called, then the test passes, otherwise the test is marked as skipped. In order to achieve it, I stole Jakub's bpftrace helper from [1], and did some small changes that I found useful to use the helper. So, this patchset basically contains: 1) The code stolen from Jakub 2) Improvements on bpftrace() helper 3) The selftest itself Link: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/ [1] ==================== Link: https://patch.msgid.link/20250714-netpoll_test-v7-0-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 17:25:51 -07:00
Breno Leitao	b3019343e4	selftests: net: add netpoll basic functionality test Add a basic selftest for the netpoll polling mechanism, specifically targeting the netpoll poll() side. The test creates a scenario where network transmission is running at maximum speed, and netpoll needs to poll the NIC. This is achieved by: 1. Configuring a single RX/TX queue to create contention 2. Generating background traffic to saturate the interface 3. Sending netconsole messages to trigger netpoll polling 4. Using dynamic netconsole targets via configfs 5. Delete and create new netconsole targets after some messages 6. Start a bpftrace in parallel to make sure netpoll_poll_dev() is called 7. If bpftrace exists and netpoll_poll_dev() was called, stop. The test validates a critical netpoll code path by monitoring traffic flow and ensuring netpoll_poll_dev() is called when the normal TX path is blocked. This addresses a gap in netpoll test coverage for a path that is tricky for the network stack. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250714-netpoll_test-v7-3-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 17:25:49 -07:00
Breno Leitao	fd2aadcefb	selftests: drv-net: Strip '@' prefix from bpftrace map keys The '@' prefix in bpftrace map keys is specific to bpftrace and can be safely removed when processing results. This patch modifies the bpftrace utility to strip the '@' from map keys before storing them in the result dictionary, making the keys more consistent with Python conventions. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250714-netpoll_test-v7-2-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 17:25:49 -07:00
Jakub Kicinski	3c561c547c	selftests: drv-net: add helper/wrapper for bpftrace bpftrace is very useful for low level driver testing. perf or trace-cmd would also do for collecting data from tracepoints, but they require much more post-processing. Add a wrapper for running bpftrace and sanitizing its output. bpftrace has JSON output, which is great, but it prints loose objects and in a slightly inconvenient format. We have to read the objects line by line, and while at it return them indexed by the map name. Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250714-netpoll_test-v7-1-c0220cfaa63e@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 17:25:49 -07:00
Yue Haibing	6c628ed95e	ipv6: mcast: Simplify mld_clear_{report\|query}() Use __skb_queue_purge() instead of re-implementing it. Note that it uses kfree_skb_reason() instead of kfree_skb() internally, and pass SKB_DROP_REASON_QUEUE_PURGE drop reason to the kfree_skb tracepoint. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250715120709.3941510-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 16:14:42 -07:00
Stefano Garzarella	47ee43e4bf	vsock/test: fix vsock_ioctl_int() check for unsupported ioctl `vsock_do_ioctl` returns -ENOIOCTLCMD if an ioctl support is not implemented, like for SIOCINQ before commit `f7c7226592` ("vsock: Add support for SIOCINQ ioctl"). In net/socket.c, -ENOIOCTLCMD is re-mapped to -ENOTTY for the user space. So, our test suite, without that commit applied, is failing in this way: 34 - SOCK_STREAM ioctl(SIOCINQ) functionality...ioctl(21531): Inappropriate ioctl for device Return false in vsock_ioctl_int() to skip the test in this case as well, instead of failing. Fixes: `53548d6bff` ("test/vsock: Add retry mechanism to ioctl wrapper") Cc: niuxuewei.nxw@antgroup.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> Link: https://patch.msgid.link/20250715093233.94108-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 16:14:00 -07:00
Paolo Abeni	7eeabfb237	tcp: fix UaF in tcp_prune_ofo_queue() The CI reported a UaF in tcp_prune_ofo_queue(): BUG: KASAN: slab-use-after-free in tcp_prune_ofo_queue+0x55d/0x660 Read of size 4 at addr ffff8880134729d8 by task socat/20348 CPU: 0 UID: 0 PID: 20348 Comm: socat Not tainted 6.16.0-rc5-virtme #1 PREEMPT(full) Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x82/0xd0 print_address_description.constprop.0+0x2c/0x400 print_report+0xb4/0x270 kasan_report+0xca/0x100 tcp_prune_ofo_queue+0x55d/0x660 tcp_try_rmem_schedule+0x855/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fcf73ef2337 Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 RSP: 002b:00007ffd4f924708 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf73ef2337 RDX: 0000000000002000 RSI: 0000555f11d1a000 RDI: 0000000000000008 RBP: 0000555f11d1a000 R08: 0000000000002000 R09: 0000000000000000 R10: 0000000000000040 R11: 0000000000000246 R12: 0000000000000008 R13: 0000000000002000 R14: 0000555ee1a44570 R15: 0000000000002000 </TASK> Allocated by task 20348: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x59/0x70 kmem_cache_alloc_node_noprof+0x110/0x340 __alloc_skb+0x213/0x2e0 tcp_collapse+0x43f/0xff0 tcp_try_rmem_schedule+0x6b9/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 20348: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x38/0x50 kmem_cache_free+0x149/0x330 tcp_prune_ofo_queue+0x211/0x660 tcp_try_rmem_schedule+0x855/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff888013472900 which belongs to the cache skbuff_head_cache of size 232 The buggy address is located 216 bytes inside of freed 232-byte region [ffff888013472900, ffff8880134729e8) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13472 head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x80000000000040(head\|node=0\|zone=1) page_type: f5(slab) raw: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290 raw: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000 head: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290 head: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000 head: 0080000000000001 ffffea00004d1c81 00000000ffffffff 00000000ffffffff head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888013472880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888013472900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888013472980: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc ^ ffff888013472a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888013472a80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb Indeed tcp_prune_ofo_queue() is reusing the skb dropped a few lines above. The caller wants to enqueue 'in_skb', lets check space vs the latter. Fixes: `1d2fbaad7c` ("tcp: stronger sk_rcvbuf checks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: syzbot+865aca08c0533171bf6a@syzkaller.appspotmail.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/b78d2d9bdccca29021eed9a0e7097dd8dc00f485.1752567053.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 16:13:26 -07:00
Jakub Kicinski	511ad4c264	selftests: packetdrill: correct the expected timing in tcp_rcv_big_endseq Commit `f5fda1a868` ("selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt") added this test recently, but it's failing with: # tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec # script packet: 1.230105 . 1:1(0) ack 54001 win 0 # actual packet: 1.190101 . 1:1(0) ack 54001 win 0 It's unclear why the test expects the ack to be delayed. Correct it. Link: https://patch.msgid.link/20250715142849.959444-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 15:05:56 -07:00
Gal Pressman	410b0ace88	ethtool: Don't check for RXFH fields conflict when no input_xfrm is requested The requirement of ->get_rxfh_fields() in ethtool_set_rxfh() is there to verify that we have no conflict of input_xfrm with the RSS fields options, there is no point in doing it if input_xfrm is not supported/requested. This is under the assumption that a driver that supports input_xfrm will also support ->get_rxfh_fields(), so add a WARN_ON() to ethtool_check_ops() to verify it, and remove the op NULL check. This fixes the following error in mlx4_en, which doesn't support getting/setting RXFH fields. $ ethtool --set-rxfh-indir eth2 hfunc xor Cannot set RX flow hash configuration: Operation not supported Fixes: `72792461c8` ("net: ethtool: don't mux RXFH via rxnfc callbacks") Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250715140754.489677-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 15:03:56 -07:00
Hangbin Liu	3047957cc7	selftests: rtnetlink: fix addrlft test flakiness on power-saving systems Jakub reported that the rtnetlink test for the preferred lifetime of an address has become quite flaky. The issue started appearing around the 6.16 merge window in May, and the test fails with: FAIL: preferred_lft addresses remaining The flakiness might be related to power-saving behavior, as address expiration is handled by a "power-efficient" workqueue. To address this, use slowwait to check more frequently whether the address still exists. This reduces the likelihood of the system entering a low-power state during the test, improving reliability. Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250715043459.110523-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-16 15:03:26 -07:00
Jakub Kicinski	c3886ccaad	Merge branch 'net-hns3-use-seq_file-for-debugfs' Jijie Shao says: ==================== net: hns3: use seq_file for debugfs Arnd reported that there are two build warning for on-stasck buffer oversize. As Arnd's suggestion, using seq file way to avoid the stack buffer or kmalloc buffer allocating. v2: https://lore.kernel.org/20250711061725.225585-1-shaojijie@huawei.com v1: https://lore.kernel.org/20250708130029.1310872-1-shaojijie@huawei.com ==================== Link: https://patch.msgid.link/20250714061037.2616413-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:07 -07:00
Jian Shen	b0aabb3b1e	net: hns3: use seq_file for files in tx_bd_info/ and rx_bd_info/ in debugfs This patch use seq_file for the following nodes: tx_bd_queue_/rx_bd_queue_ This patch is the last modification to debugfs file. Unused functions and variables are removed together. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-11-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:05 -07:00
Yonglong Liu	9e1545b488	net: hns3: use seq_file for files in common/ of hclge layer This patch use seq_file for the following nodes: mng_tbl/loopback/interrupt_info/reset_info/imp_info/ncl_config/ mac_tnl_status/service_task_info/vlan_config/ptp_info This patch is the last modification to debugfs file of hclge layer. Unused functions and variables are removed together. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-10-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:05 -07:00
Jijie Shao	3945d94c9f	net: hns3: use seq_file for files in fd/ in debugfs This patch use seq_file for the following nodes: fd_tcam/fd_counter Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-9-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:05 -07:00
Jijie Shao	2363145ad8	net: hns3: use seq_file for files in reg/ in debugfs This patch use seq_file for the following nodes: bios_common/ssu/igu_egu/rpu/ncsi/rtc/ppp/rcb/tqp/mac/dcb Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-8-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:05 -07:00
Yonglong Liu	00f9ea261d	net: hns3: use seq_file for files in mac_list/ in debugfs This patch use seq_file for the following nodes: uc/mc Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-7-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:05 -07:00
Jian Shen	08a6476e28	net: hns3: use seq_file for files in tm/ in debugfs Use seq_file for files in debugfs. This is the first modification for reading in hclge_debugfs.c. This patch use seq_file for the following nodes: tm_nodes/tm_priority/tm_qset/tm_map/tm_pg/tm_port/tc_sch_info/ qos_pause_cfg/qos_pri_map/qos_dscp_map/qos_buf_cfg Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-6-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:04 -07:00
Jijie Shao	2b65524d10	net: hns3: use seq_file for files in common/ of hns3 layer This patch use seq_file for the following nodes: dev_info/coalesce_info/page_pool_info Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:04 -07:00
Jian Shen	eced3d1c41	net: hns3: use seq_file for files in queue/ in debugfs This patch use seq_file for the following nodes: rx_queue_info/queue_map Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:04 -07:00
Jian Shen	c557c18326	net: hns3: clean up the build warning in debugfs by use seq file Arnd reported that there are two build warning for on-stasck buffer oversize. As Arnd's suggestion, using seq file way to avoid the stack buffer or kmalloc buffer allocating. Reported-by: Arnd Bergmann <arnd@kernel.org> Closes: https://lore.kernel.org/all/20250610092113.2639248-1-arnd@kernel.org/ Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:04 -07:00
Jijie Shao	277ed0cc9d	net: hns3: remove tx spare info from debugfs. The tx spare info in debugfs is not very useful, and there are related statistics available for troubleshooting. This patch removes the tx spare info from debugfs. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250714061037.2616413-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:50:04 -07:00
Yue Haibing	ce6030afe4	ipv6: mcast: Remove unnecessary null check in ip6_mc_find_dev() These is no need to check null for idev before return NULL. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250714081732.3109764-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:34:01 -07:00
Al Viro	5cc7fce349	don't open-code kernel_accept() in rds_tcp_accept_one() rds_tcp_accept_one() starts with a pretty much verbatim copy of kernel_accept(). Might as well use the real thing... That code went into mainline in 2009, kernel_accept() had been added in Aug 2006, the copyright on rds/tcp_listen.c is "Copyright (c) 2006 Oracle", so it's entirely possible that it predates the introduction of kernel_accept(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://patch.msgid.link/20250713180134.GC1880847@ZenIV Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:19:54 -07:00
Andy Gospodarek	c34632dbb2	bnxt: move bnxt_hsi.h to include/linux/bnxt/hsi.h This moves bnxt_hsi.h contents to a common location so it can be properly referenced by bnxt_en, bnxt_re, and bnge. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250714170202.39688-1-gospo@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-15 16:03:24 -07:00
Paolo Abeni	55e8757c69	Merge branch 'net-mctp-improved-bind-handling' Matt Johnston says: ==================== net: mctp: Improved bind handling This series improves a couple of aspects of MCTP bind() handling. MCTP wasn't checking whether the same MCTP type was bound by multiple sockets. That would result in messages being received by an arbitrary socket, which isn't useful behaviour. Instead it makes more sense to have the duplicate binds fail, the same as other network protocols. An exception is made for more-specific binds to particular MCTP addresses. It is also useful to be able to limit a bind to only receive incoming request messages (MCTP TO bit set) from a specific peer+type, so that individual processes can communicate with separate MCTP peers. One example is a PLDM firmware update requester, which will initiate communication with a device, and then the device will connect back to the requester process. These limited binds are implemented by a connect() call on the socket prior to bind. connect() isn't used in the general case for MCTP, since a plain send() wouldn't provide the required MCTP tag argument for addressing. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> ==================== Link: https://patch.msgid.link/20250710-mctp-bind-v4-0-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:41 +02:00
Matt Johnston	e6d8e7dbc5	net: mctp: Add bind lookup test Test the preference order of bound socket matches with a series of test packets. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-8-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Matt Johnston	b7e28129b6	net: mctp: Test conflicts of connect() with bind() The addition of connect() adds new conflict cases to test. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-7-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Matt Johnston	3549eb08e5	net: mctp: Allow limiting binds to a peer address Prior to calling bind() a program may call connect() on a socket to restrict to a remote peer address. Using connect() is the normal mechanism to specify a remote network peer, so we use that here. In MCTP connect() is only used for bound sockets - send() is not available for MCTP since a tag must be provided for each message. The smctp_type must match between connect() and bind() calls. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-6-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Matt Johnston	1aeed732f4	net: mctp: Use hashtable for binds Ensure that a specific EID (remote or local) bind will match in preference to a MCTP_ADDR_ANY bind. This adds infrastructure for binding a socket to receive messages from a specific remote peer address, a future commit will expose an API for this. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-5-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Matt Johnston	4ec4b7fc04	net: mctp: Add test for conflicting bind()s Test pairwise combinations of bind addresses and types. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-4-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Matt Johnston	5000268c29	net: mctp: Treat MCTP_NET_ANY specially in bind() When a specific EID is passed as a bind address, it only makes sense to interpret with an actual network ID, so resolve that to the default network at bind time. For bind address of MCTP_ADDR_ANY, we want to be able to capture traffic to any network and address, so keep the current behaviour of matching traffic from any network interface (don't interpret MCTP_NET_ANY as the default network ID). Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-3-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Matt Johnston	3954502377	net: mctp: Prevent duplicate binds Disallow bind() calls that have the same arguments as existing bound sockets. Previously multiple sockets could bind() to the same type/local address, with an arbitrary socket receiving matched messages. This is only a partial fix, a future commit will define precedence order for MCTP_ADDR_ANY versus specific EID bind(), which are allowed to exist together. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-2-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Matt Johnston	3558ab79a2	net: mctp: mctp_test_route_extaddr_input cleanup The sock was not being released. Other than leaking, the stale socket will conflict with subsequent bind() calls in unrelated MCTP tests. Fixes: `46ee16462f` ("net: mctp: test: Add extaddr routing output test") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-1-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 12:08:39 +02:00
Yue Haibing	a8594c956c	ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec() Avoid duplicate non-null pointer check for pmc in mld_del_delrec(). No functional changes. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250714081949.3109947-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-15 11:02:53 +02:00
Jakub Kicinski	06baf9bfa6	Merge branch 'tcp-receiver-changes' Eric Dumazet says: ==================== tcp: receiver changes Before accepting an incoming packet: - Make sure to not accept a packet beyond advertized RWIN. If not, increment a new SNMP counter (LINUX_MIB_BEYOND_WINDOW) - ooo packets should update rcv_mss and tp->scaling_ratio. - Make sure to not accept packet beyond sk_rcvbuf limit. This series includes three associated packetdrill tests. ==================== Link: https://patch.msgid.link/20250711114006.480026-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:43 -07:00
Eric Dumazet	906893cf2c	selftests/net: packetdrill: add tcp_rcv_toobig.pkt Check that TCP receiver behavior after "tcp: stronger sk_rcvbuf checks" Too fat packet is dropped unless receive queue is empty. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-9-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:43 -07:00
Eric Dumazet	1d2fbaad7c	tcp: stronger sk_rcvbuf checks Currently, TCP stack accepts incoming packet if sizes of receive queues are below sk->sk_rcvbuf limit. This can cause memory overshoot if the packet is big, like an 1/2 MB BIG TCP one. Refine the check to take into account the incoming skb truesize. Note that we still accept the packet if the receive queue is empty, to not completely freeze TCP flows in pathological conditions. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-8-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:43 -07:00
Eric Dumazet	75dff0584c	tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb These functions to not modify the skb, add a const qualifier. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:43 -07:00
Eric Dumazet	445e0cc38d	selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt We make sure tcpi_rcv_mss and tp->scaling_ratio are correctly updated if no in-order packet has been received yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:43 -07:00
Eric Dumazet	38d7e44433	tcp: call tcp_measure_rcv_mss() for ooo packets tcp_measure_rcv_mss() is used to update icsk->icsk_ack.rcv_mss (tcpi_rcv_mss in tcp_info) and tp->scaling_ratio. Calling it from tcp_data_queue_ofo() makes sure these fields are updated, and permits a better tuning of sk->sk_rcvbuf, in the case a new flow receives many ooo packets. Fixes: `dfa2f04833` ("tcp: get rid of sysctl_tcp_adv_win_scale") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:43 -07:00
Eric Dumazet	f5fda1a868	selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt This test checks TCP behavior when receiving a packet beyond the window. It checks the new TcpExtBeyondWindow SNMP counter. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:43 -07:00
Eric Dumazet	6c758062c6	tcp: add LINUX_MIB_BEYOND_WINDOW Add a new SNMP MIB : LINUX_MIB_BEYOND_WINDOW Incremented when an incoming packet is received beyond the receiver window. nstat -az \| grep TcpExtBeyondWindow Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:42 -07:00
Eric Dumazet	9ca48d616e	tcp: do not accept packets beyond window Currently, TCP accepts incoming packets which might go beyond the offered RWIN. Add to tcp_sequence() the validation of packet end sequence. Add the corresponding check in the fast path. We relax this new constraint if the receive queue is empty, to not freeze flows from buggy peers. Add a new drop reason : SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:41:15 -07:00
Arnd Bergmann	a86eb2a60d	net: wangxun: fix LIBWX dependencies again Two more drivers got added that use LIBWX and cause a build warning WARNING: unmet direct dependencies detected for LIBWX Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_WANGXUN [=y] && PTP_1588_CLOCK_OPTIONAL [=m] Selected by [y]: - NGBEVF [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_WANGXUN [=y] && PCI_MSI [=y] Selected by [m]: - NGBE [=m] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_WANGXUN [=y] && PCI [=y] && PTP_1588_CLOCK_OPTIONAL [=m] ld: drivers/net/ethernet/wangxun/libwx/wx_lib.o: in function `wx_clean_tx_irq': wx_lib.c:(.text+0x5a68): undefined reference to `ptp_schedule_worker' ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_nway_reset': wx_ethtool.c:(.text+0x880): undefined reference to `phylink_ethtool_nway_reset' Add the same dependency on PTP_1588_CLOCK_OPTIONAL to the two driver using this library module, following the pattern from commit `8fa19c2c69` ("net: wangxun: fix LIBWX dependencies"). Fixes: `377d180bd7` ("net: wangxun: add txgbevf build") Fixes: `a0008a3658` ("net: wangxun: add ngbevf build") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Link: https://patch.msgid.link/20250711082339.1372821-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:15:32 -07:00
Samiullah Khawaja	2677010e77	Add support to set NAPI threaded for individual NAPI A net device has a threaded sysctl that can be used to enable threaded NAPI polling on all of the NAPI contexts under that device. Allow enabling threaded NAPI polling at individual NAPI level using netlink. Extend the netlink operation `napi-set` and allow setting the threaded attribute of a NAPI. This will enable the threaded polling on a NAPI context. Add a test in `nl_netdev.py` that verifies various cases of threaded NAPI being set at NAPI and at device level. Tested ./tools/testing/selftests/net/nl_netdev.py TAP version 13 1..7 ok 1 nl_netdev.empty_check ok 2 nl_netdev.lo_check ok 3 nl_netdev.page_pool_check ok 4 nl_netdev.napi_list_check ok 5 nl_netdev.dev_set_threaded ok 6 nl_netdev.napi_set_threaded ok 7 nl_netdev.nsim_rxq_reset_down # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250710211203.3979655-1-skhawaja@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 18:02:37 -07:00
Sean Anderson	a44312d58e	net: phy: Don't register LEDs for genphy If a PHY has no driver, the genphy driver is probed/removed directly in phy_attach/detach. If the PHY's ofnode has an "leds" subnode, then the LEDs will be (un)registered when probing/removing the genphy driver. This could occur if the leds are for a non-generic driver that isn't loaded for whatever reason. Synchronously removing the PHY device in phy_detach leads to the following deadlock: rtnl_lock() ndo_close() ... phy_detach() phy_remove() phy_leds_unregister() led_classdev_unregister() led_trigger_set() netdev_trigger_deactivate() unregister_netdevice_notifier() rtnl_lock() There is a corresponding deadlock on the open/register side of things (and that one is reported by lockdep), but it requires a race while this one is deterministic. Regular drivers do not have this problem since they are probed asynchronously (without RTNL held). Generic PHYs do not support LEDs anyway, so don't bother registering them. [JakubL this is a net-next version of commit `f0f2b992d8` ("net: phy: Don't register LEDs for genphy"), which uses APIs removed in -next.] Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250710201454.1280277-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 17:54:06 -07:00
Breno Leitao	ff2ac4df58	netdevsim: implement peer queue flow control Add flow control mechanism between paired netdevsim devices to stop the TX queue during high traffic scenarios. When a receive queue becomes congested (approaching NSIM_RING_SIZE limit), the corresponding transmit queue on the peer device is stopped using netif_subqueue_try_stop(). Once the receive queue has sufficient capacity again, the peer's transmit queue is resumed with netif_tx_wake_queue(). Key changes: * Add nsim_stop_peer_tx_queue() to pause peer TX when RX queue is full * Add nsim_start_peer_tx_queue() to resume peer TX when RX queue drains * Implement queue mapping validation to ensure TX/RX queue count match * Wake all queues during device unlinking to prevent stuck queues * Use RCU protection when accessing peer device references * wake the queues when changing the queue numbers * Remove IFF_NO_QUEUE given it will enqueue packets now The flow control only activates when devices have matching TX/RX queue counts to ensure proper queue mapping. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250711-netdev_flow_control-v3-1-aa1d5a155762@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 17:36:50 -07:00
Oscar Maes	5777d1871b	selftests: net: add test for variable PMTU in broadcast routes Added a test for variable PMTU in broadcast routes. This test uses iputils' ping and attempts to send a ping between two peers, which should result in a regular echo reply. This test will fail when the receiving peer does not receive the echo request due to a lack of packet fragmentation. Signed-off-by: Oscar Maes <oscmaes92@gmail.com> Link: https://patch.msgid.link/20250710142714.12986-2-oscmaes92@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-14 17:29:41 -07:00

1 2 3 4 5 ...

1370072 Commits