linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-03 21:45:08 -04:00

Author	SHA1	Message	Date
Russell King (Oracle)	98f9928843	net: stmmac: qcom-ethqos: use rgmii_clock() to set the link clock The link clock operates at twice the RGMII clock rate. Therefore, we can use the rgmii_clock() helper to set this clock rate. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tlRMK-004Vsx-Ss@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-24 12:53:48 -08:00
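The relationship described in the commit above (link clock at twice the RGMII clock rate) can be sketched as follows. This is an illustrative Python model, not the driver's C code: the rate table mirrors the usual RGMII clock rates (2.5 MHz, 25 MHz, 125 MHz), and `ethqos_link_clock_hz` is a hypothetical name chosen for this sketch.

```python
# Illustrative model only. The real rgmii_clock() helper is a C function in
# the kernel; this table mirrors the standard RGMII clock rates per link speed.
RGMII_CLOCK_HZ = {
    10: 2_500_000,      # 10 Mb/s   -> 2.5 MHz
    100: 25_000_000,    # 100 Mb/s  -> 25 MHz
    1000: 125_000_000,  # 1000 Mb/s -> 125 MHz
}


def rgmii_clock(speed_mbps: int) -> int:
    """Return the RGMII clock rate in Hz for a link speed in Mb/s."""
    if speed_mbps not in RGMII_CLOCK_HZ:
        raise ValueError(f"unsupported RGMII speed: {speed_mbps}")
    return RGMII_CLOCK_HZ[speed_mbps]


def ethqos_link_clock_hz(speed_mbps: int) -> int:
    """Hypothetical helper: the link clock runs at twice the RGMII clock."""
    return 2 * rgmii_clock(speed_mbps)
```

Under these assumptions a 1000 Mb/s link asks for a 250 MHz link clock, and unsupported speeds are rejected rather than silently mapped.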
Jason Wang	e13b6da704	virtio-net: tweak for better TX performance in NAPI mode There are several issues existed in start_xmit(): - Transmitted packets need to be freed before sending a packet, this introduces delay and increases the average packets transmit time. This also increase the time that spent in holding the TX lock. - Notification is enabled after free_old_xmit_skbs() which will introduce unnecessary interrupts if TX notification happens on the same CPU that is doing the transmission now (actually, virtio-net driver are optimized for this case). So this patch tries to avoid those issues by not cleaning transmitted packets in start_xmit() when TX NAPI is enabled and disable notifications even more aggressively. Notification will be since the beginning of the start_xmit(). But we can't enable delayed notification after TX is stopped as we will lose the notifications. Instead, the delayed notification needs is enabled after the virtqueue is kicked for best performance. Performance numbers: 1) single queue 2 vcpus guest with pktgen_sample03_burst_single_flow.sh (burst 256) + testpmd (rxonly) on the host: - When pinning TX IRQ to pktgen VCPU: split virtqueue PPS were increased 55% from 6.89 Mpps to 10.7 Mpps and 32% TX interrupts were eliminated. Packed virtqueue PPS were increased 50% from 7.09 Mpps to 10.7 Mpps, 99% TX interrupts were eliminated. - When pinning TX IRQ to VCPU other than pktgen: split virtqueue PPS were increased 96% from 5.29 Mpps to 10.4 Mpps and 45% TX interrupts were eliminated; Packed virtqueue PPS were increased 78% from 6.12 Mpps to 10.9 Mpps and 99% TX interrupts were eliminated. 2) single queue 1 vcpu guest + vhost-net/TAP on the host: single session netperf from guest to host shows 82% improvement from 31Gb/s to 58Gb/s, %stddev were reduced from 34.5% to 1.9% and 88% of TX interrupts were eliminated. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-02-24 08:10:24 +00:00
Jakub Kicinski	b66e19dcf6	Merge branch 'mctp-add-mctp-over-usb-hardware-transport-binding' Jeremy Kerr says: ==================== mctp: Add MCTP-over-USB hardware transport binding Add an implementation of the DMTF standard DSP0283, providing an MCTP channel over high-speed USB. This is a fairly trivial first implementation, in that we only submit one tx and one rx URB at a time. We do accept multi-packet transfers, but do not yet generate them on transmit. Of course, questions and comments are most welcome, particularly on the USB interfaces. v2: https://lore.kernel.org/20250212-dev-mctp-usb-v2-0-76e67025d764@codeconstruct.com.au v1: https://lore.kernel.org/20250206-dev-mctp-usb-v1-0-81453fe26a61@codeconstruct.com.au ==================== Link: https://patch.msgid.link/20250221-dev-mctp-usb-v3-0-3353030fe9cc@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:45:26 -08:00
Jeremy Kerr	0791c0327a	net: mctp: Add MCTP USB transport driver Add an implementation for DMTF DSP0283, which defines a MCTP-over-USB transport. As per that spec, we're restricted to full speed mode, requiring 512-byte transfers. Each MCTP-over-USB interface is a peer-to-peer link to a single MCTP endpoint, so no physical addressing is required (of course, that MCTP endpoint may then bridge to further MCTP endpoints). Consequently, interfaces will report with no lladdr data: # mctp link dev lo index 1 address 00:00:00:00:00:00 net 1 mtu 65536 up dev mctpusb0 index 6 address none net 1 mtu 68 up This is a simple initial implementation, with single rx & tx urbs, and no multi-packet tx transfers - although we do accept multi-packet rx from the device. Includes suggested fixes from Santosh Puranik <spuranik@nvidia.com>. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Cc: Santosh Puranik <spuranik@nvidia.com> Link: https://patch.msgid.link/20250221-dev-mctp-usb-v3-2-3353030fe9cc@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:45:21 -08:00
Jeremy Kerr	dcc35baae7	usb: Add base USB MCTP definitions Upcoming changes will add a USB host (and later gadget) driver for the MCTP-over-USB protocol. Add a header that provides common definitions for protocol support: the packet header format and a few framing definitions. Add a define for the MCTP class code, as per https://usb.org/defined-class-codes. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20250221-dev-mctp-usb-v3-1-3353030fe9cc@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:45:21 -08:00
Sean Anderson	e6a532185d	net: cadence: macb: Implement BQL Implement byte queue limits to allow queuing disciplines to account for packets enqueued in the ring buffer but not yet transmitted. There are a separate set of transmit functions for AT91 that I haven't touched since I don't have hardware to test on. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250220164257.96859-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:39:08 -08:00
Russell King (Oracle)	3e401818c8	net: stmmac: print stmmac_init_dma_engine() errors using netdev_err() stmmac_init_dma_engine() uses dev_err() which leads to errors being reported as e.g: dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed stmmac_init_dma_engine() is only called from stmmac_hw_setup() which itself uses netdev_err(), and we will have a net_device setup. So, change the dev_err() to netdev_err() to give consistent error messages. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tl5y1-004UgG-8X@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:29:05 -08:00
Hangbin Liu	465b210fdc	selftests: fib_nexthops: do not mark skipped tests as failed The current test marks all unexpected return values as failed and sets ret to 1. If a test is skipped, the entire test also returns 1, incorrectly indicating failure. To fix this, add a skipped variable and set ret to 4 if it was previously 0. Otherwise, keep ret set to 1. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250220085326.1512814-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:23:29 -08:00
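The return-code handling described in the commit above can be modelled like this. The actual test is a shell script; this Python sketch only illustrates the aggregation rule, using the kselftest convention that 4 means "skipped". The function name `aggregate` is hypothetical, and the choice to let a later failure override an earlier skip is an assumption of this sketch.

```python
KSFT_PASS = 0
KSFT_FAIL = 1
KSFT_SKIP = 4  # kselftest convention for a skipped test


def aggregate(results):
    """Fold per-test return codes into one overall exit code.

    A skip only promotes the overall result from 0 to 4; it never
    overrides an earlier failure. Any other non-zero code marks failure.
    """
    ret = KSFT_PASS
    for rc in results:
        if rc == KSFT_PASS:
            continue
        if rc == KSFT_SKIP:
            if ret == KSFT_PASS:
                ret = KSFT_SKIP
        else:
            # Assumption: a failure always wins, even after a skip.
            ret = KSFT_FAIL
    return ret
```

With this rule a run that only passes and skips exits with 4 instead of 1, so a harness can tell "skipped" apart from "failed".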
Jakub Kicinski	27422c3738	Merge branch 'net-fib_rules-add-dscp-mask-support' Ido Schimmel says: ==================== net: fib_rules: Add DSCP mask support In some deployments users would like to encode path information into certain bits of the IPv6 flow label, the UDP source port and the DSCP field and use this information to route packets accordingly. Redirecting traffic to a routing table based on specific bits in the DSCP field is not currently possible. Only exact match is currently supported by FIB rules. This patchset extends FIB rules to match on the DSCP field with an optional mask. Patches #1-#5 gradually extend FIB rules to match on the DSCP field with an optional mask. Patch #6 adds test cases for the new functionality. iproute2 support can be found here [1]. [1] https://github.com/idosch/iproute2/tree/submit/fib_rule_mask_v1 ==================== Link: https://patch.msgid.link/20250220080525.831924-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:55 -08:00
Ido Schimmel	e818d1d1a6	selftests: fib_rule_tests: Add DSCP mask match tests Add tests for FIB rules that match on DSCP with a mask. Test both good and bad flows and both the input and output paths. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: dscp redirect to table [ OK ] TEST: rule6 check: dscp no redirect to table [ OK ] TEST: rule6 del by pref: dscp redirect to table [ OK ] TEST: rule6 check: iif dscp redirect to table [ OK ] TEST: rule6 check: iif dscp no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp redirect to table [ OK ] TEST: rule6 check: dscp masked redirect to table [ OK ] TEST: rule6 check: dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp masked redirect to table [ OK ] [...] Tests passed: 316 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-7-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:49 -08:00
Ido Schimmel	0df1328eaf	netlink: specs: Add FIB rule DSCP mask attribute Add new DSCP mask attribute to the spec. Example: # ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_rule.yaml \ --do newrule \ --json '{"family": 2, "dscp": 10, "dscp-mask": 63, "action": 1, "table": 1}' None $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_rule.yaml \ --dump getrule --json '{"family": 2}' --output-json | jq '.[]' [...] { "table": 1, "suppress-prefixlen": "0xffffffff", "protocol": 0, "priority": 32765, "dscp": 10, "dscp-mask": "0x3f", "family": 2, "dst-len": 0, "src-len": 0, "tos": 0, "action": "to-tbl", "flags": 0 } [...] Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-6-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:48 -08:00
Ido Schimmel	ea8af1affd	net: fib_rules: Enable DSCP mask usage Allow user space to configure FIB rules that match on DSCP with a mask, now that support has been added to the IPv4 and IPv6 address families. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-5-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:48 -08:00
Ido Schimmel	c29165c272	ipv6: fib_rules: Add DSCP mask matching Extend IPv6 FIB rules to match on DSCP using a mask. Unlike IPv4, also initialize the DSCP mask when a non-zero 'tos' is specified as there is no difference in matching between 'tos' and 'dscp'. As a side effect, this makes it possible to match on 'dscp 0', like in IPv4. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:48 -08:00
Ido Schimmel	2ae00699b3	ipv4: fib_rules: Add DSCP mask matching Extend IPv4 FIB rules to match on DSCP using a mask. The mask is only set in rules that match on DSCP (not TOS) and initialized to cover the entire DSCP field if the mask attribute is not specified. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:47 -08:00
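Masked DSCP matching as described in the commit above can be sketched as follows. This is a Python illustration, not the kernel's C implementation: `rule_matches` is a hypothetical name, and masking both sides of the comparison is an assumption of this sketch. The default mask of `0x3f` reflects the commit's statement that an unspecified mask covers the entire 6-bit DSCP field.

```python
DSCP_FULL_MASK = 0x3F  # DSCP is a 6-bit field


def rule_matches(packet_dscp, rule_dscp, rule_mask=None):
    """Return True if the packet's DSCP matches the rule under its mask.

    When no mask is given, the full 6-bit mask is used, which reduces
    to the exact-match behaviour that existed before mask support.
    """
    mask = DSCP_FULL_MASK if rule_mask is None else rule_mask
    # Assumption of this sketch: compare only the bits selected by the mask.
    return (packet_dscp & mask) == (rule_dscp & mask)
```

For example, a rule with `dscp 0b001010` and mask `0b001111` matches any packet whose low four DSCP bits are `1010`, regardless of the upper two bits.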
Ido Schimmel	ca4edd969a	net: fib_rules: Add DSCP mask attribute Add an attribute that allows matching on DSCP with a mask. Matching on DSCP with a mask is needed in deployments where users encode path information into certain bits of the DSCP field. Temporarily set the type of the attribute to 'NLA_REJECT' while support is being added. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:47 -08:00
Jakub Kicinski	e87700965a	Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-02-20 We've added 19 non-merge commits during the last 8 day(s) which contain a total of 35 files changed, 1126 insertions(+), 53 deletions(-). The main changes are: 1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing 2) Add network TX timestamping support to BPF sock_ops, from Jason Xing 3) Add TX metadata Launch Time support, from Song Yoong Siang * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: igc: Add launch time support to XDP ZC igc: Refactor empty frame insertion for launch time support net: stmmac: Add launch time support to XDP ZC selftests/bpf: Add launch time request to xdp_hw_metadata xsk: Add launch time hardware offload support to XDP Tx metadata selftests/bpf: Add simple bpf tests in the tx path for timestamping feature bpf: Support selective sampling for bpf timestamping bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING bpf: Disable unsafe helpers in TX timestamping callbacks bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping bpf: Add networking timestamping support to bpf_get/setsockopt() selftests/bpf: Add rto max for bpf_setsockopt test bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt ==================== Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:59:47 -08:00
Ziwei Xiao	4b9c7d8fa1	gve: Add RSS cache for non RSS device option scenario Not all the devices have the capability for the driver to query for the registered RSS configuration. The driver can discover this by checking the relevant device option during setup. If it cannot, the driver needs to store the RSS config cache and directly return such cache when queried by the ethtool. RSS config is inited when driver probes. Also the default RSS config will be adjusted when there is RX queue count change. At this point, only keys of GVE_RSS_KEY_SIZE and indirection tables of GVE_RSS_INDIR_SIZE are supported. Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250219200451.3348166-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:55:39 -08:00
Thorsten Blum	c451715d78	net/rds: Replace deprecated strncpy() with strscpy_pad() strncpy() is deprecated for NUL-terminated destination buffers. Use strscpy_pad() instead and remove the manual NUL-termination. Compile-tested only. Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Tested-by: Allison Henderson <allison.henderson@oracle.com> Link: https://patch.msgid.link/20250219224730.73093-2-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:51:26 -08:00
Jakub Kicinski	376cd9a2ab	Merge branch 'net-improve-netns-handling-in-rtnetlink' Xiao Liang says: ==================== net: Improve netns handling in rtnetlink This patch series includes some netns-related improvements and fixes for rtnetlink, to make link creation more intuitive: 1) Creating link in another net namespace doesn't conflict with link names in current one. 2) Refector rtnetlink link creation. Create link in target namespace directly. So that # ip link add netns ns1 link-netns ns2 tun0 type gre ... will create tun0 in ns1, rather than create it in ns2 and move to ns1. And don't conflict with another interface named "tun0" in current netns. Patch 01 avoids link name conflict in different netns. To achieve 2), there're mainly 3 steps: - Patch 02 packs newlink() parameters into a struct, including the original "src_net" along with more netns context. No semantic changes are introduced. - Patch 03 ~ 09 converts device drivers to use the explicit netns extracted from params. - Patch 10 ~ 11 removes the old netns parameter, and converts rtnetlink to create device in target netns directly. Patch 12 ~ 13 adds some tests for link name and link netns. --- Please note there're some issues found in current code: - In amt_newlink() drivers/net/amt.c: amt->net = net; ... amt->stream_dev = dev_get_by_index(net, ... Uses net, but amt_lookup_upper_dev() only searches in dev_net. So the AMT device may not be properly deleted if it's in a different netns from lower dev. - In lowpan_newlink() in net/ieee802154/6lowpan/core.c: wdev = dev_get_by_index(dev_net(ldev), nla_get_u32(tb[IFLA_LINK])); Looks for IFLA_LINK in dev_net, but in theory the ifindex is defined in link netns. And thanks to Kuniyuki for fixing related issues in gtp and pfcp: https://lore.kernel.org/netdev/20250110014754.33847-1-kuniyu@amazon.com/ v9: https://lore.kernel.org/20250210133002.883422-1-shaw.leon@gmail.com v8: https://lore.kernel.org/20250113143719.7948-1-shaw.leon@gmail.com v7: https://lore.kernel.org/20250104125732.17335-1-shaw.leon@gmail.com v6: https://lore.kernel.org/20241218130909.2173-1-shaw.leon@gmail.com v5: https://lore.kernel.org/20241209140151.231257-1-shaw.leon@gmail.com v4: https://lore.kernel.org/20241118143244.1773-1-shaw.leon@gmail.com v3: https://lore.kernel.org/20241113125715.150201-1-shaw.leon@gmail.com v2: https://lore.kernel.org/20241107133004.7469-1-shaw.leon@gmail.com v1: https://lore.kernel.org/20241023023146.372653-1-shaw.leon@gmail.com ==================== Link: https://patch.msgid.link/20250219125039.18024-1-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:07 -08:00
Xiao Liang	85cb3711ac	selftests: net: Add test cases for link and peer netns - Add test for creating link in another netns when a link of the same name and ifindex exists in current netns. - Add test to verify that link is created in target netns directly - no link new/del events should be generated in link netns or current netns. - Add test cases to verify that link-netns is set as expected for various drivers and combination of namespace-related parameters. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-14-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Xiao Liang	0303294162	selftests: net: Add python context manager for netns entering Change netns of current thread and switch back on context exit. For example: with NetNSEnter("ns1"): ip("link add dummy0 type dummy") The command be executed in netns "ns1". Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-13-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Xiao Liang	7ca486d08a	rtnetlink: Create link directly in target net namespace Make rtnl_newlink_create() create device in target namespace directly. Avoid extra netns change when link netns is provided. Device drivers has been converted to be aware of link netns, that is not assuming device netns is and link netns is the same when ops->newlink() is called. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-12-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Xiao Liang	9c0fc091dc	rtnetlink: Remove "net" from newlink params Now that devices have been converted to use the specific netns instead of ambiguous "net", let's remove it from newlink parameters. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-11-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Xiao Liang	5314e3d684	net: xfrm: Use link netns in newlink() of rtnl_link_ops When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-10-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Xiao Liang	5e72ce3e39	net: ipv6: Use link netns in newlink() of rtnl_link_ops When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-9-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:02 -08:00
Xiao Liang	db014522f3	net: ipv6: Init tunnel link-netns before registering dev Currently some IPv6 tunnel drivers set tnl->net to dev_net(dev) in ndo_init(), which is called in register_netdevice(). However, it lacks the context of link-netns when we enable cross-net tunnels at device registration time. Let's move the init of tunnel link-netns before register_netdevice(). ip6_gre has already initialized netns, so just remove the redundant assignment. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-8-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:02 -08:00
Xiao Liang	eacb116053	net: ip_tunnel: Use link netns in newlink() of rtnl_link_ops When link_net is set, use it as link netns instead of dev_net(). This prepares for rtnetlink core to create device in target netns directly, in which case the two namespaces may be different. Convert common ip_tunnel_newlink() to accept an extra link netns argument. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-7-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:02 -08:00
Xiao Liang	9e17b2a1a0	net: ip_tunnel: Don't set tunnel->net in ip_tunnel_init() ip_tunnel_init() is called from register_netdevice(). In all code paths reaching here, tunnel->net should already have been set (either in ip_tunnel_newlink() or __ip_tunnel_create()). So don't set it again. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-6-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:02 -08:00
Xiao Liang	3533717581	ieee802154: 6lowpan: Validate link netns in newlink() of rtnl_link_ops Device denoted by IFLA_LINK is in link_net (IFLA_LINK_NETNSID) or source netns by design, but 6lowpan uses dev_net. Note dev->netns_local is set to true and currently link_net is implemented via a netns change. These together effectively reject IFLA_LINK_NETNSID. This patch adds a validation to ensure link_net is either NULL or identical to dev_net. Thus it would be fine to continue using dev_net when rtnetlink core begins to create devices directly in target netns. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-5-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:02 -08:00
Xiao Liang	cf517ac16a	net: Use link/peer netns in newlink() of rtnl_link_ops Add two helper functions - rtnl_newlink_link_net() and rtnl_newlink_peer_net() for netns fallback logic. Peer netns falls back to link netns, and link netns falls back to source netns. Convert the use of params->net in netdevice drivers to one of the helper functions for clarity. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-4-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:02 -08:00
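The fallback chain in the commit above (peer netns falls back to link netns, link netns falls back to source netns) can be sketched like this. It is a Python illustration of the logic only: the `NewlinkParams` class and both function names are hypothetical stand-ins for the kernel's C struct and the `rtnl_newlink_link_net()` / `rtnl_newlink_peer_net()` helpers, and namespaces are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NewlinkParams:
    """Hypothetical stand-in for the newlink() parameter struct."""
    src_net: str                    # netns of the netlink socket, always set
    link_net: Optional[str] = None  # set only when a link netns was requested
    peer_net: Optional[str] = None  # set only when a peer netns was requested


def newlink_link_net(params: NewlinkParams) -> str:
    """Link netns, falling back to the source netns."""
    return params.link_net if params.link_net is not None else params.src_net


def newlink_peer_net(params: NewlinkParams) -> str:
    """Peer netns, falling back to the link netns (and then to source)."""
    if params.peer_net is not None:
        return params.peer_net
    return newlink_link_net(params)
```

Routing every driver through two small helpers keeps the fallback order in one place, which is the clarity benefit the commit message points to.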
Xiao Liang	69c7be1b90	rtnetlink: Pack newlink() params into struct There are 4 net namespaces involved when creating links: - source netns - where the netlink socket resides, - target netns - where to put the device being created, - link netns - netns associated with the device (backend), - peer netns - netns of peer device. Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes in the request. +------------+-------------------+---------+---------+ | peer netns | IFLA_LINK_NETNSID | src_net | dev_net | +------------+-------------------+---------+---------+ | | absent | source | target | | absent +-------------------+---------+---------+ | | present | link | link | +------------+-------------------+---------+---------+ | | absent | peer | target | | present +-------------------+---------+---------+ | | present | peer | link | +------------+-------------------+---------+---------+ When IFLA_LINK_NETNSID is present, the device is created in link netns first and then moved to target netns. This has some side effects, including extra ifindex allocation, ifname validation and link events. These could be avoided if we create it in target netns from the beginning. On the other hand, the meaning of src_net parameter is ambiguous. It varies depending on how parameters are passed. It is the effective link (or peer netns) by design, but some drivers ignore it and use dev_net instead. To provide more netns context for drivers, this patch packs existing newlink() parameters, along with the source netns, link netns and peer netns, into a struct. The old "src_net" is renamed to "net" to avoid confusion with real source netns, and will be deprecated later. The use of src_net are converted to params->net trivially. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-3-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:02 -08:00
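The table in the commit above, which gives the pre-patch `src_net` and `dev_net` for each combination of attributes, can be encoded as a small lookup. This is a Python restatement of the table only; `legacy_newlink_nets` is a hypothetical name, and the namespaces are returned as the labels used in the table rather than real namespace objects.

```python
from typing import Tuple


def legacy_newlink_nets(peer_present: bool,
                        link_netnsid_present: bool) -> Tuple[str, str]:
    """Return (src_net, dev_net) as labelled in the commit's table.

    peer_present:          a peer netns was given in the request
    link_netnsid_present:  IFLA_LINK_NETNSID was given in the request
    """
    if peer_present:
        src_net = "peer"
    elif link_netnsid_present:
        src_net = "link"
    else:
        src_net = "source"
    # With IFLA_LINK_NETNSID the device was first created in the link netns
    # and only afterwards moved to the target netns.
    dev_net = "link" if link_netnsid_present else "target"
    return src_net, dev_net
```

Written out this way, the ambiguity the commit describes is visible: `src_net` can mean the source, link, or peer namespace depending on which attributes happen to be present.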
Xiao Liang	ec061546c6	rtnetlink: Lookup device in target netns when creating link When creating link, lookup for existing device in target net namespace instead of current one. For example, two links created by: # ip link add dummy1 type dummy # ip link add netns ns1 dummy1 type dummy should have no conflict since they are in different namespaces. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-2-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:01 -08:00
Jakub Kicinski	4fe67dd2d5	Merge branch 'dt-bindings-net-realtek-rtl9301-switch' Chris Packham says: ==================== dt-bindings: net: realtek,rtl9301-switch (schema part) This is my attempt at trying to sort out the mess I've created with the RTL9300 switch dt-bindings. Some context is available on [1] and [2]. The first patch just moves the binding from mfd/ to net/ (with an adjustment of the internal path name). The next two patches are successors to patches already sent as part of the series [3]. [1] - https://lore.kernel.org/lkml/20250204-eccentric-deer-of-felicity-02b7ee@krzk-bin/ [2] - https://lore.kernel.org/lkml/4e3c5d83-d215-4eff-bf02-6d420592df8f@alliedtelesis.co.nz/ [3] - https://lore.kernel.org/lkml/20250204030249.1965444-1-chris.packham@alliedtelesis.co.nz/ ==================== Link: https://patch.msgid.link/20250218195216.1034220-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:08:05 -08:00
Chris Packham	96757457da	dt-bindings: net: Add Realtek MDIO controller Add dtschema for the MDIO controller found in the RTL9300 Ethernet switch. The controller is slightly unusual in that direct MDIO communication is not possible. We model the MDIO controller with the MDIO buses as child nodes and the PHYs as children of the buses. The mapping of switch port number to MDIO bus/addr requires the ethernet-ports sibling to provide the mapping via the phy-handle property. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250218195216.1034220-4-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:07:15 -08:00
Chris Packham	92575a2182	dt-bindings: net: Add switch ports and interrupts to RTL9300 Add bindings for the ethernet-switch and interrupt properties for the RTL9300. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250218195216.1034220-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:07:15 -08:00
Chris Packham	3fa337651d	dt-bindings: net: Move realtek,rtl9301-switch to net Initially realtek,rtl9301-switch was placed under mfd/ because it had some non-switch related blocks (specifically i2c and reset) but with a bit more review it has become apparent that this was wrong and the binding should live under net/. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Lee Jones <lee@kernel.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250218195216.1034220-2-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:07:15 -08:00
Birger Koblitz	a850355610	net: sfp: add quirk for 2.5G OEM BX SFP The OEM SFP-2.5G-BX10-D/U SFP module pair is meant to operate with 2500Base-X. However, in their EEPROM they incorrectly specify: Transceiver codes : 0x00 0x12 0x00 0x00 0x12 0x00 0x01 0x05 0x00 BR, Nominal : 2500MBd Use sfp_quirk_2500basex for this module to allow 2500Base-X mode anyway. Tested on BananaPi R3. Signed-off-by: Birger Koblitz <mail@birger-koblitz.de> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/20250218-b4-lkmsub-v1-1-1e51dcabed90@birger-koblitz.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:04:56 -08:00
Heiner Kallweit	bb3bb6c92e	net: phy: remove unused feature array declarations After `12d5151be0` ("net: phy: remove leftovers from switch to linkmode bitmaps") the following declarations are unused and can be removed too. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/b2883c75-4108-48f2-ab73-e81647262bc2@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 18:16:21 -08:00
Jakub Kicinski	56b06a71fc	Merge branch 'selftests-drv-net-improve-the-queue-test-for-xsk' Jakub Kicinski says: ==================== selftests: drv-net: improve the queue test for XSK We see some flakes in the XSK test: Exception\| Traceback (most recent call last): Exception\| File "/home/virtme/testing-18/tools/testing/selftests/net/lib/py/ksft.py", line 218, in ksft_run Exception\| case(*args) Exception\| File "/home/virtme/testing-18/tools/testing/selftests/drivers/net/./queues.py", line 53, in check_xdp Exception\| ksft_eq(q['xsk'], {}) Exception\| KeyError: 'xsk' I think it's because the method of running the helper in the background is racy. Add more solid infra for waiting for a background helper to be initialized. v1: https://lore.kernel.org/20250218195048.74692-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250219234956.520599-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	932a9249f7	selftests: drv-net: rename queues check_xdp to check_xsk The test is for AF_XDP, we refer to AF_XDP as XSK. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	4fde839846	selftests: drv-net: improve the use of ksft helpers in XSK queue test Avoid exceptions when xsk attr is not present, and add a proper ksft helper for "not in" condition. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	7147713799	selftests: drv-net: add a way to wait for a local process We use wait_port_listen() extensively to wait for a process we spawned to be ready. Not all processes will open listening sockets. Add a method of explicitly waiting for a child to be ready. Pass a FD to the spawned process and wait for it to write a message to us. FD number is passed via KSFT_READY_FD env variable. Similarly use KSFT_WAIT_FD to let the child process wait for a sign that we are done and the child should exit. Sending a signal to a child with shell=True can get tricky. Make use of this method in the queues test to make it less flaky. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	d3726ab45c	selftests: drv-net: probe for AF_XDP sockets more explicitly Separate the support check from socket binding for easier refactoring. Use: ./helper - - just to probe if we can open the socket. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	bab59dcf71	selftests: drv-net: add missing new line in xdp_helper Kurt and Joe report missing new line at the end of Usage. Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	dabd31baa3	selftests: drv-net: use cfg.rpath() in netlink xsk attr test The cfg.rpath() helper was recently added to make formatting paths for helper binaries easier. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	846742f7e3	selftests: drv-net: add a warning for bkg + shell + terminate Joe Damato reports that some shells will fork before running the command when python does "sh -c $cmd", while bash on my machine does an exec of $cmd directly. This will have implications for our ability to terminate the child process on various configurations of bash and other shells. Warn about using bkg(... shell=True, terminate=True); most background commands can hopefully exit cleanly (exit_wait). Link: https://lore.kernel.org/Z7Yld21sv_Ip3gQx@LQ3V64L9R2 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:57:29 -08:00
Arnd Bergmann	ca57d1c56f	octeontx2: hide unused label A previous patch introduces a build-time warning when CONFIG_DCB is disabled: drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c: In function 'otx2_probe': drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c:3217:1: error: label 'err_free_zc_bmap' defined but not used [-Werror=unused-label] drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c: In function 'otx2vf_probe': drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c:740:1: error: label 'err_free_zc_bmap' defined but not used [-Werror=unused-label] Add the same #ifdef check around it. Fixes: `efabce2901` ("octeontx2-pf: AF_XDP zero copy receive support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Suman Ghosh <sumang@marvell.com> Link: https://patch.msgid.link/20250219162239.1376865-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:38:02 -08:00
Charalampos Mitrodimas	8279a8dacf	net: phy: qt2025: Fix hardware revision check comment Correct the hardware revision check comment in the QT2025 driver. The revision value was documented as 0x3b instead of the correct 0xb3, which matches the actual comparison logic in the code. Reviewed-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Trevor Gross <tmgross@umich.edu> Link: https://patch.msgid.link/20250219-qt2025-comment-fix-v2-1-029f67696516@posteo.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:37:32 -08:00
Jakub Kicinski	ac8f0aff41	Merge branch 'mlx5-misc-enhancements-2025-02-19' Tariq Toukan says: ==================== mlx5 misc enhancements 2025-02-19 This small series enhances the mlx5 ethtool link speed code (no functional change), in addition to a Kconfig description enhancement. ==================== Link: https://patch.msgid.link/20250219114112.403808-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:36:12 -08:00
Shahar Shitrit	9c362aafda	net/mlx5e: Separate extended link modes request from link modes type selection The function ext_requested() serves two distinct purposes: it checks if extended link modes were requested, and it selects whether to use extended or legacy link modes. This change separates these two purposes. Now, ext_link_mode_requested() is used directly for checking if extended link modes are requested, while the selection of extended modes is handled independently based on the autonegotiation status. By making this distinction, the logic for determining whether to select extended or legacy link modes is clearer. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:36:09 -08:00
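
Note on commit 7147713799 above: the readiness handshake it describes (hand the child a file descriptor, publish its number in KSFT_READY_FD, block until the child writes) can be sketched roughly as follows. This is a minimal illustration, not the selftest library's actual code: `spawn_and_wait_ready` and `CHILD_CODE` are hypothetical names, the one-byte message is an assumption, and the KSFT_WAIT_FD half of the protocol is omitted. It assumes a POSIX system.

```python
import os
import subprocess
import sys

# Hypothetical child program: it looks up the fd number in KSFT_READY_FD,
# writes one byte to announce readiness, and exits.
CHILD_CODE = (
    "import os\n"
    "fd = int(os.environ['KSFT_READY_FD'])\n"
    "os.write(fd, b'r')\n"
    "os.close(fd)\n"
)


def spawn_and_wait_ready(argv):
    """Start argv and block until it writes to the fd named by KSFT_READY_FD."""
    rfd, wfd = os.pipe()
    env = dict(os.environ, KSFT_READY_FD=str(wfd))
    # pass_fds keeps wfd open in the child under the same fd number.
    proc = subprocess.Popen(argv, env=env, pass_fds=(wfd,))
    os.close(wfd)  # the parent only needs the read end
    msg = os.read(rfd, 1)  # empty bytes means the child closed without writing
    os.close(rfd)
    if not msg:
        proc.wait()
        raise RuntimeError("child exited before signalling readiness")
    return proc


if __name__ == "__main__":
    child = spawn_and_wait_ready([sys.executable, "-c", CHILD_CODE])
    child.wait()
```

Because the parent closes its copy of the write end, the read returns empty bytes if the child dies early, so the parent fails fast instead of hanging. Nothing here depends on the child opening a listening socket, which is the gap the commit message says wait_port_listen() leaves.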

1336830 Commits