linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-13 19:59:34 -04:00

Author	SHA1	Message	Date
Heiner Kallweit	5f790208d6	net: phy: fixed_phy: remove two function stubs Remove stubs for fixed_phy_set_link_update() and fixed_phy_change_carrier() because all callers (actually just one per function) select config symbol FIXED_PHY. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/8729170d-cf39-48d9-aabc-c9aa4acda070@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-11 17:19:40 -07:00
Paolo Abeni	5adf6f2b99	Merge branch 'ipv4-icmp-fix-source-ip-derivation-in-presence-of-vrfs' Ido Schimmel says: ==================== ipv4: icmp: Fix source IP derivation in presence of VRFs Align IPv4 with IPv6 and in the presence of VRFs generate ICMP error messages with a source IP that is derived from the receiving interface and not from its VRF master. This is especially important when the error messages are "Time Exceeded" messages as it means that utilities like traceroute will show an incorrect packet path. Patches #1-#2 are preparations. Patch #3 is the actual change. Patches #4-#7 make small improvements in the existing traceroute test. Patch #8 extends the traceroute test with VRF test cases for both IPv4 and IPv6. Changes since v1 [1]: * Rebase. [1] https://lore.kernel.org/netdev/20250901083027.183468-1-idosch@nvidia.com/ ==================== Link: https://patch.msgid.link/20250908073238.119240-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:40 +02:00
Ido Schimmel	f7240999de	selftests: traceroute: Add VRF tests Create versions of the existing test cases where the routers generating the ICMP error messages are using VRFs. Check that the source IPs of these messages do not change in the presence of VRFs. IPv6 always behaved correctly, but IPv4 fails when reverting "ipv4: icmp: Fix source IP derivation in presence of VRFs". Without IPv4 change: # ./traceroute.sh TEST: IPv6 traceroute [ OK ] TEST: IPv6 traceroute with VRF [ OK ] TEST: IPv4 traceroute [ OK ] TEST: IPv4 traceroute with VRF [FAIL] traceroute did not return 1.0.3.1 $ echo $? 1 The test fails because the ICMP error message is sent with the VRF device's IP (1.0.4.1): # traceroute -n -s 1.0.1.3 1.0.2.4 traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets 1 1.0.4.1 0.165 ms 0.110 ms 0.103 ms 2 1.0.2.4 0.098 ms 0.085 ms 0.078 ms # traceroute -n -s 1.0.3.3 1.0.2.4 traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets 1 1.0.4.1 0.201 ms 0.138 ms 0.129 ms 2 1.0.2.4 0.123 ms 0.105 ms 0.098 ms With IPv4 change: # ./traceroute.sh TEST: IPv6 traceroute [ OK ] TEST: IPv6 traceroute with VRF [ OK ] TEST: IPv4 traceroute [ OK ] TEST: IPv4 traceroute with VRF [ OK ] $ echo $? 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250908073238.119240-9-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Ido Schimmel	2e6428100b	selftests: traceroute: Test traceroute with different source IPs When generating ICMP error messages, the kernel will prefer a source IP that is on the same subnet as the destination IP (see inet_select_addr()). Test this behavior by invoking traceroute with different source IPs and checking that the ICMP error message is generated with a source IP in the same subnet. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250908073238.119240-8-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Ido Schimmel	5c9c78224f	selftests: traceroute: Reword comment Both of the addresses are configured as primary addresses, but the kernel is expected to choose 10.0.1.1/24 as the source IP of the ICMP error message since it is on the same subnet as the destination IP of the message (10.0.1.3/24). Reword the comment to reflect that. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250908073238.119240-7-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Ido Schimmel	47efbac9b7	selftests: traceroute: Use require_command() Use require_command() so that the test will return SKIP (4) when a required command is not present. Before: # ./traceroute.sh SKIP: Could not run IPV6 test without traceroute6 SKIP: Could not run IPV4 test without traceroute $ echo $? 0 After: # ./traceroute.sh TEST: traceroute6 not installed [SKIP] $ echo $? 4 Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250908073238.119240-6-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Ido Schimmel	c068ba9d3d	selftests: traceroute: Return correct value on failure The test always returns success even if some tests were modified to fail. Fix by converting the test to use the appropriate library functions instead of using its own functions. Before: # ./traceroute.sh TEST: IPV6 traceroute [FAIL] TEST: IPV4 traceroute [ OK ] Tests passed: 1 Tests failed: 1 $ echo $? 0 After: # ./traceroute.sh TEST: IPv6 traceroute [FAIL] traceroute6 did not return 2000:102::2 TEST: IPv4 traceroute [ OK ] $ echo $? 1 Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250908073238.119240-5-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Ido Schimmel	4a8c416602	ipv4: icmp: Fix source IP derivation in presence of VRFs When the "icmp_errors_use_inbound_ifaddr" sysctl is enabled, the source IP of ICMP error messages should be the "primary address of the interface that received the packet that caused the icmp error". The IPv4 ICMP code determines this interface using inet_iif() which in the input path translates to skb->skb_iif. If the interface that received the packet is a VRF port, skb->skb_iif will contain the ifindex of the VRF device and not that of the receiving interface. This is because in the input path the VRF driver overrides skb->skb_iif with the ifindex of the VRF device itself (see vrf_ip_rcv()). As such, the source IP that will be chosen for the ICMP error message is either an address assigned to the VRF device itself (if present) or an address assigned to some VRF port, not necessarily the input or output interface. This behavior is especially problematic when the error messages are "Time Exceeded" messages as it means that utilities like traceroute will show an incorrect packet path. Solve this by determining the input interface based on the iif field in the control block, if present. This field is set in the input path to skb->skb_iif and is not later overridden by the VRF driver, unlike skb->skb_iif. This behavior is consistent with the IPv6 counterpart that already uses the iif from the control block. Reported-by: Andy Roulin <aroulin@nvidia.com> Reported-by: Rajkumar Srinivasan <rajsrinivasa@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250908073238.119240-4-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Ido Schimmel	0d3c4a4416	ipv4: icmp: Pass IPv4 control block structure as an argument to __icmp_send() __icmp_send() is used to generate ICMP error messages in response to various situations such as MTU errors (i.e., "Fragmentation Required") and too many hops (i.e., "Time Exceeded"). The skb that generated the error does not necessarily come from the IPv4 layer and does not always have a valid IPv4 control block in skb->cb. Therefore, commit `9ef6b42ad6` ("net: Add __icmp_send helper.") changed the function to take the IP options structure as argument instead of deriving it from the skb's control block. Some callers of this function such as icmp_send() pass the IP options structure from the skb's control block as in these call paths the control block is known to be valid, but other callers simply pass a zeroed structure. A subsequent patch will need __icmp_send() to access more information from the IPv4 control block (specifically, the ifindex of the input interface). As a preparation for this change, change the function to take the IPv4 control block structure as an argument instead of the IP options structure. This makes the function similar to its IPv6 counterpart that already takes the IPv6 control block structure as an argument. No functional changes intended. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250908073238.119240-3-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Ido Schimmel	cda276bcb9	ipv4: cipso: Simplify IP options handling in cipso_v4_error() When __ip_options_compile() is called with an skb, the IP options are parsed from the skb data into the provided IP option argument. This is in contrast to the case where the skb argument is NULL and the options are parsed from opt->__data. Given that cipso_v4_error() always passes an skb to __ip_options_compile(), there is no need to allocate an extra 40 bytes (maximum IP options size). Therefore, simplify the function by removing these extra bytes and make the function similar to ipv4_send_dest_unreach() which also calls both __ip_options_compile() and __icmp_send(). This is a preparation for changing the arguments being passed to __icmp_send(). No functional changes intended. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250908073238.119240-2-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:22:38 +02:00
Paolo Abeni	7f0b763b81	Merge branch 'net-xdp-handle-frags-with-unreadable-memory' Jakub Kicinski says: ==================== net: xdp: handle frags with unreadable memory Make XDP helpers compatible with unreadable memory. This is very similar to how we handle pfmemalloc frags today. Record the info in xdp_buf flags as frags get added and then update the skb once allocated. This series adds the unreadable memory metadata tracking to drivers using xdp_build_skb_from*() with no changes on the driver side - hence the only driver changes here are refactoring. Obviously, unreadable memory is incompatible with XDP today, but thanks to xdp_build_skb_from_buf() increasing number of drivers have a unified datapath, whether XDP is enabled or not. RFC: https://lore.kernel.org/20250812161528.835855-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250905221539.2930285-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:00:26 +02:00
Jakub Kicinski	6bffdc0f88	net: xdp: handle frags with unreadable memory We don't expect frags with unreadable memory to be presented to XDP programs today, but the XDP helpers are designed to be usable whether XDP is enabled or not. Support handling frags with unreadable memory. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250905221539.2930285-3-kuba@kernel.org Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:00:20 +02:00
Jakub Kicinski	1827f773e4	net: xdp: pass full flags to xdp_update_skb_shared_info() xdp_update_skb_shared_info() needs to update skb state which was maintained in xdp_buff / frame. Pass full flags into it, instead of breaking it out bit by bit. We will need to add a bit for unreadable frags (even though XDP doesn't support those, the driver paths may be common), at which point almost all call sites would become: xdp_update_skb_shared_info(skb, num_frags, sinfo->xdp_frags_size, MY_PAGE_SIZE * num_frags, xdp_buff_is_frag_pfmemalloc(xdp), xdp_buff_is_frag_unreadable(xdp)); Keep a helper for accessing the flags, in case we need to transform them somehow in the future (e.g. to cover up xdp_buff vs xdp_frame differences). While we are touching all callers - rename the helper to xdp_update_skb_frags_info(), previous name may have implied that it's shinfo that's updated. We are updating flags in struct sk_buff based on frags that got attached. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/20250905221539.2930285-2-kuba@kernel.org Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 12:00:20 +02:00
Marc Harvey	db1b600666	selftests: net: Add tests to verify team driver option set and get. There are currently no kernel tests that verify setting and getting options of the team driver. In the future, options may be added that implicitly change other options, which will make it useful to have tests like these that show nothing breaks. There will be a follow up patch to this that adds new "rx_enabled" and "tx_enabled" options, which will implicitly affect the "enabled" option value and vice versa. The tests use teamnl to first set options to specific values and then gets them to compare to the set values. Signed-off-by: Marc Harvey <marcharvey@google.com> Link: https://patch.msgid.link/20250905040441.2679296-1-marcharvey@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-11 11:07:55 +02:00
Matthieu Baerts (NGI0)	1f24a24097	doc: mptcp: fix Netlink specs link The Netlink specs RST files are no longer generated inside the source tree. In other words, the path to mptcp_pm.rst has changed, and needs to be updated to the new location. Fixes: `1ce4da3dd9` ("docs: use parser_yaml extension to handle Netlink specs") Reported-by: Kory Maincent <kory.maincent@bootlin.com> Closes: https://lore.kernel.org/20250828185037.07873d04@kmaincent-XPS-13-7390 Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250909-net-next-mptcp-pm-link-v1-1-0f1c4b8439c6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:28:41 -07:00
Jakub Kicinski	15c068cb21	selftests: net: replace sleeps in fcnal-test with waits fcnal-test.sh already includes lib.sh, use relevant helpers instead of sleeping. Replace sleep after starting nettest as a server with wait_local_port_listen. Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250909223837.863217-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:14:02 -07:00
Jakub Kicinski	4be708d0c4	Merge branch 'tools-ynl-fix-errors-reported-by-ruff' Matthieu Baerts says: ==================== tools: ynl: fix errors reported by Ruff When looking at the YNL code to add a new feature, my text editor automatically executed 'ruff check', and found out at least one interesting error: one variable was used while not being defined. I then decided to fix this error, and all the other ones reported by Ruff. After this series, 'ruff check' reports no more errors with version 0.12.12. ==================== Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-0-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:09:02 -07:00
Matthieu Baerts (NGI0)	f6259ba70e	tools: ynl: check for membership with 'not in' It is better to use 'not in' instead of 'not {element} in {collection}' according to Ruff. This is linked to Ruff error E713 [1]: Testing membership with {element} not in {collection} is more readable. Link: https://docs.astral.sh/ruff/rules/not-in-test/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-8-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:09:00 -07:00
Matthieu Baerts (NGI0)	10d32b0ddc	tools: ynl: use 'cond is None' It is better to use the 'is' keyword instead of comparing to None according to Ruff. This is linked to Ruff error E711 [1]: According to PEP 8, "Comparisons to singletons like None should always be done with is or is not, never the equality operators." Link: https://docs.astral.sh/ruff/rules/none-comparison/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-7-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:09:00 -07:00
Matthieu Baerts (NGI0)	616129d6b4	tools: ynl: remove unnecessary semicolons These semicolons are not required according to Ruff. Simply remove them. This is linked to Ruff error E703 [1]: A trailing semicolon is unnecessary and should be removed. Link: https://docs.astral.sh/ruff/rules/useless-semicolon/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-6-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:09:00 -07:00
Matthieu Baerts (NGI0)	389712b0da	tools: ynl: remove unused imports These imports are not used according to Ruff, and can be safely removed. This is linked to Ruff error F401 [1]: Unused imports add a performance overhead at runtime, and risk creating import cycles. They also increase the cognitive load of reading the code. There is one exception with 'YnlDocGenerator' which is added in __all__: it is used by ynl_gen_rst.py. Link: https://docs.astral.sh/ruff/rules/unused-import/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-5-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:08:59 -07:00
Matthieu Baerts (NGI0)	d8e0e25406	tools: ynl: remove f-string without any placeholders 'f-strings' without any placeholders don't need to be marked as such according to Ruff. This 'f' can be safely removed. This is linked to Ruff error F541 [1]: f-strings are a convenient way to format strings, but they are not necessary if there are no placeholder expressions to format. In this case, a regular string should be used instead, as an f-string without placeholders can be confusing for readers, who may expect such a placeholder to be present. Link: https://docs.astral.sh/ruff/rules/f-string-missing-placeholders/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-4-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:08:59 -07:00
Matthieu Baerts (NGI0)	02962ddb39	tools: ynl: remove assigned but never used variable These variables are assigned but never used according to Ruff. They can then be safely removed. This is linked to Ruff error F841 [1]: A variable that is defined but not used is likely a mistake, and should be removed to avoid confusion. Link: https://docs.astral.sh/ruff/rules/unused-variable/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-3-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:08:59 -07:00
Matthieu Baerts (NGI0)	287bc89bb4	tools: ynl: avoid bare except This 'except' was used without specifying the exception class according to Ruff. Here, only the ValueError class is expected and handled. This is linked to Ruff error E722 [1]: A bare except catches BaseException which includes KeyboardInterrupt, SystemExit, Exception, and others. Catching BaseException can make it hard to interrupt the program (e.g., with Ctrl-C) and can disguise other problems. Link: https://docs.astral.sh/ruff/rules/bare-except/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-2-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:08:59 -07:00
Matthieu Baerts (NGI0)	7a3aaaa9fc	tools: ynl: fix undefined variable name This variable used in the error path was not defined according to Ruff. msg_format.attr_set is used instead, presumably the one that was supposed to be used originally. This is linked to Ruff error F821 [1]: An undefined name is likely to raise NameError at runtime. Fixes: `1769e2be4b` ("tools/net/ynl: Add 'sub-message' attribute decoding to ynl") Link: https://docs.astral.sh/ruff/rules/undefined-name/ [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20250909-net-next-ynl-ruff-v1-1-238c2bccdd99@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:08:59 -07:00
Russell King (Oracle)	724b22d38a	net: stmmac: dwc-qos: use PHY WoL Mark Tegra platforms to use PHY's wake-on-Lan capabilities rather than the stmmac wake-on-Lan. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1uw0ff-00000004IQJ-3AMp@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 18:06:54 -07:00
Niklas Söderlund	9c02ea544a	net: sh_eth: Disable WoL if system can not suspend The MAC can't facilitate WoL if the system can't go to sleep. Gate the WoL support callbacks in ethtool at compile time using CONFIG_PM_SLEEP. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250909085849.3808169-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 17:58:35 -07:00
Saurabh Sengar	38611e5ada	net: mana: Remove redundant netdev_lock_ops_to_full() calls NET_SHAPER is always selected for MANA driver. When NET_SHAPER is enabled, netdev_lock_ops_to_full() reduces effectively to only an assert for lock, which is always held in the path when NET_SHAPER is enabled. Remove the redundant netdev_lock_ops_to_full() call. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://patch.msgid.link/1757393830-20837-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-10 17:57:09 -07:00
Rohan G Thomas	deb105f498	net: phy: marvell: Fix 88e1510 downshift counter errata The 88e1510 PHY has an erratum where the PHY downshift counter is not cleared after the PHY is suspended (BMCR_PDOWN set) and later resumed (BMCR_PDOWN cleared). This can cause the gigabit link to intermittently downshift to a lower speed. Disabling and re-enabling the downshift feature clears the counter, allowing the PHY to retry gigabit link negotiation up to the programmed retry count before downshifting. This behavior has been observed on copper links. Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com> Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250906-marvell_fix-v2-1-f6efb286937f@altera.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:30:51 -07:00
Jakub Kicinski	214da63451	Merge branch 'ptp-add-pulse-signal-loopback-support-for-debugging' Wei Fang says: ==================== ptp: add pulse signal loopback support for debugging Some PTP devices support looping back the periodic pulse signal for debugging. For example, the PTP device of the QorIQ platform and the NETC v4 Timer have the ability to loop back the pulse signal and record the extts events for the loopback signal. So we can make sure that the pulse intervals and their phase alignment are correct from the perspective of the emitting PHC's time base. In addition, we can use this loopback feature as a built-in extts event generator when we have no external equipment which does that. Therefore, add the generic debugfs interfaces to the ptp_clock driver. The first two patches are split out from the previous patch set [1]. The third patch is newly added. [1]: https://lore.kernel.org/imx/20250827063332.1217664-1-wei.fang@nxp.com/ #patch 3 and 9 ==================== Link: https://patch.msgid.link/20250905030711.1509648-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:28:55 -07:00
Wei Fang	f3164840a1	ptp: qoriq: convert to use generic interfaces to set loopback mode Now that the generic debugfs interfaces for setting the periodic pulse signal loopback have been added to the ptp_clock driver, convert the vendor-defined debugfs interfaces to the generic interfaces. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250905030711.1509648-4-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:28:52 -07:00
Wei Fang	67ac836373	ptp: netc: add the periodic output signal loopback support The NETC Timer supports looping back the output pulse signal of Fiper-n into Trigger-n input, so that users can leverage this feature to validate some other features without external hardware support. For example, users can use it to test external trigger stamp (EXTTS). Users can also combine EXTTS with loopback mode to check whether the generation time of PPS is aligned with an integral second of the PHC, or whether the periodic output signal (PTP_CLK_REQ_PEROUT) is generated at the specified time. Since ptp_clock_info::perout_loopback() has been added to the ptp_clock driver as a generic interface to enable or disable the periodic output signal loopback, add netc_timer_perout_loopback() as a callback of ptp_clock_info::perout_loopback(). Test the generation time of PPS event: $ echo 0 1 > /sys/kernel/debug/ptp0/perout_loopback $ echo 1 > /sys/class/ptp/ptp0/pps_enable $ testptp -d /dev/ptp0 -e 3 external time stamp request okay event index 0 at 63.000000017 event index 0 at 64.000000017 event index 0 at 65.000000017 Test the generation time of the periodic output signal: $ echo 0 1 > /sys/kernel/debug/ptp0/perout_loopback $ echo 0 150 0 1 500000000 > /sys/class/ptp/ptp0/period $ testptp -d /dev/ptp0 -e 3 external time stamp request okay event index 0 at 150.000000014 event index 0 at 151.500000015 event index 0 at 153.000000014 Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250905030711.1509648-3-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:28:52 -07:00
Wei Fang	e096a7cc0b	ptp: add debugfs interfaces to loop back the periodic output signal Some PTP devices, such as the ptp_qoriq device, have the capability to loop back the periodic output signal for debugging. So add the generic interfaces to set the periodic output signal loopback, rather than each vendor having a different implementation. Show how many channels support the periodic output signal loopback: $ cat /sys/kernel/debug/ptp<N>/n_perout_loopback Enable the loopback of the periodic output signal of channel X: $ echo <X> 1 > /sys/kernel/debug/ptp<N>/perout_loopback Disable the loopback of the periodic output signal of channel X: $ echo <X> 0 > /sys/kernel/debug/ptp<N>/perout_loopback Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20250905030711.1509648-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:28:52 -07:00
Jakub Kicinski	cf71bdf686	Merge branch 'net-mlx5e-add-pcie-congestion-event-extras' Tariq Toukan says: ==================== net/mlx5e: Add pcie congestion event extras This small series by Dragos covers gaps requested in the initial pcie congestion series [1]: - Make pcie congestion thresholds configurable via devlink. - Add a counter for stale pcie congestion events. [1] https://lore.kernel.org/1752130292-22249-1-git-send-email-tariqt@nvidia.com ==================== Link: https://patch.msgid.link/1757237976-531416-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:21:32 -07:00
Dragos Tatulea	cdc492746e	net/mlx5e: Add stale counter for PCIe congestion events This ethtool counter is meant to help with observing how many times the congestion event was triggered but on query there was no state change. This would help to indicate when a work item was scheduled to run too late and in the meantime the congestion state changed back to previous state. While at it, do a driveby typo fix in documentation for pci_bw_inbound_high. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1757237976-531416-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:21:30 -07:00
Dragos Tatulea	f4053490a6	net/mlx5e: Make PCIe congestion event thresholds configurable Add devlink driverinit parameters for configuring the thresholds for PCIe congestion events. These parameters are registered only when the firmware supports this feature. Update the mlx5 devlink docs as well on these new params. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1757237976-531416-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:21:30 -07:00
Jakub Kicinski	04d1ff1d75	Merge branch 'devlink-mlx5-add-new-parameters-for-link-management-and-sriov-eswitch-configurations' Saeed Mahameed says: ==================== devlink, mlx5: Add new parameters for link management and SRIOV/eSwitch configurations [part] This patch series introduces several devlink parameters improving device configuration capabilities, link management, and SRIOV/eSwitch, by adding NV config boot time parameters. Implement the following parameters: a) total_vfs Parameter: ----------------------- Adds support for managing the number of VFs (total_vfs) and enabling SR-IOV (enable_sriov for mlx5) through devlink. These additions enhance user control over virtualization features directly from standard kernel interfaces without relying on additional external tools. total_vfs functionality is critical for environments that require flexible num VF configuration. b) CQE Compression Type: ------------------------ Introduces a new devlink parameter, cqe_compress_type, to configure the rate of CQE compression based on PCIe bus conditions. This setting provides a balance between compression efficiency and overall NIC performance under different traffic loads. ==================== Link: https://patch.msgid.link/20250907012953.301746-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:14:32 -07:00
Vlad Dumitrescu	a4c49611cf	net/mlx5: Implement devlink total_vfs parameter Some devices support both symmetric (same value for all PFs) and asymmetric, while others only support symmetric configuration. This implementation prefers asymmetric, since it is closer to the devlink model (per function settings), but falls back to symmetric when needed. Example usage: devlink dev param set pci/0000:01:00.0 name total_vfs value <u16> cmode permanent devlink dev reload pci/0000:01:00.0 action fw_activate echo 1 >/sys/bus/pci/devices/0000:01:00.0/remove echo 1 >/sys/bus/pci/rescan cat /sys/bus/pci/devices/0000:01:00.0/sriov_totalvfs Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Kamal Heib <kheib@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250907012953.301746-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:14:24 -07:00
Vlad Dumitrescu	95a0af146d	net/mlx5: Implement devlink enable_sriov parameter Example usage: devlink dev param set pci/0000:01:00.0 name enable_sriov value {true, false} cmode permanent devlink dev reload pci/0000:01:00.0 action fw_activate echo 1 >/sys/bus/pci/devices/0000:01:00.0/remove echo 1 >/sys/bus/pci/rescan grep ^ /sys/bus/pci/devices/0000:01:00.0/sriov_* Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Tested-by: Kamal Heib <kheib@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250907012953.301746-4-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:14:24 -07:00
Saeed Mahameed	bf2da4799f	net/mlx5: Implement cqe_compress_type via devlink params Selects which algorithm should be used by the NIC in order to decide rate of CQE compression depending on PCIe bus conditions. Supported values: 1) balanced, merges fewer CQEs, resulting in a moderate compression ratio but maintaining a balance between bandwidth savings and performance 2) aggressive, merges more CQEs into a single entry, achieving a higher compression rate and maximizing performance, particularly under high traffic loads. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250907012953.301746-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:14:23 -07:00
Vlad Dumitrescu	ce0b015e26	devlink: Add 'total_vfs' generic device param NICs are typically configured with total_vfs=0, forcing users to rely on external tools to enable SR-IOV (a widely used and essential feature). Add total_vfs parameter to devlink for SR-IOV max VF configurability. Enables standard kernel tools to manage SR-IOV, addressing the need for flexible VF configuration. Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Tested-by: Kamal Heib <kheib@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250907012953.301746-2-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 19:14:23 -07:00
Jakub Kicinski	b90c7ca4f9	Merge branch 'mptcp-make-add_addr-retransmission-timeout-adaptive' Matthieu Baerts says: ==================== mptcp: make ADD_ADDR retransmission timeout adaptive Currently, the MPTCP ADD_ADDR notifications are retransmitted after a fixed timeout controlled by the net.mptcp.add_addr_timeout sysctl knob, if the corresponding "echo" packets are not received before. This can be too slow (or too quick), especially with a too cautious default value set to 2 minutes. - Patch 1: make ADD_ADDR retransmission timeout adaptive, using the TCP's retransmission timeout. The corresponding sysctl knob is now used as a maximum value. - Patch 2: now that these ADD_ADDR retransmissions can happen faster, all MPTCP Join subtests checking ADD_ADDR counters accept more ADD_ADDR than expected (if any). This is aligned with the previous behaviour, when the ADD_ADDR RTO was lowered down to 1 second. - Patch 3: Some CIs have reported that some MPTCP Join signalling tests were unstable. It seems that it is due to the time it can take in slow environments to send a bunch of ADD_ADDR notifications and wait each time for their echo reply. Use a longer transfer to avoid such errors. v1: https://lore.kernel.org/d5397026-92eb-4a43-9534-954b43ab9305@kernel.org ==================== Link: https://patch.msgid.link/20250907-net-next-mptcp-add_addr-retrans-adapt-v1-0-824cc805772b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:57:49 -07:00
Matthieu Baerts (NGI0)	e2cda6343b	selftests: mptcp: join: allow more time to send ADD_ADDR When many ADD_ADDR need to be sent, it can take some time to send each of them, and create new subflows. Some CIs seem to occasionally have issues with these tests, especially with "debug" kernels. Two subtests will now run for a slightly longer time: the last two where 3 or more ADD_ADDR are sent during the test. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250907-net-next-mptcp-add_addr-retrans-adapt-v1-3-824cc805772b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:57:45 -07:00
Matthieu Baerts (NGI0)	63c31d42cf	selftests: mptcp: join: tolerate more ADD_ADDR ADD_ADDR can be retransmitted, and, with the parent commit, these retransmissions can be sent quicker: from 2 minutes to less than one second. To avoid false positives where retransmitted ADD_ADDR causes higher counters than expected, it is required to be more tolerant. Errors are now only reported when fewer ADD_ADDRs have been sent/received, except if no ADD_ADDR are expected. Before the parent commit, the tolerance was present for each test where the ADD_ADDR could be retransmitted in a reasonable time (1 sec). Now that all tests can have retransmitted ADD_ADDR, it is normal to apply the same tolerance for all tests. An alternative could be to disable the ADD_ADDR retransmissions by default, but that's changing the default kernel behaviour. Plus, ADD_ADDR retransmissions can be required for some tests. To avoid adding exceptions to many tests, it seems better to increase the tolerance. Later, we could add a new MIB counter to identify the ADD_ADDR retransmissions, and remove the tolerance when this counter is available. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250907-net-next-mptcp-add_addr-retrans-adapt-v1-2-824cc805772b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:57:45 -07:00
Geliang Tang	30549eebc4	mptcp: make ADD_ADDR retransmission timeout adaptive Currently the ADD_ADDR option is retransmitted with a fixed timeout. This patch makes the retransmission timeout adaptive by using the maximum RTO among all the subflows, while still capping it at the configured maximum value (add_addr_timeout_max). This improves responsiveness when establishing new subflows. Specifically: 1. Adds mptcp_adjust_add_addr_timeout() helper to compute the adaptive timeout. 2. Uses maximum subflow RTO (icsk_rto) when available. 3. Applies exponential backoff based on retransmission count. 4. Maintains fallback to configured max timeout when no RTO data exists. This slightly changes the behaviour of the MPTCP "add_addr_timeout" sysctl knob to be used as a maximum instead of a fixed value. But this is seen as an improvement: the ADD_ADDR might be sent quicker than before to improve the overall MPTCP connection. Also, the default value is set to 2 min, which was already way too long, and caused the ADD_ADDR not to be retransmitted for connections shorter than 2 minutes. Suggested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/576 Reviewed-by: Christoph Paasch <cpaasch@openai.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250907-net-next-mptcp-add_addr-retrans-adapt-v1-1-824cc805772b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:57:45 -07:00
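The four steps listed in commit 30549eebc4 can be illustrated with a small Python sketch. This is not the kernel code: the function name `adaptive_add_addr_timeout`, its parameters, the millisecond units, and the exact order of backoff and capping are assumptions made for illustration, based only on the commit message (maximum subflow RTO, exponential backoff by retransmission count, cap at the configured maximum, fallback to the maximum when no RTO is known).

```python
from typing import Iterable


def adaptive_add_addr_timeout(
    subflow_rtos_ms: Iterable[int],
    retrans_count: int,
    max_timeout_ms: int,
) -> int:
    """Hypothetical model of the adaptive ADD_ADDR retransmission timeout.

    subflow_rtos_ms: current RTO of each subflow (0 means "no RTO data").
    retrans_count:   how many times ADD_ADDR was already retransmitted.
    max_timeout_ms:  configured upper bound (the sysctl value in the commit).
    """
    # Keep only subflows that actually report an RTO.
    known_rtos = [rto for rto in subflow_rtos_ms if rto > 0]

    # Fallback: with no RTO data, use the configured maximum.
    if not known_rtos:
        return max_timeout_ms

    # Start from the largest subflow RTO and double it per retransmission.
    backed_off = max(known_rtos) << retrans_count

    # Never exceed the configured maximum.
    return min(backed_off, max_timeout_ms)
```

Under these assumptions, two subflows with RTOs of 200 ms and 350 ms and a 120 000 ms maximum give a first timeout of 350 ms, and 2800 ms after three retransmissions, which is far below the previous fixed 2 minutes.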
Jakub Kicinski	4ea83b7573	Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== idpf: add XDP support Alexander Lobakin says: Add XDP support (w/o XSk for now) to the idpf driver using the libeth_xdp sublib. All possible verdicts, .ndo_xdp_xmit(), multi-buffer etc. are here. In general, nothing outstanding comparing to ice, except performance -- let's say, up to 2x for .ndo_xdp_xmit() on certain platforms and scenarios. idpf doesn't support VLAN Rx offload, so only the hash hint is available for now. Patches 1-7 are prereqs, without which XDP would either not work at all or work slower/worse/... * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: idpf: add XDP RSS hash hint idpf: add support for .ndo_xdp_xmit() idpf: add support for XDP on Rx idpf: use generic functions to build xdp_buff and skb idpf: implement XDP_SETUP_PROG in ndo_bpf for splitq idpf: prepare structures to support XDP idpf: add support for nointerrupt queues idpf: remove SW marker handling from NAPI idpf: add 4-byte completion descriptor definition idpf: link NAPIs to queues idpf: use a saner limit for default number of queues to allocate idpf: fix Rx descriptor ready check barrier in splitq xdp, libeth: make the xdp_init_buff() micro-optimization generic ==================== Link: https://patch.msgid.link/20250908195748.1707057-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:44:07 -07:00
Ido Schimmel	ce6adea19a	vxlan: Make vxlan_fdb_find_uc() more robust against NPDs first_remote_rcu() can return NULL if the FDB entry points to an FDB nexthop group instead of a remote destination. However, unlike other users of first_remote_rcu(), NPD cannot currently happen in vxlan_fdb_find_uc() as it is only invoked by one driver which vetoes the creation of FDB nexthops. Make the function more robust by making sure the remote destination is only dereferenced if it is not NULL. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250908075141.125087-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:27:33 -07:00
Vladimir Oltean	051b62b71e	net: phy: aquantia: delete aqr_firmware_read_fingerprint() prototype This is a development artifact of commit `a76f26f7a8` ("net: phy: aquantia: support phy-mode = "10g-qxgmii" on NXP SPF-30841 (AQR412C)"). This function name isn't used. Instead we have aqr_build_fingerprint() in aquantia_main.c. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250908134313.315406-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:24:17 -07:00
Jakub Kicinski	0d0766a47c	Merge branch 'net-phy-fixed_phy-improvements' Heiner Kallweit says: ==================== net: phy: fixed_phy: improvements This series contains a number of improvements. No functional change intended. ==================== Link: https://patch.msgid.link/e81be066-cc23-4055-aed7-2fbc86da1ff7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:11:58 -07:00
Heiner Kallweit	2983825579	net: phy: fixed_phy: remove struct fixed_mdio_bus Use two separate static variables instead of the struct, this allows to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-09 18:11:55 -07:00


1383575 Commits