linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-24 07:29:55 -04:00

Author	SHA1	Message	Date
Subbaraya Sundeep	b5dcdde074	octeontx2-af: Add cn20k NIX block contexts New CN20K silicon has NIX hardware context structures different from previous silicons. Add NIX send and completion queue context definitions for cn20k. Extend NIX context handling support to cn20k. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1761388367-16579-3-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-30 10:44:08 +01:00
Subbaraya Sundeep	85708c5d5f	octeontx2-af: Simplify context writing and reading to hardware Simplify NIX context reading and writing by using hardware maximum context size instead of using individual sizes of each context type. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1761388367-16579-2-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-30 10:44:08 +01:00
Jakub Kicinski	1bae0fd900	Merge branch 'net-phy-add-iterator-mdiobus_for_each_phy' Heiner Kallweit says: ==================== net: phy: add iterator mdiobus_for_each_phy Add and use an iterator for all PHY's on a MII bus, and phy_find_next() as a prerequisite. ==================== Link: https://patch.msgid.link/07fc63e8-53fd-46aa-853e-96187bba9d44@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 19:00:37 -07:00
Heiner Kallweit	d4780abb8c	net: phy: use new iterator mdiobus_for_each_phy in mdiobus_prevent_c45_scan Use new iterator mdiobus_for_each_phy() to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/6d792b1e-d23d-4b7e-a94f-89c6617b620f@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 19:00:34 -07:00
Heiner Kallweit	4575875065	net: davinci_mdio: use new iterator mdiobus_for_each_phy Use new iterator mdiobus_for_each_phy() to simplify the code. Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/326d1337-2c22-42e3-a152-046ac5c43095@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 19:00:34 -07:00
Heiner Kallweit	0514010d55	net: fec: use new iterator mdiobus_for_each_phy Use new iterator mdiobus_for_each_phy() to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/65eb9490-5666-4b4a-8d26-3fca738b1315@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 19:00:34 -07:00
Heiner Kallweit	26888de97b	net: phy: add iterator mdiobus_for_each_phy Add an iterator for all PHY's on a MII bus, and phy_find_next() as a prerequisite. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/cd112f15-401a-43d9-8525-9ff0965a68cd@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 19:00:34 -07:00
Heiner Kallweit	cf35f4347d	net: stmmac: mdio: fix incorrect phy address check max_addr is the max number of addresses, not the highest possible address, therefore check phydev->mdio.addr > max_addr isn't correct. To fix this change the semantics of max_addr, so that it represents the highest possible address. IMO this is also a little bit more intuitive wrt name max_addr. Fixes: `4a107a0e83` ("net: stmmac: mdio: use phy_find_first to simplify stmmac_mdio_register") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reported-by: Simon Horman <horms@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/e869999b-2d4b-4dc1-9890-c2d3d1e8d0f8@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:58:01 -07:00
Sakari Ailus	10c7b9be47	net: wwan: Remove redundant pm_runtime_mark_last_busy() calls pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(), pm_runtime_autosuspend() and pm_request_autosuspend() now include a call to pm_runtime_mark_last_busy(). Remove the now-redundant explicit call to pm_runtime_mark_last_busy(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Link: https://patch.msgid.link/20251027115022.390997-4-sakari.ailus@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:55:56 -07:00
Sakari Ailus	a5d937dd0e	net: ipa: Remove redundant pm_runtime_mark_last_busy() calls pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(), pm_runtime_autosuspend() and pm_request_autosuspend() now include a call to pm_runtime_mark_last_busy(). Remove the now-redundant explicit call to pm_runtime_mark_last_busy(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://patch.msgid.link/20251027115022.390997-2-sakari.ailus@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:55:56 -07:00
Sakari Ailus	9f2674e1c3	net: ethernet: Remove redundant pm_runtime_mark_last_busy() calls pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(), pm_runtime_autosuspend() and pm_request_autosuspend() now include a call to pm_runtime_mark_last_busy(). Remove the now-redundant explicit call to pm_runtime_mark_last_busy(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Link: https://patch.msgid.link/20251027115022.390997-1-sakari.ailus@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:55:50 -07:00
Jakub Kicinski	b5171b8996	Merge branch 'net-enetc-add-i-mx94-enetc-support' Wei Fang says: ==================== net: enetc: Add i.MX94 ENETC support i.MX94 NETC has two kinds of ENETCs, one is the same as i.MX95, which can be used as a standalone network port. The other one is an internal ENETC, it connects to the CPU port of NETC switch through the pseudo MAC. Also, i.MX94 have multiple PTP Timers, which is different from i.MX95. Any PTP Timer can be bound to a specified standalone ENETC by the IERB ETBCR registers. Currently, this patch only add ENETC support and Timer support for i.MX94. The switch will be added by a separate patch set. In addition, note that i.MX94 SoC is launched after i.MX95, its NETC has a higher version, so the driver support is added after i.MX95. ==================== Link: https://patch.msgid.link/20251029013900.407583-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:44:21 -07:00
Wei Fang	2d673b0e2f	net: enetc: add standalone ENETC support for i.MX94 The revision of i.MX94 ENETC is changed to v4.3, so add this revision to enetc_info to support i.MX94 ENETC. And add PTP support for i.MX94. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20251029013900.407583-7-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:44:19 -07:00
Wei Fang	5175c1e4ad	net: enetc: add basic support for the ENETC with pseudo MAC for i.MX94 The ENETC with pseudo MAC is an internal port which connects to the CPU port of the switch. The switch CPU/host ENETC is fully integrated with the switch and does not require a back-to-back MAC, instead a light weight "pseudo MAC" provides the delineation between switch and ENETC. This translates to lower power (less logic and memory) and lower delay (as there is no serialization delay across this link). Different from the standalone ENETC which is used as the external port, the internal ENETC has a different PCIe device ID, and it does not have Ethernet MAC port registers, instead, it has a small number of pseudo MAC port registers, so some features are not supported by pseudo MAC, such as loopback, half duplex, one-step timestamping and so on. Therefore, the configuration of this internal ENETC is also somewhat different from that of the standalone ENETC. So add the basic support for ENETC with pseudo MAC. More supports will be added in the future. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://patch.msgid.link/20251029013900.407583-6-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:44:19 -07:00
Clark Wang	1cd3f21c18	net: enetc: add ptp timer binding support for i.MX94 The i.MX94 has three PTP timers, and all standalone ENETCs can select one of them to bind to as their PHC. The 'ptp-timer' property is used to represent the PTP device of the Ethernet controller. So users can add 'ptp-timer' to the ENETC node to specify the PTP timer. The driver parses this property to bind the two hardware devices. If the "ptp-timer" property is not present, the first timer of the PCIe bus where the ENETC is located is used as the default bound PTP timer. Signed-off-by: Clark Wang <xiaoning.wang@nxp.com> Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251029013900.407583-5-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:44:19 -07:00
Wei Fang	ba5d7d45ce	net: enetc: add preliminary i.MX94 NETC blocks control support NETC blocks control is used for warm reset and pre-boot initialization. Different versions of NETC blocks control are not exactly the same. We need to add corresponding netc_devinfo data for each version. i.MX94 series are launched after i.MX95, so its NETC version (v4.3) is higher than i.MX95 NETC (v4.1). Currently, the patch adds the following configurations for ENETCs. 1. Set the link's MII protocol. 2. ENETC 0 (MAC 3) and the switch port 2 (MAC 2) share the same parallel interface, but due to SoC constraint, they cannot be used simultaneously. Since the switch is not supported yet, so the interface is assigned to ENETC 0 by default. The switch configuration will be added separately in a subsequent patch. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20251029013900.407583-4-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:44:19 -07:00
Wei Fang	c4430f2ac0	dt-bindings: net: enetc: add compatible string for ENETC with pseudo MAC The ENETC with pseudo MAC is used to connect to the CPU port of the NETC switch. This ENETC has a different PCI device ID, so add a standard PCI device compatible string to it. Signed-off-by: Wei Fang <wei.fang@nxp.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20251029013900.407583-3-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:44:19 -07:00
Wei Fang	3a85ec37bc	dt-bindings: net: netc-blk-ctrl: add compatible string for i.MX94 platforms Add the compatible string "nxp,imx94-netc-blk-ctrl" for i.MX94 platforms. Signed-off-by: Wei Fang <wei.fang@nxp.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20251029013900.407583-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:44:18 -07:00
Jakub Kicinski	76c231b3c2	Merge branch 'icmp-add-rfc-5837-support' Ido Schimmel says: ==================== icmp: Add RFC 5837 support tl;dr ===== This patchset extends certain ICMP error messages (e.g., "Time Exceeded") with incoming interface information in accordance with RFC 5837 [1]. This is required for more meaningful traceroute results in unnumbered networks. Like other ICMP settings, the feature is controlled via a per-{netns, address family} sysctl. The interface and the implementation are designed to support more ICMP extensions. Motivation ========== Over the years, the kernel was extended with the ability to derive the source IP of ICMP error messages from the interface that received the datagram which elicited the ICMP error [2][3][4]. This is especially important for "Time Exceeded" messages as it allows traceroute users to trace the actual packet path along the network. The above scheme does not work in unnumbered networks. In these networks, only the loopback / VRF interface is assigned a global IP address while router interfaces are assigned IPv6 link-local addresses. As such, ICMP error messages are generated with a source IP derived from the loopback / VRF interface, making it impossible to trace the actual packet path when parallel links exist between routers. The problem can be solved by implementing the solution proposed by RFC 4884 [5] and RFC 5837. The former defines an ICMP extension structure that can be appended to selected ICMP messages and carry extension objects. The latter defines an extension object called the "Interface Information Object" (IIO) that can carry interface information (e.g., name, index, MTU) about interfaces with certain roles such as the interface that received the datagram which elicited the ICMP error. 
The payload of the datagram that elicited the error (potentially padded / trimmed) along with the ICMP extension structure will be queued to the error queue of the originating socket, thereby allowing traceroute applications to parse and display the information encoded in the ICMP extension structure. Example: # traceroute6 -e 2001:db8:1::3 traceroute to 2001:db8:1::3 (2001:db8:1::3), 30 hops max, 80 byte packets 1 2001:db8:1::2 (2001:db8:1::2) <INC:11,"eth1",mtu=1500> 0.214 ms 0.171 ms 0.162 ms 2 2001:db8:1::3 (2001:db8:1::3) <INC:12,"eth2",mtu=1500> 0.154 ms 0.135 ms 0.127 ms # traceroute -e 192.0.2.3 traceroute to 192.0.2.3 (192.0.2.3), 30 hops max, 60 byte packets 1 192.0.2.2 (192.0.2.2) <INC:11,"eth1",mtu=1500> 0.191 ms 0.148 ms 0.144 ms 2 192.0.2.3 (192.0.2.3) <INC:12,"eth2",mtu=1500> 0.137 ms 0.122 ms 0.114 ms Implementation ============== As previously stated, the feature is controlled via a per-{netns, address} sysctl. Specifically, a bit mask where each bit controls the addition of a different ICMP extension to ICMP error messages. Currently, only a single value is supported, to append the incoming interface information. Key points: 1. Global knob vs finer control. I am not aware of users who require finer control, but it is possible that some users will want to avoid appending ICMP extensions when the packet is sent out of a specific interface (e.g., the management interface) or to a specific subnet. This can be accomplished via a tc-bpf program that trims the ICMP extension structure. An example program can be found here [6]. 2. Split implementation between IPv4 / IPv6. While the implementation is currently similar, there are some differences between both address families. In addition, some extensions (e.g., RFC 8883 [7]) are IPv6-specific. Given the above and given that the implementation is not very complex, it makes sense to keep both implementations separate. 3. Compatibility with legacy applications. 
RFC 4884 from 2007 extended certain ICMP messages with a length field that encodes the length of the "original datagram" field, so that applications will be able to tell where the "original datagram" ends and where the ICMP extension structure starts. Before the introduction of the IP{,6}_RECVERR_RFC4884 socket options [8][9] in 2020 it was impossible for applications to know where the ICMP extension structure starts and to this day some applications assume that it starts at offset 128, which is the minimum length of the "original datagram" field as specified by RFC 4884. Therefore, in order to be compatible with both legacy and modern applications, the datagram that elicited the ICMP error is trimmed / padded to 128 bytes before appending the ICMP extension structure. This behavior is specifically called out by RFC 4884: "Those wishing to be backward compatible with non-compliant TRACEROUTE implementations will include exactly 128 octets" [10]. Note that in 128 bytes we should be able to include enough headers for the originating node to match the ICMP error message with the relevant socket. For example, the following headers will be present in the "original datagram" field when a VXLAN encapsulated IPv6 packet elicits an ICMP error in an IPv6 underlay: IPv6 (40) | UDP (8) | VXLAN (8) | Eth (14) | IPv6 (40) | UDP (8). Overall, 118 bytes. If the 128 bytes limit proves to be insufficient for some use case, we can consider dedicating a new bit in the previously mentioned sysctl to allow for more bytes to be included in the "original datagram" field. 4. Extensibility. This patchset adds partial support for a single ICMP extension. However, the interface and the implementation should be able to support more extensions, if needed. Examples: * More interface information objects as part of RFC 5837. We should be able to derive the outgoing interface information and nexthop IP from the dst entry attached to the packet that elicited the error. 
* Node identification object (e.g., hostname / loopback IP) [11]. * Extended Information object which encodes aggregate header limits as part of RFC 8883. A previous proposal from Ishaan Gandhi and Ron Bonica is available here [12]. Testing ======= The existing traceroute selftest is extended to test that ICMP extensions are reported correctly when enabled. Both address families are tested and with different packet sizes in order to make sure that trimming / padding works correctly. Tested that packets are parsed correctly by the IP{,6}_RECVERR_RFC4884 socket options using Willem's selftest [13]. Changelog ========= Changes since v1 [14]: * Patches #1-#2: Added a comment about field ordering and review tags. * Patch #3: Converted "sysctl" to "echo" when testing the return value. Added a check to skip the test if traceroute version is older than 2.1.5. [1] https://datatracker.ietf.org/doc/html/rfc5837 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1c2fb7f93cb20621772bf304f3dba0849942e5db [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fac6fce9bdb59837bb89930c3a92f5e0d1482f0b [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4a8c416602d97a4e2073ed563d4d4c7627de19cf [5] https://datatracker.ietf.org/doc/html/rfc4884 [6] https://gist.github.com/idosch/5013448cdb5e9e060e6bfdc8b433577c [7] https://datatracker.ietf.org/doc/html/rfc8883 [8] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eba75c587e811d3249c8bd50d22bb2266ccd3c0f [9] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=01370434df85eb76ecb1527a4466013c4aca2436 [10] https://datatracker.ietf.org/doc/html/rfc4884#section-5.3 [11] https://datatracker.ietf.org/doc/html/draft-ietf-intarea-extended-icmp-nodeid-04 [12] https://lore.kernel.org/netdev/20210317221959.4410-1-ishaangandhi@gmail.com/ [13] https://lore.kernel.org/netdev/aPpMItF35gwpgzZx@shredder/ [14] 
https://lore.kernel.org/netdev/20251022065349.434123-1-idosch@nvidia.com/ ==================== Link: https://patch.msgid.link/20251027082232.232571-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:28:33 -07:00
Ido Schimmel	02da595751	selftests: traceroute: Add ICMP extensions tests Test that ICMP extensions are reported correctly when enabled and not reported when disabled. Test both IPv4 and IPv6 and using different packet sizes, to make sure trimming / padding works correctly. Disable ICMP rate limiting (defaults to 1 per-second per-target) so that the kernel will always generate ICMP errors when needed. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251027082232.232571-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:28:30 -07:00
Ido Schimmel	d12d04d221	ipv6: icmp: Add RFC 5837 support Add the ability to append the incoming IP interface information to ICMPv6 error messages in accordance with RFC 5837 and RFC 4884. This is required for more meaningful traceroute results in unnumbered networks. The feature is disabled by default and controlled via a new sysctl ("net.ipv6.icmp.errors_extension_mask") which accepts a bitmask of ICMP extensions to append to ICMP error messages. Currently, only a single value is supported, but the interface and the implementation should be able to support more extensions, if needed. Clone the skb and copy the relevant data portions before modifying the skb as the caller of icmp6_send() still owns the skb after the function returns. This should be fine since by default ICMP error messages are rate limited to 1000 per second and no more than 1 per second per specific host. Trim or pad the packet to 128 bytes before appending the ICMP extension structure in order to be compatible with legacy applications that assume that the ICMP extension structure always starts at this offset (the minimum length specified by RFC 4884). Since commit `20e1954fe2` ("ipv6: RFC 4884 partial support for SIT/GRE tunnels") it is possible for icmp6_send() to be called with an skb that already contains ICMP extensions. This can happen when we receive an ICMPv4 message with extensions from a tunnel and translate it to an ICMPv6 message towards an IPv6 host in the overlay network. I could not find an RFC that supports this behavior, but it makes sense to not overwrite the original extensions that were appended to the packet. Therefore, avoid appending extensions if the length field in the provided ICMPv6 header is already filled. Export netdev_copy_name() using EXPORT_IPV6_MOD_GPL() to make it available to IPv6 when it is built as a module. 
Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251027082232.232571-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:28:30 -07:00
Ido Schimmel	f0e7036fc9	ipv4: icmp: Add RFC 5837 support Add the ability to append the incoming IP interface information to ICMPv4 error messages in accordance with RFC 5837 and RFC 4884. This is required for more meaningful traceroute results in unnumbered networks. The feature is disabled by default and controlled via a new sysctl ("net.ipv4.icmp_errors_extension_mask") which accepts a bitmask of ICMP extensions to append to ICMP error messages. Currently, only a single value is supported, but the interface and the implementation should be able to support more extensions, if needed. Clone the skb and copy the relevant data portions before modifying the skb as the caller of __icmp_send() still owns the skb after the function returns. This should be fine since by default ICMP error messages are rate limited to 1000 per second and no more than 1 per second per specific host. Trim or pad the packet to 128 bytes before appending the ICMP extension structure in order to be compatible with legacy applications that assume that the ICMP extension structure always starts at this offset (the minimum length specified by RFC 4884). Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251027082232.232571-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:28:29 -07:00
Kuniyuki Iwashima	b8a7826e4b	net: sched: Don't use WARN_ON_ONCE() for -ENOMEM in tcf_classify(). As demonstrated by syzbot, WARN_ON_ONCE() in tcf_classify() can be easily triggered by fault injection. [0] We should not use WARN_ON_ONCE() for the simple -ENOMEM case. Also, we provide SKB_DROP_REASON_NOMEM for the same error. Let's remove WARN_ON_ONCE() there. [0]: FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 0 UID: 0 PID: 31392 Comm: syz.8.7081 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025 Call Trace: <TASK> dump_stack_lvl+0x189/0x250 should_fail_ex+0x414/0x560 should_failslab+0xa8/0x100 kmem_cache_alloc_noprof+0x74/0x6e0 skb_ext_add+0x148/0x8f0 tcf_classify+0xeba/0x1140 multiq_enqueue+0xfd/0x4c0 net/sched/sch_multiq.c:66 ... WARNING: CPU: 0 PID: 31392 at net/sched/cls_api.c:1869 tcf_classify+0xfd7/0x1140 Modules linked in: CPU: 0 UID: 0 PID: 31392 Comm: syz.8.7081 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025 RIP: 0010:tcf_classify+0xfd7/0x1140 Code: e8 03 42 0f b6 04 30 84 c0 0f 85 41 01 00 00 66 41 89 1f eb 05 e8 89 26 75 f8 bb ff ff ff ff e9 04 f9 ff ff e8 7a 26 75 f8 90 <0f> 0b 90 49 83 c5 44 4c 89 eb 49 c1 ed 03 43 0f b6 44 35 00 84 c0 RSP: 0018:ffffc9000b7671f0 EFLAGS: 00010293 RAX: ffffffff894addf6 RBX: 0000000000000002 RCX: ffff888025029e40 RDX: 0000000000000000 RSI: ffffffff8bbf05c0 RDI: ffffffff8bbf0580 RBP: 0000000000000000 R08: 00000000ffffffff R09: 1ffffffff1c0bfd6 R10: dffffc0000000000 R11: fffffbfff1c0bfd7 R12: ffff88805a90de5c R13: ffff88805a90ddc0 R14: dffffc0000000000 R15: ffffc9000b7672c0 FS: 00007f20739f66c0(0000) GS:ffff88812613e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000110c2d2a80 CR3: 0000000024e36000 CR4: 00000000003526f0 Call Trace: <TASK> multiq_classify 
net/sched/sch_multiq.c:39 [inline] multiq_enqueue+0xfd/0x4c0 net/sched/sch_multiq.c:66 dev_qdisc_enqueue+0x4e/0x260 net/core/dev.c:4118 __dev_xmit_skb net/core/dev.c:4214 [inline] __dev_queue_xmit+0xe83/0x3b50 net/core/dev.c:4729 packet_snd net/packet/af_packet.c:3076 [inline] packet_sendmsg+0x3e33/0x5080 net/packet/af_packet.c:3108 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0x21c/0x270 net/socket.c:742 ____sys_sendmsg+0x505/0x830 net/socket.c:2630 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2684 __sys_sendmsg net/socket.c:2716 [inline] __do_sys_sendmsg net/socket.c:2721 [inline] __se_sys_sendmsg net/socket.c:2719 [inline] __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2719 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f207578efc9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f20739f6038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f20759e5fa0 RCX: 00007f207578efc9 RDX: 0000000000000004 RSI: 00002000000000c0 RDI: 0000000000000008 RBP: 00007f20739f6090 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 R13: 00007f20759e6038 R14: 00007f20759e5fa0 R15: 00007f2075b0fa28 </TASK> Reported-by: syzbot+87e1289a044fcd0c5f62@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/69003e33.050a0220.32483.00e8.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251028035859.2067690-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 18:00:37 -07:00
Ankit Khushwaha	afb8f6567a	selftest: net: fix socklen_t type mismatch in sctp_collision test Socket APIs like recvfrom(), accept(), and getsockname() expect socklen_t* arg, but tests were using int variables. This causes -Wpointer-sign warnings on platforms where socklen_t is unsigned. Change the variable type from int to socklen_t to resolve the warning and ensure type safety across platforms. warning fixed: sctp_collision.c:62:70: warning: passing 'int *' to parameter of type 'socklen_t *' (aka 'unsigned int *') converts between pointers to integer types with different sign [-Wpointer-sign] 62 | ret = recvfrom(sd, buf, sizeof(buf), 0, (struct sockaddr *)&daddr, &len); | ^~~~ /usr/include/sys/socket.h:165:27: note: passing argument to parameter '__addr_len' here 165 | socklen_t *__restrict __addr_len); | ^ Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251028172947.53153-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:39:26 -07:00
Jakub Kicinski	efd3e30e65	Merge branch 'net-stmmac-hwif-c-cleanups' Russell King says: ==================== net: stmmac: hwif.c cleanups This series cleans up hwif.c: - move the reading of the version information out of stmmac_hwif_init() into its own function, stmmac_get_version(), storing the result in a new struct. - simplify stmmac_get_version(). - read the version register once, passing it to stmmac_get_id() and stmmac_get_dev_id(). - move stmmac_get_id() and stmmac_get_dev_id() into stmmac_get_version() - define version register fields and use FIELD_GET() to decode - start tackling the big loop in stmmac_hwif_init() - provide a function, stmmac_hwif_find(), which looks up the hwif entry, thus making a much smaller loop, which improves readability of this code. - change the use of '^' to '!=' when comparing the dev_id, which is what is really meant here. - reorganise the test after calling stmmac_hwif_init() so that we handle the error case in the indented code, and the success case with no indent, which is the classical arrangement. ==================== Link: https://patch.msgid.link/aQFZVSGJuv8-_DIo@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:25 -07:00
Russell King (Oracle)	6436f408eb	net: stmmac: reorganise stmmac_hwif_init() Reorganise stmmac_hwif_init() to handle the error case of stmmac_hwif_find() in the indented block, which follows normal programming pattern. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDtfG-0000000CCCX-2YwQ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:23 -07:00
Russell King (Oracle)	f9326b139b	net: stmmac: use != rather than ^ for comparing dev_id Use the more usual not-equals rather than exclusive-or operator when comparing the dev_id in stmmac_hwif_find(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDtfB-0000000CCCR-25rr@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:22 -07:00
Russell King (Oracle)	7b510ea8e5	net: stmmac: provide function to lookup hwif Provide a function to lookup the hwif entry given the core type, Synopsys version, and device ID (used for XGMAC cores). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDtf6-0000000CCCL-1cQA@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:22 -07:00
Russell King (Oracle)	b2fe9e29b5	net: stmmac: use FIELD_GET() for version register Provide field definitions in common.h, and use these with FIELD_GET() to extract the fields from the version register. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDtf1-0000000CCCF-0uUV@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:22 -07:00
Russell King (Oracle)	7b2e41fff7	net: stmmac: move stmmac_get_*id() into stmmac_get_version() Move the contents of both stmmac_get_id() and stmmac_get_dev_id() into stmmac_get_version() as it no longer makes sense for these to be separate functions. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDtew-0000000CCC9-0KeM@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:22 -07:00
Russell King (Oracle)	c36b97e4ca	net: stmmac: consolidate version reading and validation There is no need to read the version register twice, once in stmmac_get_id() and then again in stmmac_get_dev_id(). Consolidate this into stmmac_get_version() and pass each of these this value. As both functions unnecessarily issue the same warning for a zero register value, also move this into stmmac_get_version(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDteq-0000000CCC3-3zbJ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:22 -07:00
Russell King (Oracle)	f49838f77c	net: stmmac: simplify stmmac_get_version() We can simplify stmmac_get_version() by pre-initialising the version members to zero, detecting the MAC100 core and returning, otherwise determining the version register offset separately from calling stmmac_get_id() and stmmac_get_dev_id(). Do this. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDtel-0000000CCBx-3Lpf@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:22 -07:00
Russell King (Oracle)	fc18b6e98c	net: stmmac: move version handling into own function Move the version handling out of stmmac_hwif_init() and into its own function, returning the version information through a structure. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vDteg-0000000CCBr-2m7q@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:18:21 -07:00
Wang Liang	f58abec23d	net: ipv4: Remove extern udp_v4_early_demux()/tcp_v4_early_demux() in .c files Function udp_v4_early_demux() was already declared in 'include/net/udp.h', no need to keep the extern in 'ip_input.c', which may produce the following checkpatch warning: WARNING: externs should be avoided in .c files #45: FILE: net/ipv4/ip_input.c:322: +enum skb_drop_reason udp_v4_early_demux(struct sk_buff *skb); Replace it by including 'net/udp.h'. Do the same for tcp_v4_early_demux(). Signed-off-by: Wang Liang <wangliang74@huawei.com> Link: https://patch.msgid.link/20251025092637.1020960-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-29 17:05:30 -07:00
Tianling Shen	a8abe8e210	net: phy: motorcomm: Add support for PHY LEDs on YT8531 The LED registers on YT8531 are exactly the same as on YT8521, so simply reuse yt8521_led_hw_* functions. Tested on OrangePi R1 Plus LTS and Zero3. Signed-off-by: Tianling Shen <cnsztl@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20251026133652.1288732-1-cnsztl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 18:56:16 -07:00
Issam Hamdi	61958b33ef	net: phy: realtek: Add RTL8224 cable testing support The RTL8224 can detect open pairs and short types (in same pair or some other pair). The distance to this problem can be estimated. This is done for each of the 4 pairs separately. It is not meant to be run while there is an active link partner because this interferes with the active test pulses. Output with open 50 m cable: Pair A code Open Circuit, source: TDR Pair A, fault length: 51.79m, source: TDR Pair B code Open Circuit, source: TDR Pair B, fault length: 51.28m, source: TDR Pair C code Open Circuit, source: TDR Pair C, fault length: 50.46m, source: TDR Pair D code Open Circuit, source: TDR Pair D, fault length: 51.12m, source: TDR Terminated cable: Pair A code OK, source: TDR Pair B code OK, source: TDR Pair C code OK, source: TDR Pair D code OK, source: TDR Shorted cable (both short types are at roughly the same distance) Pair A code Short to another pair, source: TDR Pair A, fault length: 2.35m, source: TDR Pair B code Short to another pair, source: TDR Pair B, fault length: 2.15m, source: TDR Pair C code OK, source: TDR Pair D code Short within Pair, source: TDR Pair D, fault length: 1.94m, source: TDR Signed-off-by: Issam Hamdi <ih@simonwunderlich.de> Co-developed-by: Sven Eckelmann <se@simonwunderlich.de> Signed-off-by: Sven Eckelmann <se@simonwunderlich.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251024-rtl8224-cable-test-v1-1-e3cda89ac98f@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 18:35:08 -07:00
Jakub Kicinski	e9ce7f493e	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: postpone service task disabling Przemek Kitszel says: Move service task shutdown to the very end of driver teardown procedure. This is needed (or at least beneficial) for all unwinding functions that talk to FW/HW via Admin Queue (so, most of top-level functions, like ice_deinit_hw()). Most of the patches move stuff around (I believe it makes it much easier to review/proof when kept separate) in preparation to defer stopping the service task to the very end of ice_remove() (and other unwinding flows). Then last patch fixes duplicate call to ice_init_hw() (actual, but unlikely to encounter, so -next, given the size of the changes). First patch is not much related, only by that it was developed together * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: remove duplicate call to ice_deinit_hw() on error paths ice: move ice_deinit_dev() to the end of deinit paths ice: extract ice_init_dev() from ice_init() ice: move ice_init_pf() out of ice_init_dev() ice: move udp_tunnel_nic and misc IRQ setup into ice_init_pf() ice: ice_init_pf: destroy mutexes and xarrays on memory alloc failure ice: move ice_init_interrupt_scheme() prior ice_init_pf() ice: move service task start out of ice_init_pf() ice: enforce RTNL assumption of queue NAPI manipulation ==================== Link: https://patch.msgid.link/20251024204746.3092277-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 18:12:07 -07:00
Rakuram Eswaran	5c00da851c	net: tcp_lp: fix kernel-doc warnings and update outdated reference links Fix kernel-doc warnings in tcp_lp.c by adding missing parameter descriptions for tcp_lp_cong_avoid() and tcp_lp_pkts_acked() when building with W=1. Also replace invalid URLs in the file header comment with the currently valid links to the TCP-LP paper and implementation page. No functional changes. Signed-off-by: Rakuram Eswaran <rakuram.e96@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251025-net_ipv4_tcp_lp_c-v1-1-058cc221499e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 17:52:44 -07:00
Christophe JAILLET	294bfe0343	sctp: Constify struct sctp_sched_ops 'struct sctp_sched_ops' is not modified in these drivers. Constifying this structure moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 8019 568 0 8587 218b net/sctp/stream_sched_fc.o After: ===== text data bss dec hex filename 8275 312 0 8587 218b net/sctp/stream_sched_fc.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/dce03527eb7b7cc8a3c26d5cdac12bafe3350135.1761377890.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 17:50:55 -07:00
Bobby Eshleman	8443c31608	net: netmem: remove NET_IOV_MAX from net_iov_type enum Remove the NET_IOV_MAX workaround from the net_iov_type enum. This entry was previously added to force the enum size to unsigned long to satisfy the NET_IOV_ASSERT_OFFSET static assertions. After commit `f3d85c9ee5` ("netmem: introduce struct netmem_desc mirroring struct page") this approach became unnecessary by placing the net_iov_type after the netmem_desc. Placing the net_iov_type after netmem_desc results in the net_iov_type size having no effect on the position or layout of the fields that mirror the struct page. The layout before this patch: struct net_iov { union { struct netmem_desc desc; /* 0 48 */ struct { long unsigned int _flags; /* 0 8 */ long unsigned int pp_magic; /* 8 8 */ struct page_pool * pp; /* 16 8 */ long unsigned int _pp_mapping_pad; /* 24 8 */ long unsigned int dma_addr; /* 32 8 */ atomic_long_t pp_ref_count; /* 40 8 */ }; /* 0 48 */ }; /* 0 48 */ struct net_iov_area * owner; /* 48 8 */ enum net_iov_type type; /* 56 8 */ /* size: 64, cachelines: 1, members: 3 */ }; The layout after this patch: struct net_iov { union { struct netmem_desc desc; /* 0 48 */ struct { long unsigned int _flags; /* 0 8 */ long unsigned int pp_magic; /* 8 8 */ struct page_pool * pp; /* 16 8 */ long unsigned int _pp_mapping_pad; /* 24 8 */ long unsigned int dma_addr; /* 32 8 */ atomic_long_t pp_ref_count; /* 40 8 */ }; /* 0 48 */ }; /* 0 48 */ struct net_iov_area * owner; /* 48 8 */ enum net_iov_type type; /* 56 4 */ /* size: 64, cachelines: 1, members: 3 */ /* padding: 4 */ }; Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20251024-b4-devmem-remove-niov-max-v1-1-ba72c68bc869@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 17:41:46 -07:00
Eric Dumazet	c72568c21b	net: rps: softnet_data reorg to make enqueue_to_backlog() fast enqueue_to_backlog() is showing up in kernel profiles on hosts with many cores, when RFS/RPS is used. The following softnet_data fields need to be updated: - input_queue_tail - input_pkt_queue (next, prev, qlen, lock) - backlog.state (if input_pkt_queue was empty) Unfortunately they are currently using two cache lines: /* --- cacheline 3 boundary (192 bytes) --- */ call_single_data_t csd __attribute__((__aligned__(64))); /* 0xc0 0x20 */ struct softnet_data * rps_ipi_next; /* 0xe0 0x8 */ unsigned int cpu; /* 0xe8 0x4 */ unsigned int input_queue_tail; /* 0xec 0x4 */ struct sk_buff_head input_pkt_queue; /* 0xf0 0x18 */ /* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */ struct napi_struct backlog __attribute__((__aligned__(8))); /* 0x108 0x1f0 */ Add one ____cacheline_aligned_in_smp to make sure they now are using a single cache line. Also, because napi_struct has written fields, make @state its first field. We want to make sure that cpus adding packets to sd->input_pkt_queue are not slowing down cpus processing their backlog because of false sharing. After this patch new layout is: /* --- cacheline 5 boundary (320 bytes) --- */ long int pad[3] __attribute__((__aligned__(64))); /* 0x140 0x18 */ unsigned int input_queue_tail; /* 0x158 0x4 */ /* XXX 4 bytes hole, try to pack */ struct sk_buff_head input_pkt_queue; /* 0x160 0x18 */ struct napi_struct backlog __attribute__((__aligned__(8))); /* 0x178 0x1f0 */ Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251024091240.3292546-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 17:41:17 -07:00
Eric Dumazet	a086e9860c	net: optimize enqueue_to_backlog() for the fast path Add likely() and unlikely() clauses for the common cases: Device is running. Queue is not full. Queue is less than half capacity. Add max_backlog parameter to skb_flow_limit() to avoid a second READ_ONCE(net_hotdata.max_backlog). skb_flow_limit() does not need the backlog_lock protection, and can be called before we acquire the lock, for even better resistance to attacks. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251024090517.3289181-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 17:39:03 -07:00
Jakub Kicinski	34164142b5	tools: ynl: rework the string representation of NlError In early days of YNL development dumping the NlMsg on errors was quite useful, as the library itself could have been buggy. These days increasingly the NlMsg is just taking up screen space and means nothing to a typical user. Try to format the errors more in line with how YNL C formats its errors strings. Before: $ ynl --family ethtool --do channels-set --json '{}' Netlink error: Invalid argument nl_len = 44 (28) nl_flags = 0x300 nl_type = 2 error: -22 extack: {'miss-type': 'header'} $ ynl --family ethtool --do channels-set --json '{..., "tx-count": 999}' Netlink error: Invalid argument nl_len = 88 (72) nl_flags = 0x300 nl_type = 2 error: -22 extack: {'msg': 'requested channel count exceeds maximum', 'bad-attr': '.tx-count'} After: $ ynl --family ethtool --do channels-set --json '{}' Netlink error: Invalid argument {'miss-type': 'header'} $ ynl --family ethtool --do channels-set --json '{..., "tx-count": 999}' Netlink error: requested channel count exceeds maximum: Invalid argument {'bad-attr': '.tx-count'} Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20251027192958.2058340-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 16:35:06 -07:00
Jakub Kicinski	09e2603513	tools: ynl: fix indent issues in the main Python lib Class NlError() and operation_do_attributes() are indented by 2 spaces rather than 4 spaces used by the rest of the file. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20251027192958.2058340-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-28 16:35:06 -07:00
Paolo Abeni	cebba694d2	Merge branch 'net-stmmac-add-support-for-coarse-timestamping' Maxime Chevallier says: ==================== net: stmmac: Add support for coarse timestamping This is V2 for coarse timestamping support in stmmac. This version uses a dedicated devlink param "ts_coarse" to control this mode. This doesn't conflict with Russell's cleanup of hwif. Maxime [1] : https://lore.kernel.org/netdev/20200514102808.31163-1-olivier.dautricourt@orolia.com/ V1: https://lore.kernel.org/netdev/20251015102725.1297985-1-maxime.chevallier@bootlin.com/ ==================== Link: https://patch.msgid.link/20251024070720.71174-1-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-28 15:34:36 +01:00
Maxime Chevallier	6920fa0c76	net: stmmac: Add a devlink attribute to control timestamping mode The DWMAC1000 supports 2 timestamping configurations to configure how frequency adjustments are made to the ptp_clock, as well as the reported timestamp values. There was a previous attempt at upstreaming support for configuring this mode by Olivier Dautricourt and Julien Beraud a few years back [1]. In a nutshell, the timestamping can be either set in fine mode or in coarse mode. In fine mode, which is the default, we use the overflow of an accumulator to trigger frequency adjustments, but by doing so we lose precision on the timestamps that are produced by the timestamping unit. The main drawback is that the sub-second increment value, used to generate timestamps, can't be set to lower than (2 / ptp_clock_freq). The "fine" qualification comes from the frequent frequency adjustments we are able to do, which is perfect for a PTP follower use case. In coarse mode, we don't do frequency adjustments based on an accumulator overflow. We can therefore have very fine sub-second increment values, allowing for better timestamping precision. However this mode works best when the ptp clock frequency is adjusted based on an external signal, such as a PPS input produced by a GPS clock. This mode is therefore perfect for a grandmaster use case. Introduce a driver-specific devlink parameter "ts_coarse" to enable or disable coarse mode, keeping the "fine" mode as a default. This can then be changed with: devlink dev param set <dev> name ts_coarse value true cmode runtime The associated documentation is also added. 
[1] : https://lore.kernel.org/netdev/20200514102808.31163-1-olivier.dautricourt@orolia.com/ Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20251024070720.71174-3-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-28 15:34:35 +01:00
Maxime Chevallier	792000fbcd	net: stmmac: Move subsecond increment configuration in dedicated helper In preparation for fine/coarse support, let's move the subsecond increment and addend configuration in a dedicated helper. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251024070720.71174-2-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-28 15:34:34 +01:00
Paolo Abeni	d7d5eca4de	Merge branch 'net-macb-eyeq5-support' says: ==================== net: macb: EyeQ5 support This series' goal is adding support to the MACB driver for EyeQ5 GEM. The specifics for this compatible are: - HW cannot add dummy bytes at the start of IP packets for alignment purposes. The behavior can be detected using DCFG6 so it isn't attached to compatible data. - The hardware LSO/TSO is known to be buggy: add a compatible capability flag to force disable it. - At init, we have to wiggle two syscon registers that configure the PHY integration. In past attempts [0] we did it in macb_config->init() using a syscon regmap. That was far from ideal so now a generic PHY driver abstracts that away. We reuse the bp->sgmii_phy field used by some compatibles. We have to add a phy_set_mode() call as the PHY power on sequence depends on whether we do RGMII or SGMII. [0]: https://lore.kernel.org/lkml/20250627-macb-v2-15-ff8207d0bb77@bootlin.com/ Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> --- Changes in v3: - Drop Fixes: trailer on [2/5]. We don't fix any platform using the driver currently. - Improve [5/5] commit message; add info about how an unconditional phy_set_mode_ext() won't break existing platforms. - Hardbreak 82 characters line in [2/5]; warning by patchwork. - Trailers: - 1x Acked-by: Conor Dooley on [1/5]. - 2x Reviewed-by: Andrew Lunn on [1/5] and [4/5]. - 2x Reviewed-by: Maxime Chevallier on [4/5] and [5/5]. - Link to v2: https://lore.kernel.org/r/20251022-macb-eyeq5-v2-0-7c140abb0581@bootlin.com Changes in v2: - Drop non net-next patches. - Re-run get_maintainers.pl to shorten the To/Cc list. - Rebase upon latest net-next; no changes. Tested on HW. 
- Link to v1: https://lore.kernel.org/r/20251021-macb-eyeq5-v1-0-3b0b5a9d2f85@bootlin.com Past versions of the MACB EyeQ5 patches: - March 2025: [PATCH net-next 00/13] Support the Cadence MACB/GEM instances on Mobileye EyeQ5 SoCs https://lore.kernel.org/lkml/20250321-macb-v1-0-537b7e37971d@bootlin.com/ - June 2025: [PATCH net-next v2 00/18] Support the Cadence MACB/GEM instances on Mobileye EyeQ5 SoCs https://lore.kernel.org/lkml/20250627-macb-v2-0-ff8207d0bb77@bootlin.com/ - August 2025: [PATCH net v3 00/16] net: macb: various fixes & cleanup https://lore.kernel.org/lkml/20250808-macb-fixes-v3-0-08f1fcb5179f@bootlin.com/ --- Théo Lebrun (5): dt-bindings: net: cdns,macb: add Mobileye EyeQ5 ethernet interface net: macb: match skb_reserve(skb, NET_IP_ALIGN) with HW alignment net: macb: add no LSO capability (MACB_CAPS_NO_LSO) net: macb: rename bp->sgmii_phy field to bp->phy net: macb: Add "mobileye,eyeq5-gem" compatible .../devicetree/bindings/net/cdns,macb.yaml \| 10 +++ drivers/net/ethernet/cadence/macb.h \| 6 +- drivers/net/ethernet/cadence/macb_main.c \| 94 +++++++++++++++++----- 3 files changed, 91 insertions(+), 19 deletions(-) --- base-commit: `61b7ade9ba` change-id: 20251020-macb-eyeq5-fe2c0d1edc75 Best regards, ==================== Link: https://patch.msgid.link/20251023-macb-eyeq5-v3-0-af509422c204@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-28 15:17:56 +01:00
Théo Lebrun	48cf0be9b9	net: macb: Add "mobileye,eyeq5-gem" compatible Add support for the two GEM instances inside Mobileye EyeQ5 SoCs, using compatible "mobileye,eyeq5-gem". With it, add a custom init sequence that must grab a generic PHY and initialise it. We use bp->phy in both RGMII and SGMII cases. Tell our mode by adding a phy_set_mode_ext() during macb_open(), before phy_power_on(). We are the first users of bp->phy that use it in non-SGMII cases. The phy_set_mode_ext() call is made unconditionally. It cannot cause issues on platforms where !bp->phy or !bp->phy->ops->set_mode as, in those cases, the call is a no-op (returning zero). From reading upstream DTS, we can figure out that no platform has a bp->phy and a PHY driver that has a .set_mode() implementation: - cdns,zynqmp-gem: no DTS upstream. - microchip,mpfs-macb: microchip/mpfs.dtsi, &mac0..1, no PHY attached. - xlnx,versal-gem: xilinx/versal-net.dtsi, &gem0..1, no PHY attached. - xlnx,zynqmp-gem: xilinx/zynqmp.dtsi, &gem0..3, PHY attached to drivers/phy/xilinx/phy-zynqmp.c which has no .set_mode(). Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251023-macb-eyeq5-v3-5-af509422c204@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-28 15:17:54 +01:00
Théo Lebrun	3f7e51cd5f	net: macb: rename bp->sgmii_phy field to bp->phy The bp->sgmii_phy field is initialised at probe by init_reset_optional() if bp->phy_interface == PHY_INTERFACE_MODE_SGMII. It gets used by: - zynqmp_config: "cdns,zynqmp-gem" or "xlnx,zynqmp-gem" compatibles. - mpfs_config: "microchip,mpfs-macb" compatible. - versal_config: "xlnx,versal-gem" compatible. Make name more generic as EyeQ5 requires the PHY in SGMII & RGMII cases. Drop "for ZynqMP SGMII mode" comment that is already a lie, as it gets used on Microchip platforms as well. And soon it won't be SGMII-only. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> Link: https://patch.msgid.link/20251023-macb-eyeq5-v3-4-af509422c204@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-28 15:17:54 +01:00

1397468 Commits