linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 07:51:31 -04:00

Author	SHA1	Message	Date
Kuniyuki Iwashima	b7fdc3cfb6	ipmr: Move unregister_netdevice_many() out of ipmr_free_table(). This is a prep commit to convert ipmr_net_exit_batch() to ->exit_rtnl(). Let's move unregister_netdevice_many() in ipmr_free_table() to its callers. Now ipmr_rules_exit() can do batching all tables per netns. Note that later we will remove RTNL and unregister_netdevice_many() in ipmr_rules_init(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-9-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	3810f9529d	ipmr: Move unregister_netdevice_many() out of mroute_clean_tables(). This is a prep commit to convert ipmr_net_exit_batch() to ->exit_rtnl(). Let's move unregister_netdevice_many() in mroute_clean_tables() to its callers. As a bonus, mrtsock_destruct() can do batching for all tables. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-8-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	2c698bab29	ipmr: Convert ipmr_rtm_dumproute() to RCU. ipmr_rtm_dumproute() calls mr_table_dump() or mr_rtm_dumproute(), and mr_rtm_dumproute() finally calls mr_table_dump(). mr_table_dump() calls the passed function, _ipmr_fill_mroute(). _ipmr_fill_mroute() is a wrapper of ipmr_fill_mroute() to cast struct mr_mfc * to struct mfc_cache *. ipmr_fill_mroute() can be already called safely under RCU. Let's convert ipmr_rtm_dumproute() to RCU. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-7-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	295a17b3ea	ipmr: Convert ipmr_rtm_getroute() to RCU. ipmr_rtm_getroute() calls __ipmr_get_table(), ipmr_cache_find(), and ipmr_fill_mroute(). The table is not removed until netns dismantle, and net->ipv4.mr_tables is managed with RCU list API, so __ipmr_get_table() is safe under RCU. struct mfc_cache is freed by mr_cache_put() after RCU grace period, so we can use ipmr_cache_find() under RCU. rcu_read_lock() around it was just to avoid lockdep splat for rhl_for_each_entry_rcu(). ipmr_fill_mroute() calls mr_fill_mroute(), which properly uses RCU. Let's drop RTNL for ipmr_rtm_getroute() and use RCU instead. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-6-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	2bd6c9d600	ipmr: Use MAXVIFS in mroute_msgsize(). mroute_msgsize() calculates skb size needed for ipmr_fill_mroute(). The size differs based on mrt->maxvif. We will drop RTNL for ipmr_rtm_getroute() and mrt->maxvif may change under RCU. To avoid -EMSGSIZE, let's calculate the size with the maximum value of mrt->maxvif, MAXVIFS. struct rtnexthop is 8 bytes and MAXVIFS is 32, so the maximum delta is 256 bytes, which is small enough. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-5-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Kuniyuki Iwashima	402a8111d7	ipmr: Convert ipmr_rtm_dumplink() to RCU. net->ipv4.mr_tables is updated under RTNL and can be read safely under RCU. Once created, the multicast route tables are not removed until netns dismantle. ipmr_rtm_dumplink() does not need RTNL protection for ipmr_for_each_table() and ipmr_fill_table() if RCU is held. Even if mrt->maxvif changes concurrently, ipmr_fill_vif() returns true to continue dumping the next table. Let's convert it to RCU. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-4-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Kuniyuki Iwashima	261950e039	ipmr: Annotate access to mrt->mroute_do_{pim,assert,wrvifwhole}. These fields in struct mr_table are updated in ip_mroute_setsockopt() under RTNL: * mroute_do_pim * mroute_do_assert * mroute_do_wrvifwhole However, ip_mroute_getsockopt() does not hold RTNL and read the first two fields locklessly, and ip_mr_forward() reads all the three under RCU. pim_rcv_v1() also reads mroute_do_pim locklessly. Let's use WRITE_ONCE() and READ_ONCE() for them. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-3-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Kuniyuki Iwashima	05068eaa67	selftest: net: Add basic functionality tests for ipmr. The new test exercise paths, where RTNL is needed, to catch lockdep splat: setsockopt MRT_INIT / MRT_DONE MRT_ADD_VIF / MRT_DEL_VIF MRT_ADD_MFC / MRT_DEL_MFC / MRT_ADD_MFC_PROXY / MRT_DEL_MFC_PROXY MRT_TABLE MRT_FLUSH rtnetlink RTM_NEWROUTE RTM_DELROUTE NETDEV_UNREGISTER I will extend this to cover IPv6 setsockopt() later. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Eric Dumazet	a0e8c9a506	mpls: remove test against ipv6_stub ipv6_stub is never NULL, let's remove this test. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260228175715.1195536-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:48:43 -08:00
Jakub Kicinski	c34604572e	Merge branch 'net-sparx5-clean-up-probe-remove-init-and-deinit-paths' Daniel Machon says: ==================== net: sparx5: clean up probe/remove init and deinit paths This series refactors the sparx5 init and deinit code out of sparx5_start() and into probe(), adding proper per-subsystem cleanup labels and deinit functions. Currently, the sparx5 driver initializes most subsystems inside sparx5_start(), which is called from probe(). This includes registering netdevs, starting worker threads for stats and MAC table polling, requesting PTP IRQs, and initializing VCAP. The function has grown to handle many unrelated subsystems, and has no granular error handling — it either succeeds entirely or returns an error, leaving cleanup to a single catch-all label in probe(). The remove() path has a similar problem: teardown is not structured as the reverse of initialization, and several subsystems lack proper deinit functions. For example, the stats workqueue has no corresponding cleanup, and the mact workqueue is destroyed without first cancelling its delayed work. Refactor this by moving each init function out of sparx5_start() and into probe(), with a corresponding goto-based cleanup label. Add deinit functions for subsystems that allocate resources, to properly cancel work and destroy workqueues. Ensure that cleanup order in both error paths and remove() follows the reverse of initialization order. sparx5_start() is eliminated entirely — its hardware register setup is renamed to sparx5_forwarding_init() and its FDMA/XTR setup is extracted to sparx5_frame_io_init(). Before this series, most init functions live inside sparx5_start() with no individual cleanup: probe(): sparx5_start(): <- no granular error handling sparx5_mact_init() sparx_stats_init() <- starts worker, no cleanup mact_queue setup <- no cancel on teardown sparx5_register_netdevs() sparx5_register_notifier_blocks() sparx5_vcap_init() sparx5_ptp_init() probe() error path: cleanup_ports: sparx5_cleanup_ports() destroy_workqueue(mact_queue) After this series, probe() initializes subsystems in order with matching cleanup labels, and remove() tears down in reverse: probe(): sparx5_pgid_init() sparx5_vlan_init() sparx5_board_init() sparx5_forwarding_init() sparx5_calendar_init() -> cleanup_ports sparx5_qos_init() -> cleanup_ports sparx5_vcap_init() -> cleanup_ports sparx5_mact_init() -> cleanup_vcap sparx5_stats_init() -> cleanup_mact sparx5_frame_io_init() -> cleanup_stats sparx5_ptp_init() -> cleanup_frame_io sparx5_register_netdevs() -> cleanup_ptp sparx5_register_notifier_blocks() -> cleanup_netdevs remove(): sparx5_unregister_notifier_blocks() sparx5_unregister_netdevs() sparx5_ptp_deinit() sparx5_frame_io_deinit() sparx5_stats_deinit() sparx5_mact_deinit() sparx5_vcap_deinit() sparx5_destroy_netdevs() ==================== Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-0-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:24 -08:00
Daniel Machon	1e540c4d8f	net: sparx5: replace sparx5_start() with sparx5_forwarding_init() With all subsystem initializations moved out, sparx5_start() only sets up forwarding (UPSIDs, CPU ports, masks, PGIDs, FCS, watermarks). Rename it to sparx5_forwarding_init() and make it void since it cannot fail. This removes sparx5_start() entirely. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-9-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:22 -08:00
Daniel Machon	8b1e4a6747	net: sparx5: move FDMA/XTR initialization out of sparx5_start() Move the Frame DMA and register-based extraction initialization out of sparx5_start() and into a new sparx5_frame_io_init() function, called from probe(). Also, add sparx5_frame_io_deinit() for the cleanup path. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-8-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:22 -08:00
Daniel Machon	0432c60112	net: sparx5: move PTP IRQ handling out of sparx5_start() Move the PTP IRQ request into sparx5_ptp_init() so all PTP setup is done in one place. Also move the sparx5_ptp_init() call to right before sparx5_register_netdevs() and add a cleanup_ptp label. Update remove() to disable the PTP IRQ and reorder ptp_deinit accordingly. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-7-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:22 -08:00
Daniel Machon	cdc374359f	net: sparx5: move remaining init functions from start() to probe() Move sparx5_pgid_init(), sparx5_vlan_init(), and sparx5_board_init() from sparx5_start() to probe(). These functions do not require cleanup. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-6-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:21 -08:00
Daniel Machon	274182ff34	net: sparx5: move calendar initialization to probe Move the calendar initialization from sparx5_start() to probe() by creating a new sparx5_calendar_init() wrapper function that calls both sparx5_config_auto_calendar() and sparx5_config_dsm_calendar(). Calendar initialization does not require cleanup. Also, make the individual calendar config functions static since they are now only called from within sparx5_calendar.c. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-5-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:21 -08:00
Daniel Machon	e180067a03	net: sparx5: move stats initialization and add deinit function The sparx5_stats_init() function starts a worker thread which needs to be cleaned up. Move the initialization code to probe() and add a deinit() function for proper teardown. Also, rename sparx_stats_init() to sparx5_stats_init() to match the driver naming convention. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-4-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:21 -08:00
Daniel Machon	13cb1b6884	net: sparx5: move MAC table initialization and add deinit function Consolidate all MAC table initialization from sparx5_start() into sparx5_mact_init(), move it to probe(), and add a deinit function for proper teardown. Also, make sparx5_mact_pull_work() static since it is only used within sparx5_mactable.c. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-3-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:21 -08:00
Daniel Machon	3a95973e7c	net: sparx5: move VCAP initialization to probe Move the VCAP initialization code from sparx5_start() to probe(). Add proper error handling with a cleanup_vcap label and sparx5_vcap_deinit() call. Also, rename sparx5_vcap_destroy() to sparx5_vcap_deinit() to stay consistent with the naming. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-2-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:21 -08:00
Daniel Machon	b8909aad5b	net: sparx5: move netdev and notifier block registration to probe Move netdev registration and notifier block registration from sparx5_start() to probe(). This allows proper cleanup via goto-based error labels in probe(). Also, remove the sparx5_cleanup_ports() helper as its functionality is now split between sparx5_unregister_netdevs() and sparx5_destroy_netdevs() called at appropriate points. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-1-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:21 -08:00
Jakub Kicinski	d03c9ae654	Merge branch 'net-stmmac-further-cleanups' Russell King says: ==================== net: stmmac: further cleanups Yet another bunch of patches cleaning up the stmmac driver. We start off by cleaning up the formatting for stmmac_mac_finish(). Then remove a plat_dat->port_node which is redundant, followed by several descriptor methods that aren't called. We then remove useless dwmac4 interrupt definitions, and realise that v4.10 definitions are the same as v4.0, so get rid of those as well. We also remove the write-only priv->hw->xlgmac member. Next, we change priv->extend_desc and priv->chain_mode to be a boolean and document what each of these are doing. Also do the same for dma_cfg->fixed_burst and dma_cfg->mixed_burst. Then, move the initialisation of dma_cfg->atds into stmmac_hw_init() as this is where we have all the dependencies for this known, and simplify its initialisation. Also comment what this is doing. Finally, move the check that priv->plat->dma_cfg is present and the programmable burst limit is set into the driver probe rather than checking it each time we are just about to reset the dwmac core. It is unnecessary to keep checking this. This makes a platform glue driver fail early when it hasn't setup everything that's required rather than when attempting to bring the netdev up for the first time. ==================== Link: https://patch.msgid.link/aaFpZvuIzOLaNM0m@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:46 -08:00
Russell King (Oracle)	07a8531d44	net: stmmac: move DMA configuration validation to driver probe Move the DMA configuration validation from stmmac_init_dma_engine() to the start of the driver probe function. The platform glue is expected to supply the DMA configuration, and a non-zero programmable burst length (bpl). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuY3-0000000Avnq-1Spv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:07 -08:00
Russell King (Oracle)	93cde989bd	net: stmmac: simplify atds initialisation atds is boolean, and there is only one place that its value is changed. Simplify this to a boolean assignment. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXy-0000000Avnk-10Q8@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	0835bc7251	net: stmmac: move initialisation of dma_cfg->atds Move the initialisation of priv->plat->dma_cfg->atds, which indicates that 8 32-bit word descriptors are being used for pre-v4.0 cores, after the call to stmmac_hwif_init(), which will initialise priv->extend_desc and priv->mode (the descriptor mode.) We don't need to re-evaluate this in stmmac_init_dma_engine() - as the state that it depends on only changes in stmmac_hwif_init() which is only called in the probe path. Also, once set, no code clears this flag. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXt-0000000Avnc-0UYC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	1558705afb	net: stmmac: make dma_cfg mixed/fixed burst boolean struct stmmac_dma_cfg mixed_burst/fixed_burst members are both boolean in nature - of_property_read_bool() are used to read these from DT, and they are only tested for non-zero values. Use bool to avoid unnecessary padding in this structure. Update dwmac-intel to initialise these using true rather than '1', and remove the '0' initialisers as the struct is already zero initialised on allocation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXn-0000000AvnX-4A1u@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	a2a3832ad7	net: stmmac: make chain_mode a boolean priv->chain_mode is only tested for non-zero, so it can be a boolean. Change its type to boolean, and add a comment describing this member. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXi-0000000AvnR-3btC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	ecb037f58d	net: stmmac: make extend_desc boolean extend_desc is a boolean, so make it so, and use "true" to assign it. Add a comment to describe what this member does. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXd-0000000AvnL-36K3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	70bafb53b3	net: stmmac: remove mac->xlgmac mac->xlgmac is only ever written to by the dwxlgmac2_quirk() function. Remove mac->xlgmac, and the quirk function that then becomes redundant. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXY-0000000AvnF-2ccv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:06 -08:00
Russell King (Oracle)	0e7cb34d0f	net: stmmac: remove dwmac410_(enable\|disable)_dma_irq As a result of the previous cleanup, it is now obvious that there are no differences between the dwmac4 and dwmac410 versions of the DMA interrupt enable/disable functions. Moreover, dwmac410_disable_dma_irq() is completely unused; instead, dwmac4_disable_dma_irq() is used to disable the interrupts for v4.10a cores while dwmac410_enable_dma_irq() was being used to enable these same same interrupts. Remove the unnecessary v4.10a functions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXT-0000000Avn9-29US@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	d192529123	net: stmmac: remove dwmac4 DMA_CHAN_INTR_DEFAULT_[TR]X* Remove the DMA_CHAN_INTR_DEFAULT_[TR]X* definitions, which are aliases of their respective DMA_CHAN_INTR_ENA_[TR]IE definitions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXO-0000000Avn3-1hhD@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	19f2d59c3c	net: stmmac: remove .get_tx_len() No code calls stmmac_get_tx_len(). Remove this macro, its associated function pointer, and all implementations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXJ-0000000Avmx-1B8Y@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	1fe444bdc5	net: stmmac: remove .get_tx_ls() No code calls stmmac_get_tx_ls(). Remove this macro, its associated function pointer, and all implementations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuXE-0000000Avmr-0eB0@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	d48ba98bbc	net: stmmac: remove .get_tx_owner() No code calls stmmac_get_tx_owner(). Remove the macro, its associated function pointer, and all implementations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuX9-0000000Avml-08Lo@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	44a2ec96d3	net: stmmac: remove plat_dat->port_node There are repeated instances of: fwnode = priv->plat->port_node; if (!fwnode) fwnode = dev_fwnode(priv->device); However, the only place that ->port_node is set is stmmac_probe_config_dt(): struct device_node np = pdev->dev.of_node; ... / PHYLINK automatically parses the phy-handle property */ plat->port_node = of_fwnode_handle(np); which is equivalent to dev_fwnode(&pdev->dev) and, as priv->device will be &pdev->dev, is also equivalent to dev_fwnode(priv->device). Thus, plat_dat->port_node doesn't provide any extra benefit over using dev_fwnode(priv->device) directly. There is one case where port_node is used directly, which can be found in stmmac_pcs_setup(). This may cause a change of behaviour as PCI drivers do not populate plat_dat->port_node, but dev_fwnode(priv->device) may be valid. PCI-based stmmac should be tested. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuX3-0000000Avme-3oej@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:05 -08:00
Russell King (Oracle)	940ec40dd2	net: stmmac: clean up formatting in stmmac_mac_finish() Wrap the arguments for priv->plat->mac_finish() to avoid an overly long line. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvuWy-0000000AvmY-3GWN@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:35:04 -08:00
Jakub Kicinski	df548e627b	Merge branch 'selftests-drv-net-iou-zcrx-improve-stability-and-make-the-large-chunk-test-work' Jakub Kicinski says: ==================== selftests: drv-net: iou-zcrx: improve stability and make the large chunk test work The iou-zcrx test hasn't been passing in NIPA, I assumed it's because we're missing iouring changes, but it's still failing after the merge window. Turns out there was a bug in the implementation which was fixed separately via the iouring tree. With that out of the way the tests are passing but flaky. Patch 1 deals with the flakiness. While looking at this I also noticed that the large chunk test isn't running at all. So fix and enable it (patches 2 and 3). ==================== Link: https://patch.msgid.link/20260227171305.2848240-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:20 -08:00
Jakub Kicinski	c7b228418e	selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test The large chunks test needs 2MB hugepages for its mmap allocation, but the test system may not have any pre-allocated. Ensure at least 64 hugepages are available before running the test, and restore the original value on cleanup. While at it strip the stdout, it has a trailing new line. Before: ok 5 iou-zcrx.test_zcrx_large_chunks # SKIP Can't allocate huge pages Link: https://patch.msgid.link/20260227171305.2848240-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:16 -08:00
Jakub Kicinski	67792dde27	selftests: drv-net: iou-zcrx: rework large chunks test to use common setup Commit `a32bb32d01` ("selftests: iou-zcrx: test large chunk sizes") and commit `de7c600e2d` ("selftests/net: parametrise iou-zcrx.py with ksft_variants") landed at similar time. The large chunks test was actually not included in the list of tests, so it never run. We haven't noticed that it uses the old-style helpers (_get_combined_channels, _get_current_settings, _set_flow_rule) that were removed by the other commit. Rework test_zcrx_large_chunks to reuse the single() setup function and add it to the ksft_run cases list so it actually gets executed. Link: https://patch.msgid.link/20260227171305.2848240-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:15 -08:00
Jakub Kicinski	27c4ab9438	selftests: drv-net: iou-zcrx: wait for memory provider cleanup io_uring defers zcrx context teardown to the iou_exit workqueue. # ps aux \| grep iou ... 07:58 0:00 [kworker/u19:0-iou_exit] ... 07:58 0:00 [kworker/u18:2-iou_exit] When the test's receiver process exits, bkg() returns but the memory provider may still be attached to the rx queue. The subsequent defer() that restores tcp-data-split then fails: # Exception while handling defer / cleanup (callback 3 of 3)! # Defer Exception\| net.ynl.pyynl.lib.ynl.NlError: Netlink error: can't disable tcp-data-split while device has memory provider enabled: Invalid argument not ok 1 iou-zcrx.test_zcrx.single Add a helper that polls netdev queue-get until no rx queue reports the io-uring memory provider attribute. Register it as a defer() just before tcp-data-split is restored as a "barrier". Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://patch.msgid.link/20260227171305.2848240-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:21:15 -08:00
Eric Dumazet	8341c989ac	net: remove addr_len argument of recvmsg() handlers Use msg->msg_namelen as a place holder instead of a temporary variable, notably in inet[6]_recvmsg(). This removes stack canaries and allows tail-calls. $ scripts/bloat-o-meter -t vmlinux.old vmlinux add/remove: 0/0 grow/shrink: 2/19 up/down: 26/-532 (-506) Function old new delta rawv6_recvmsg 744 767 +23 vsock_dgram_recvmsg 55 58 +3 vsock_connectible_recvmsg 50 47 -3 unix_stream_recvmsg 161 158 -3 unix_seqpacket_recvmsg 62 59 -3 unix_dgram_recvmsg 42 39 -3 tcp_recvmsg 546 543 -3 mptcp_recvmsg 1568 1565 -3 ping_recvmsg 806 800 -6 tcp_bpf_recvmsg_parser 983 974 -9 ip_recv_error 588 576 -12 ipv6_recv_rxpmtu 442 428 -14 udp_recvmsg 1243 1224 -19 ipv6_recv_error 1046 1024 -22 udpv6_recvmsg 1487 1461 -26 raw_recvmsg 465 437 -28 udp_bpf_recvmsg 1027 984 -43 sock_common_recvmsg 103 27 -76 inet_recvmsg 257 175 -82 inet6_recvmsg 257 175 -82 tcp_bpf_recvmsg 663 568 -95 Total: Before=25143834, After=25143328, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260227151120.1346573-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:17:17 -08:00
Jakub Kicinski	f5ada26d6c	Merge tag 'phy-qcom-sgmii-eth-add-set_mode-and-validate-methods' net: stmmac: qcom-ethqos: further serdes reorganisation [part] First PHY patch of Russell's series. Vladimir will need this to avoid a conflict with his work. Link: https://patch.msgid.link/aaDSJAc-x2-klvHJ@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 15:46:10 -08:00
Russell King (Oracle)	4ff5801f45	phy: qcom-sgmii-eth: add .set_mode() and .validate() methods qcom-sgmii-eth is an Ethernet SerDes supporting only Ethernet mode using SGMII, 1000BASE-X and 2500BASE-X. Add an implementation of the .set_mode() method, which can be used instead of or as well as the .set_speed() method. The Ethernet interface modes mentioned above all have a fixed data rate, so setting the mode is sufficient to fully specify the operating parameters. Add an implementation of the .validate() method, which will be necessary to allow discovery of the SerDes capabilities for platform independent SerDes support in the stmmac network driver. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Acked-by: Vinod Koul <vkoul@kernel.org> Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vvkU3-0000000AuP2-0hu3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 15:41:38 -08:00
Jakub Kicinski	01857fc712	Merge branch 'net-sched-refactor-qdisc-drop-reasons-into-dedicated-tracepoint' Jesper Dangaard Brouer says: ==================== net: sched: refactor qdisc drop reasons into dedicated tracepoint This series refactors qdisc drop reason handling by introducing a dedicated enum qdisc_drop_reason and trace_qdisc_drop tracepoint, providing qdisc layer drop diagnostics with direct qdisc context visibility. Background: ----------- Identifying which qdisc dropped a packet via skb_drop_reason is difficult. Normally, the kfree_skb tracepoint caller "location" hints at the dropping code, but qdisc drops happen at a central point (__dev_queue_xmit), making this unusable. As a workaround, commits `5765c7f6e3` ("net_sched: sch_fq: add three drop_reason") and `a42d71e322` ("net_sched: sch_cake: Add drop reasons") encoded qdisc names directly in the drop reason enums. This series provides a cleaner solution by creating a dedicated qdisc tracepoint that naturally includes qdisc context (handle, parent, kind). Solution: --------- Create a new tracepoint trace_qdisc_drop that builds on top of existing trace_qdisc_enqueue infrastructure. It includes qdisc handle, parent, qdisc kind (name), and device information directly. The existing SKB_DROP_REASON_QDISC_DROP is retained for backwards compatibility via kfree_skb_reason(). The qdisc-specific drop reasons (QDISC_DROP_*) provide fine-grained detail via the new tracepoint. The enum uses subsystem encoding (offset by SKB_DROP_REASON_SUBSYS_QDISC) to catch type mismatches during debugging. This implements the alternative approach described in: https://lore.kernel.org/all/6be17a08-f8aa-4f91-9bd0-d9e1f0a92d90@kernel.org/ ==================== Link: https://patch.msgid.link/177211325634.3011628.9343837509740374154.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:38 -08:00
Jesper Dangaard Brouer	67713dff63	net: sched: sch_dualpi2: use qdisc_dequeue_drop() for dequeue drops DualPI2 drops packets during dequeue but was using kfree_skb_reason() directly, bypassing trace_qdisc_drop. Convert to qdisc_dequeue_drop() and add QDISC_DROP_L4S_STEP_NON_ECN to the qdisc drop reason enum. - Set TCQ_F_DEQUEUE_DROPS flag in dualpi2_init() - Use enum qdisc_drop_reason in drop_and_retry() - Replace kfree_skb_reason() with qdisc_dequeue_drop() Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211351978.3011628.11267023360997620069.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:35 -08:00
Jesper Dangaard Brouer	9d3e7f9718	net: sched: rename QDISC_DROP_CAKE_FLOOD to QDISC_DROP_FLOOD_PROTECTION Rename QDISC_DROP_CAKE_FLOOD to QDISC_DROP_FLOOD_PROTECTION to use a generic name without embedding the qdisc name. This follows the principle that drop reasons should describe the drop mechanism rather than being tied to a specific qdisc implementation. The flood protection drop reason is used by qdiscs implementing probabilistic drop algorithms (like BLUE) that detect unresponsive flows indicating potential DoS or flood attacks. CAKE uses this via its Cobalt AQM component. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211347537.3011628.13759059534638729639.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:35 -08:00
Jesper Dangaard Brouer	f30d9073ec	net: sched: rename QDISC_DROP_FQ_* to generic names Rename FQ-specific drop reasons to generic names: - QDISC_DROP_FQ_BAND_LIMIT -> QDISC_DROP_BAND_LIMIT - QDISC_DROP_FQ_HORIZON_LIMIT -> QDISC_DROP_HORIZON_LIMIT This follows the principle that drop reasons should describe the drop mechanism rather than being tied to a specific qdisc implementation. These concepts (priority band limits, timestamp horizon) could apply to other qdiscs as well. Remove the local macro define FQDR() and instead use the full QDISC_DROP_* name to make it easier to navigate code. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211346902.3011628.12523261489552097455.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:35 -08:00
Jesper Dangaard Brouer	3e28f8ad47	net: sched: sfq: convert to qdisc drop reasons Convert SFQ to use the new qdisc-specific drop reason infrastructure. This patch demonstrates how to convert a flow-based qdisc to use the new enum qdisc_drop_reason. As part of this conversion: - Add QDISC_DROP_MAXFLOWS for flow table exhaustion - Rename FQ_FLOW_LIMIT to generic FLOW_LIMIT, now shared by FQ and SFQ - Use QDISC_DROP_OVERLIMIT for sfq_drop() when overall limit exceeded - Use QDISC_DROP_FLOW_LIMIT for per-flow depth limit exceeded The FLOW_LIMIT reason is now a common drop reason for per-flow limits, applicable to both FQ and SFQ qdiscs. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211345946.3011628.12770616071857185664.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:34 -08:00
Jesper Dangaard Brouer	ff2998f29f	net: sched: introduce qdisc-specific drop reason tracing Create new enum qdisc_drop_reason and trace_qdisc_drop tracepoint for qdisc layer drop diagnostics with direct qdisc context visibility. The new tracepoint includes qdisc handle, parent, kind (name), and device information. Existing SKB_DROP_REASON_QDISC_DROP is retained for backwards compatibility via kfree_skb_reason(). Convert qdiscs with drop reasons to use the new infrastructure. Change CAKE's cobalt_should_drop() return type from enum skb_drop_reason to enum qdisc_drop_reason to fix implicit enum conversion warnings. Use QDISC_DROP_UNSPEC as the 'not dropped' sentinel instead of SKB_NOT_DROPPED_YET. Both have the same compiled value (0), so the comparison logic remains semantically equivalent. Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/177211345275.3011628.1974310302645218067.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:31:34 -08:00
Jakub Kicinski	52d534aa66	Merge branch 'icmp-fix-icmp-error-source-address-over-xfrm-tunnel' Antony Antony says: ==================== icmp: Fix icmp error source address over xfrm tunnel icmp: Fix icmp error source address over xfrm tunnel This fix, originally sent to XFRM/IPsec, has been recommended by Steffen Klassert to submit to the net tree, since it changes ICMP behavior. The patch addresses a minor issue related to the IPv4 source address of ICMP error messages. The bug only occurs when xfrm policies are configured. It originated from an old 2011 commit: commit `415b3334a2` ("icmp: Fix regression in nexthop resolution during replies.") ==================== Link: https://patch.msgid.link/cover.1772101380.git.antony.antony@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:08:20 -08:00
Antony Antony	5b43d35e57	selftests: net: add ICMP error source address test over xfrm tunnel Test that ICMP error messages generated by an IPsec gateway use the correct source address (the gateway's address, not the unreachable destination). Signed-off-by: Antony Antony <antony.antony@secunet.com> Link: https://patch.msgid.link/79d526f96cf2252d71550d38772876bc72c7e3c7.1772101380.git.antony.antony@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:08:15 -08:00
Antony Antony	595da751c8	icmp: fix ICMP error source address when xfrm policy matches When an IPsec gateway generates an ICMP error (e.g., Destination Host Unreachable), the source address incorrectly shows the unreachable destination instead of the gateway's address. IPv6 behaves correctly. Before fix: ping 10.1.6.3 From 10.1.6.3 icmp_seq=1 Destination Host Unreachable (wrong - 10.1.6.3 is the unreachable host) After fix: ping 10.1.6.3 From 10.1.5.2 icmp_seq=1 Destination Host Unreachable (correct - 10.1.5.2 is the gateway) The fix removes the memcpy that overwrote fl4 with fl4_dec after xfrm_lookup(). A follow-up commit adds a selftest. Fixes: `415b3334a2` ("icmp: Fix regression in nexthop resolution during replies.") Cc: stable+noautosel@kernel.org # Avoid false positives in tests Signed-off-by: Antony Antony <antony.antony@secunet.com> Acked-by: Tobias Brunner <tobias@strongswan.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/19a0156ff6e76baa323a81d710510d399a6ff63a.1772101380.git.antony.antony@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 15:08:15 -08:00

1 2 3 4 5 ...

1427176 Commits