linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 16:01:44 -04:00

Author	SHA1	Message	Date
Zeeshan Ahmad	d6ca199568	net: core: failover: enforce mandatory ops and clean up redundant checks The failover framework requires 'ops' to be functional. Currently, failover_register() allows an instance to be registered with NULL ops, which leads to inconsistent NULL checks and potential NULL pointer dereferences in the slave registration paths. Harden the entry point by requiring non-NULL ops in failover_register(). This ensures the 'fops' pointer is guaranteed to be valid for any successfully registered failover instance. Consequently, remove the now redundant NULL checks for 'fops' throughout the module to simplify the logic. Signed-off-by: Zeeshan Ahmad <zeeshanahmad022019@gmail.com> Link: https://patch.msgid.link/20260302064317.9964-1-zeeshanahmad022019@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:44:11 -08:00
Breno Leitao	dfa77c0dd4	selftests: netconsole: print diagnostic on busywait timeout in netcons_basic The script uses set -euo pipefail, so when busywait times out waiting for the netconsole message to arrive, it returns 1 and the script exits immediately without printing any error message. As reported by Jakub, this makes failures hard to diagnose since the test reports exit=1 with no explanation. Handle the busywait failure explicitly so that a FAIL message is printed before exiting. This is how it looks like now: Running with target mode: basic (ipv6) [ 167.452561] netconsole selftest: netcons_QdMay FAIL: Timed out waiting (20000 ms) for netconsole message in /tmp/netcons_QdMay The remaining silent failures under set -e can only happen during the setup phase (netdevsim creation, interface configuration, configfs writes). So, it is not expected to have any silent failure once the test starts. Note that this issue might be less frequent now, since commit `a68a9bd086` ("selftests: netconsole: Increase port listening timeout") increased the timeout that _might_ have been the root cause of these random failures in NIPA. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260302-netconsole_test_verbose-v1-1-b1be5d30cd7d@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:22:54 -08:00
Jakub Kicinski	1085c258d8	Merge branch 'grab-ipa-imem-slice-through-dt' Konrad Dybcio says: ==================== Grab IPA IMEM slice through DT This adds the necessary driver change to migrate over from hardcoded-per-IPA-version-but-varying-per-implementation numbers, while unfortunately keeping them in there for backwards compatibility. The DT changes will be submitted in a separate series, this one is OK to merge independently. ==================== Link: https://patch.msgid.link/20260302-topic-ipa_imem-v6-0-c0ebbf3eae9f@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:22:18 -08:00
Konrad Dybcio	6f82cb4ecd	net: ipa: Grab IMEM slice base/size from DTS This is a detail that differs per chip, and not per IPA version (and there are cases of the same IPA versions being implemented across very very very different SoCs). This region isn't actually used by the driver, but we most definitely want to iommu-map it, so that IPA can poke at the data within. Reviewed-by: Alex Elder <elder@riscstar.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20260302-topic-ipa_imem-v6-3-c0ebbf3eae9f@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:22:14 -08:00
Konrad Dybcio	f5a598abfd	dt-bindings: net: qcom,ipa: Add sram property for describing IMEM slice The IPA driver currently grabs a slice of IMEM through hardcoded addresses. Not only is that ugly and against the principles of DT, but it also creates a situation where two distinct platforms implementing the same version of IPA would need to be hardcoded together and matched at runtime. Instead, do the sane thing and accept a handle to said region directly. Don't make it required on purpose, as it's not there on ancient implementations (currently unsupported) and we're not yet done with filling the data across all DTs. Reviewed-by: Alex Elder <elder@riscstar.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20260302-topic-ipa_imem-v6-2-c0ebbf3eae9f@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:22:14 -08:00
Konrad Dybcio	ca4c7771a0	dt-bindings: sram: qcom,imem: Allow modem-tables subnode The IP Accelerator hardware/firmware owns a sizeable region within the IMEM, named 'modem-tables', containing various packet processing configuration data. It's not actually accessed by the OS, although we have to IOMMU-map it with the IPA device, so that presumably the firmware can act upon it. Allow it as a subnode of IMEM. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: Alex Elder <elder@riscstar.com> Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20260302-topic-ipa_imem-v6-1-c0ebbf3eae9f@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:22:14 -08:00
Sean Chang	acd338ba2f	net: macb: use ethtool_sprintf to fill ethtool stats strings The RISC-V toolchain triggers a stringop-truncation warning when using snprintf() with a fixed ETH_GSTRING_LEN (32 bytes) buffer. Convert the driver to use the modern ethtool_sprintf() API from linux/ethtool.h. This removes the need for manual snprintf() and memcpy() calls, handles the 32-byte padding automatically, and simplifies the logic by removing manual pointer arithmetic. Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Sean Chang <seanwascoding@gmail.com> Link: https://patch.msgid.link/20260302142931.49108-1-seanwascoding@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:18:00 -08:00
Kohei Enju	39feb171f3	net: core: allow netdev_upper_get_next_dev_rcu from bh context Since XDP programs are called from a NAPI poll context, the RCU reference liveness is ensured by local_bh_disable(). Commit `aeea1b86f9` ("bpf, devmap: Exclude XDP broadcast to master device") started to call netdev_upper_get_next_dev_rcu() from this context, but missed adding rcu_read_lock_bh_held() as a condition to the RCU checks. While both bh_disabled and rcu_read_lock() provide RCU protection, lockdep complains since the check condition is insufficient [1]. Add rcu_read_lock_bh_held() as condition to help lockdep to understand the dereference is safe, in the same way as commit `694cea395f` ("bpf: Allow RCU-protected lookups to happen from bh context"). [1] WARNING: net/core/dev.c:8099 at netdev_upper_get_next_dev_rcu+0x96/0xd0, CPU#0: swapper/0/0 ... RIP: 0010:netdev_upper_get_next_dev_rcu+0x96/0xd0 ... <IRQ> dev_map_enqueue_multi+0x411/0x970 xdp_do_redirect+0xdf2/0x1030 __igc_xdp_run_prog+0x6a0/0xc80 igc_poll+0x34b0/0x70b0 __napi_poll.constprop.0+0x98/0x490 net_rx_action+0x8f2/0xfa0 handle_softirqs+0x1c7/0x710 __irq_exit_rcu+0xb1/0xf0 irq_exit_rcu+0x9/0x20 common_interrupt+0x7f/0x90 </IRQ> Signed-off-by: Kohei Enju <kohei@enjuk.jp> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260220110922.94781-1-kohei@enjuk.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:15:20 -08:00
Tomasz Unger	8ce185c7e0	NFC: s3fwrn5: Replace strcpy() with strscpy() Replace strcpy() with strscpy() which limits the copy to the size of the destination buffer. Since fw_info->fw_name is an array with a fixed, declared size, the two-argument variant of strscpy() is used - the compiler deduces the buffer size automatically. This is a defensive cleanup replacing the deprecated strcpy() with the preferred strscpy(). Signed-off-by: Tomasz Unger <tomasz.unger@yahoo.pl> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20260302100908.26399-1-tomasz.unger@yahoo.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:13:18 -08:00
Tomasz Unger	c49a9eb650	NFC: nfcmrvl: Replace strcpy() with strscpy() Replace strcpy() with strscpy() which limits the copy to the size of the destination buffer. Since fw_dnld->name is an array, the two-argument variant of strscpy() is used - the compiler deduces the buffer size automatically. This is a defensive cleanup replacing the deprecated strcpy() with the preferred strscpy(). Signed-off-by: Tomasz Unger <tomasz.unger@yahoo.pl> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260301144345.218628-1-tomasz.unger@yahoo.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:13:11 -08:00
Tomasz Unger	66e807f96f	NFC: nxp-nci: Replace strcpy() with strscpy() Replace strcpy() with strscpy() which limits the copy to the size of the destination buffer. Since fw_info->name is an array, the two-argument variant of strscpy() is used - the compiler deduces the buffer size automatically. This is a defensive cleanup replacing the deprecated strcpy() with the preferred strscpy(). Signed-off-by: Tomasz Unger <tomasz.unger@yahoo.pl> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260301135633.214497-1-tomasz.unger@yahoo.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:12:53 -08:00
Tomasz Unger	e63f5918ad	NFC: pn544: i2c: Replace strcpy() with strscpy() Replace strcpy() with strscpy() which limits the copy to the size of the destination buffer. Since phy->firmware_name is an array, the two-argument variant of strscpy() is used - the compiler deduces the buffer size automatically. This is a defensive cleanup. As pointed out by Jakub Kicinski <kuba@kernel.org>, firmware_name is already bounded to NFC_FIRMWARE_NAME_MAXSIZE via nla_strscpy() in net/nfc/netlink.c before reaching this driver, so no actual buffer overflow is possible. Signed-off-by: Tomasz Unger <tomasz.unger@yahoo.pl> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260301121254.174354-1-tomasz.unger@yahoo.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:12:46 -08:00
Paolo Abeni	ed0abfe93f	Merge branch 'net-phy-improve-stats-handling-in-mdio_bus-c' Heiner Kallweit says: ==================== net: phy: improve stats handling in mdio_bus.c Improve stats handling in mdio_bus.c. ==================== Link: https://patch.msgid.link/799114be-1456-442b-b479-142e7ee9d254@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:47 +01:00
Heiner Kallweit	1afccc5a20	net: phy: improve mdiobus_stats_acct - Remove duplicated preempt disable. Disabling preemption has been added to functions like u64_stats_update_begin() in the meantime. - Simplify branch structure Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/2ceeb542-986a-404e-ad0f-62e0a938ce7c@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:45 +01:00
Heiner Kallweit	7f97ca5f98	net: phy: inline helper mdio_bus_get_global_stat mdio_bus_get_global_stat() has only one user. Inline it to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/7876625a-bd6f-42b4-8eb3-420f39d2f59a@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:45 +01:00
Heiner Kallweit	8e0bdf30be	net: mdio: use macro __ATTRIBUTE_GROUPS Use macro __ATTRIBUTE_GROUPS() to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/260fb184-c662-415c-b288-e1423097f2b9@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:45 +01:00
Heiner Kallweit	a4c08b7015	net: mdio: constify attributes and attribute arrays Constify attributes and attribute arrays, using new member attrs_const of struct attribute_group. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/c20f17bb-3489-42b5-b8fe-457245ac6cb3@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:45 +01:00
Heiner Kallweit	c599649d05	net: phy: avoid extra casting in mdio_bus_get_stat Using void * instead of char * allows to remove one cast. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/054bbf60-d8ac-45ce-8b80-9c396469b7f9@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:45 +01:00
Heiner Kallweit	8068acaff1	net: phy: consider that mdio_bus_device_stat_field_show doesn't use member address mdio_bus_device_stat_field_show() doesn't use the address member, so we don't have to initialize it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/03a812a7-6871-4cc0-b5bf-ee80c6d6b5fd@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:45 +01:00
Heiner Kallweit	807d8addc3	net: mdio: use macro __ATTR to simplify the code Use macro __ATTR to simplify the code. Note that __ATTR can't be used in MDIO_BUS_STATS_ADDR_ATTR_DECL because the included stringification would conflict with how argument file is passed. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/4877a4dc-247c-4453-b281-20a8d969b15b@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:45 +01:00
Heiner Kallweit	5c494f404a	net: mdio: extend struct mdio_bus_stat_attr instead of using dev_ext_attribute Currently the var member of struct dev_ext_attribute is used in a very ugly way. Extend struct mdio_bus_stat_attr instead, what allows to simplify the code and also slightly reduces memory footprint. Note: Member addr is renamed to avoid a conflict in macro MDIO_BUS_STATS_ADDR_ATTR_DECL. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/ce9f85d2-4f72-4b15-b868-210a8ced662d@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:23:44 +01:00
Heiner Kallweit	e07bd1f716	net: ti: davinci_emac: stop using bus type mdio_bus_type This driver is the only user of mdio_bus_type outside phylib. Using mdio_bus_type isn't strictly needed here, so use an alternative approach. This will allow to make mdio_bus_type private to phylib in a follow-up series. Compile-tested only. Note: Devices supported by this driver are OF-only, therefore the string comparison in match_first_device() isn't needed any longer. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/cc8e83aa-48c3-4497-b6ad-760a7f9e25dc@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 13:12:08 +01:00
MD Danish Anwar	f56438a74d	net: ti: icssg: Add HSR/PRP protocol frame filtering Add support for HSR and PRP protocol frame filtering in the ICSSG classifier by configuring filter table 3 (FT3) to detect PTP frames (EtherType 0x88F7) in HSR/PRP tagged packets. Also add rx_class_or_base to miig_rt_offsets structure to support RX_CLASS_OR register access, and fix typos in FT1_N_REG and FT3_N_REG macros (slize -> slice). Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20260227174254.3821443-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 12:45:30 +01:00
Paolo Abeni	c006396f59	Merge branch 'dpll-zl3073x-consolidate-chip-info-and-add-temperature-reporting' Ivan Vecera says: ==================== dpll: zl3073x: consolidate chip info and add temperature reporting This series refactors the ZL3073x chip variant handling and adds die temperature reporting for chips that support it. Patch 1 replaces the five per-variant chip_info structures and their exported symbols with a single consolidated lookup table. The chip variant is now detected at runtime from the chip ID register rather than being selected at compile time via bus driver match data. This simplifies the I2C/SPI drivers and makes adding new variants a single-line table addition. A flags field replaces the hardcoded chip_id switch in zl3073x_dev_is_ref_phase_comp_32bit(). Patch 2 uses the new flags infrastructure to add die temperature reporting for chip variants that provide a temperature status register. The temp_get callback is conditionally set during device registration based on the ZL3073X_FLAG_DIE_TEMP chip flag. ==================== Link: https://patch.msgid.link/20260227105300.710272-1-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 12:15:32 +01:00
Ivan Vecera	3a97e02b3e	dpll: zl3073x: add die temperature reporting for supported chips Some zl3073x chip variants (0x1Exx, 0x2Exx and 0x3FC4) provide a die temperature status register with 0.1 C resolution. Add a ZL3073X_FLAG_DIE_TEMP chip flag to identify these variants and implement zl3073x_dpll_temp_get() as the dpll_device_ops.temp_get callback. The register value is converted from 0.1 C units to millidegrees as expected by the DPLL subsystem. To support per-instance ops selection, copy the base dpll_device_ops into struct zl3073x_dpll and conditionally set .temp_get during device registration based on the chip flag. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20260227105300.710272-3-ivecera@redhat.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 12:15:30 +01:00
Ivan Vecera	4845f2fff7	dpll: zl3073x: detect DPLL channel count from chip ID at runtime Replace the five per-variant zl3073x_chip_info structures and their exported symbol definitions with a single consolidated chip ID lookup table. The chip variant is now detected at runtime by reading the chip ID register from hardware and looking it up in the table, rather than being selected at compile time via the bus driver match data. Repurpose struct zl3073x_chip_info to hold a single chip ID, its channel count, and a flags field. Introduce enum zl3073x_flags with ZL3073X_FLAG_REF_PHASE_COMP_32 to replace the chip_id switch statement in zl3073x_dev_is_ref_phase_comp_32bit(). Store a pointer to the detected chip_info entry in struct zl3073x_dev for runtime access. This simplifies the bus drivers by removing per-variant .data and .driver_data references from the I2C/SPI match tables, and makes adding support for new chip variants a single-line table addition. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20260227105300.710272-2-ivecera@redhat.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 12:15:30 +01:00
Dipayaan Roy	2b12ffb669	net: mana: Trigger VF reset/recovery on health check failure due to HWC timeout The GF stats periodic query is used as a mechanism to monitor HWC health. If this HWC command times out, it is a strong indication that the device/SoC is in a faulty state and requires recovery. Today, when a timeout is detected, the driver marks hwc_timeout_occurred, clears cached stats, and stops rescheduling the periodic work. However, the device itself is left in the same failing state. Extend the timeout handling path to trigger the existing MANA VF recovery service by queueing a GDMA_EQE_HWC_RESET_REQUEST work item. This is expected to initiate the appropriate recovery flow by attempting suspend/resume first and, if that fails, triggering a bus rescan. This change is intentionally limited to HWC command timeouts and does not trigger recovery for errors reported by the SoC as a normal command response. Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/aaFShvKnwR5FY8dH@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 11:14:22 +01:00
Randy Dunlap	c69855ada2	atm: atmdev: add function parameter names and description kernel-doc reports function parameters not described for parameters that are not named. Add parameter names for these functions and then describe the function parameters in kernel-doc format. Fixes these warnings: Warning: include/linux/atmdev.h:316 function parameter '' not described in 'register_atm_ioctl' Warning: include/linux/atmdev.h:321 function parameter '' not described in 'deregister_atm_ioctl' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20260228220845.2978547-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:55:21 -08:00
Kuniyuki Iwashima	425e080a1c	dccp: Remove inet_hashinfo2_init_mod(). Commit `c92c81df93` ("net: dccp: fix kernel crash on module load") added inet_hashinfo2_init_mod() for DCCP. Commit `22d6c9eebf` ("net: Unexport shared functions for DCCP.") removed its EXPORT_SYMBOL_GPL() but forgot to remove the function itself. Let's remove inet_hashinfo2_init_mod(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260301063756.1581685-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:50:28 -08:00
Jakub Kicinski	cd99465222	Merge branch 'ipmr-no-rtnl-for-rtnl_family_ipmr-rtnetlink' Kuniyuki Iwashima says: ==================== ipmr: No RTNL for RTNL_FAMILY_IPMR rtnetlink. This series removes RTNL from ipmr rtnetlink handlers. After this series, there are a few RTNL left in net/ipv4/ipmr.c and such users will be converted to per-netns RTNL in another series. Patch 1 adds a selftest to exercise most? of the RTNL paths in net/ipv4/ipmr.c Patch 2 - 6 converts RTM_GETLINK / RTM_GETROUTE handlers to RCU. Patch 7 - 9 converts ->exit_batch() to ->exit_rtnl() to save one RTNL in cleanup_net(). Patch 10 - 11 removes unnecessary RTNL during setup_net() failure. Patch 12 is a random cleanup. Patch 13 - 15 drops RTNL for RTM_NEWROUTE and RTM_DELROUTE. ==================== Link: https://patch.msgid.link/20260228221800.1082070-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:43 -08:00
Kuniyuki Iwashima	bddafc06ca	ipmr: Don't hold RTNL for ipmr_rtm_route(). ipmr_mfc_add() and ipmr_mfc_delete() are already protected by a dedicated mutex. rtm_to_ipmr_mfcc() calls __ipmr_get_table(), __dev_get_by_index(), and ipmr_find_vif(). Once __dev_get_by_index() is converted to dev_get_by_index_rcu(), we can move the other two functions under that same RCU section and drop RTNL for ipmr_rtm_route(). Let's do that conversion and drop ASSERT_RTNL() in mr_call_mfc_notifiers(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-16-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:41 -08:00
Kuniyuki Iwashima	3c1e53e554	ipmr: Add dedicated mutex for mrt->{mfc_hash,mfc_cache_list}. We will no longer hold RTNL for ipmr_rtm_route() to modify the MFC hash table. Only __dev_get_by_index() in rtm_to_ipmr_mfcc() is RTNL-dependent; otherwise, we just need protection for mrt->mfc_hash and mrt->mfc_cache_list. Let's add a new mutex for ipmr_mfc_add(), ipmr_mfc_delete(), and mroute_clean_tables() (setsockopt(MRT_FLUSH or MRT_DONE)). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-15-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:41 -08:00
Kuniyuki Iwashima	4480d5fa1f	ipmr/ip6mr: Convert net->ipv[46].ipmr_seq to atomic_t. We will no longer hold RTNL for ipmr_mfc_add() and ipmr_mfc_delete(). MFC entry can be loosely connected with VIF by its index for mrt->vif_table[] (stored in mfc_parent), but the two tables are not synchronised. i.e. Even if VIF 1 is removed, MFC for VIF 1 is not automatically removed. The only field that the MFC/VIF interfaces share is net->ipv[46].ipmr_seq, which is protected by RTNL. Adding a new mutex for both just to protect a single field is overkill. Let's convert the field to atomic_t. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-14-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:41 -08:00
Kuniyuki Iwashima	1c36d186a0	ipmr: Define net->ipv4.{ipmr_notifier_ops,ipmr_seq} under CONFIG_IP_MROUTE. net->ipv4.ipmr_notifier_ops and net->ipv4.ipmr_seq are used only in net/ipv4/ipmr.c. Let's move these definitions under CONFIG_IP_MROUTE. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-13-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:41 -08:00
Kuniyuki Iwashima	478c2add78	ipmr: Call fib_rules_unregister() without RTNL. fib_rules_unregister() removes ops from net->rules_ops under spinlock, calls ops->delete() for each rule, and frees the ops. ipmr_rules_ops_template does not have ->delete(), and any operation does not require RTNL there. Let's move fib_rules_unregister() from ipmr_rules_exit_rtnl() to ipmr_net_exit(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-12-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:41 -08:00
Kuniyuki Iwashima	4a11adcd9e	ipmr: Remove RTNL in ipmr_rules_init() and ipmr_net_init(). When ipmr_free_table() is called from ipmr_rules_init() or ipmr_net_init(), the netns is not yet published. Thus, no device should have been registered, and mroute_clean_tables() will not call vif_delete(), so unregister_netdevice_many() is unnecessary. unregister_netdevice_many() does nothing if the list is empty, but it requires RTNL due to the unconditional ASSERT_RTNL() at the entry of unregister_netdevice_many_notify(). Let's remove unnecessary RTNL and ASSERT_RTNL() and instead add WARN_ON_ONCE() in ipmr_free_table(). Note that we use a local list for the new WARN_ON_ONCE() because dev_kill_list passed from ipmr_rules_exit_rtnl() may have some devices when other ops->init() fails after ipmr during setup_net(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-11-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	b22b018674	ipmr: Convert ipmr_net_exit_batch() to ->exit_rtnl(). ipmr_net_ops uses ->exit_batch() to acquire RTNL only once for dying network namespaces. ipmr does not depend on the ordering of ->exit_rtnl() and ->exit_batch() of other pernet_operations (unlike fib_net_ops). Once ipmr_free_table() is called and all devices are queued for destruction in ->exit_rtnl(), later during NETDEV_UNREGISTER, ipmr_device_event() will not see anything in vif table and just do nothing. Let's convert ipmr_net_exit_batch() to ->exit_rtnl(). Note that fib_rules_unregister() does not need RTNL and we will remove RTNL and unregister_netdevice_many() in ipmr_net_init(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-10-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	b7fdc3cfb6	ipmr: Move unregister_netdevice_many() out of ipmr_free_table(). This is a prep commit to convert ipmr_net_exit_batch() to ->exit_rtnl(). Let's move unregister_netdevice_many() in ipmr_free_table() to its callers. Now ipmr_rules_exit() can do batching all tables per netns. Note that later we will remove RTNL and unregister_netdevice_many() in ipmr_rules_init(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-9-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	3810f9529d	ipmr: Move unregister_netdevice_many() out of mroute_clean_tables(). This is a prep commit to convert ipmr_net_exit_batch() to ->exit_rtnl(). Let's move unregister_netdevice_many() in mroute_clean_tables() to its callers. As a bonus, mrtsock_destruct() can do batching for all tables. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-8-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	2c698bab29	ipmr: Convert ipmr_rtm_dumproute() to RCU. ipmr_rtm_dumproute() calls mr_table_dump() or mr_rtm_dumproute(), and mr_rtm_dumproute() finally calls mr_table_dump(). mr_table_dump() calls the passed function, _ipmr_fill_mroute(). _ipmr_fill_mroute() is a wrapper of ipmr_fill_mroute() to cast struct mr_mfc * to struct mfc_cache *. ipmr_fill_mroute() can be already called safely under RCU. Let's convert ipmr_rtm_dumproute() to RCU. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-7-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	295a17b3ea	ipmr: Convert ipmr_rtm_getroute() to RCU. ipmr_rtm_getroute() calls __ipmr_get_table(), ipmr_cache_find(), and ipmr_fill_mroute(). The table is not removed until netns dismantle, and net->ipv4.mr_tables is managed with RCU list API, so __ipmr_get_table() is safe under RCU. struct mfc_cache is freed by mr_cache_put() after RCU grace period, so we can use ipmr_cache_find() under RCU. rcu_read_lock() around it was just to avoid lockdep splat for rhl_for_each_entry_rcu(). ipmr_fill_mroute() calls mr_fill_mroute(), which properly uses RCU. Let's drop RTNL for ipmr_rtm_getroute() and use RCU instead. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-6-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:40 -08:00
Kuniyuki Iwashima	2bd6c9d600	ipmr: Use MAXVIFS in mroute_msgsize(). mroute_msgsize() calculates skb size needed for ipmr_fill_mroute(). The size differs based on mrt->maxvif. We will drop RTNL for ipmr_rtm_getroute() and mrt->maxvif may change under RCU. To avoid -EMSGSIZE, let's calculate the size with the maximum value of mrt->maxvif, MAXVIFS. struct rtnexthop is 8 bytes and MAXVIFS is 32, so the maximum delta is 256 bytes, which is small enough. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-5-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Kuniyuki Iwashima	402a8111d7	ipmr: Convert ipmr_rtm_dumplink() to RCU. net->ipv4.mr_tables is updated under RTNL and can be read safely under RCU. Once created, the multicast route tables are not removed until netns dismantle. ipmr_rtm_dumplink() does not need RTNL protection for ipmr_for_each_table() and ipmr_fill_table() if RCU is held. Even if mrt->maxvif changes concurrently, ipmr_fill_vif() returns true to continue dumping the next table. Let's convert it to RCU. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-4-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Kuniyuki Iwashima	261950e039	ipmr: Annotate access to mrt->mroute_do_{pim,assert,wrvifwhole}. These fields in struct mr_table are updated in ip_mroute_setsockopt() under RTNL: * mroute_do_pim * mroute_do_assert * mroute_do_wrvifwhole However, ip_mroute_getsockopt() does not hold RTNL and read the first two fields locklessly, and ip_mr_forward() reads all the three under RCU. pim_rcv_v1() also reads mroute_do_pim locklessly. Let's use WRITE_ONCE() and READ_ONCE() for them. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-3-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Kuniyuki Iwashima	05068eaa67	selftest: net: Add basic functionality tests for ipmr. The new test exercise paths, where RTNL is needed, to catch lockdep splat: setsockopt MRT_INIT / MRT_DONE MRT_ADD_VIF / MRT_DEL_VIF MRT_ADD_MFC / MRT_DEL_MFC / MRT_ADD_MFC_PROXY / MRT_DEL_MFC_PROXY MRT_TABLE MRT_FLUSH rtnetlink RTM_NEWROUTE RTM_DELROUTE NETDEV_UNREGISTER I will extend this to cover IPv6 setsockopt() later. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:49:39 -08:00
Eric Dumazet	a0e8c9a506	mpls: remove test against ipv6_stub ipv6_stub is never NULL, let's remove this test. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260228175715.1195536-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:48:43 -08:00
Jakub Kicinski	c34604572e	Merge branch 'net-sparx5-clean-up-probe-remove-init-and-deinit-paths' Daniel Machon says: ==================== net: sparx5: clean up probe/remove init and deinit paths This series refactors the sparx5 init and deinit code out of sparx5_start() and into probe(), adding proper per-subsystem cleanup labels and deinit functions. Currently, the sparx5 driver initializes most subsystems inside sparx5_start(), which is called from probe(). This includes registering netdevs, starting worker threads for stats and MAC table polling, requesting PTP IRQs, and initializing VCAP. The function has grown to handle many unrelated subsystems, and has no granular error handling — it either succeeds entirely or returns an error, leaving cleanup to a single catch-all label in probe(). The remove() path has a similar problem: teardown is not structured as the reverse of initialization, and several subsystems lack proper deinit functions. For example, the stats workqueue has no corresponding cleanup, and the mact workqueue is destroyed without first cancelling its delayed work. Refactor this by moving each init function out of sparx5_start() and into probe(), with a corresponding goto-based cleanup label. Add deinit functions for subsystems that allocate resources, to properly cancel work and destroy workqueues. Ensure that cleanup order in both error paths and remove() follows the reverse of initialization order. sparx5_start() is eliminated entirely — its hardware register setup is renamed to sparx5_forwarding_init() and its FDMA/XTR setup is extracted to sparx5_frame_io_init(). 
Before this series, most init functions live inside sparx5_start() with no individual cleanup:

  probe():
    sparx5_start():                      <- no granular error handling
      sparx5_mact_init()
      sparx_stats_init()                 <- starts worker, no cleanup
      mact_queue setup                   <- no cancel on teardown
      sparx5_register_netdevs()
      sparx5_register_notifier_blocks()
      sparx5_vcap_init()
      sparx5_ptp_init()

  probe() error path:
    cleanup_ports:
      sparx5_cleanup_ports()
      destroy_workqueue(mact_queue)

After this series, probe() initializes subsystems in order with matching cleanup labels, and remove() tears down in reverse:

  probe():
    sparx5_pgid_init()
    sparx5_vlan_init()
    sparx5_board_init()
    sparx5_forwarding_init()
    sparx5_calendar_init()               -> cleanup_ports
    sparx5_qos_init()                    -> cleanup_ports
    sparx5_vcap_init()                   -> cleanup_ports
    sparx5_mact_init()                   -> cleanup_vcap
    sparx5_stats_init()                  -> cleanup_mact
    sparx5_frame_io_init()               -> cleanup_stats
    sparx5_ptp_init()                    -> cleanup_frame_io
    sparx5_register_netdevs()            -> cleanup_ptp
    sparx5_register_notifier_blocks()    -> cleanup_netdevs

  remove():
    sparx5_unregister_notifier_blocks()
    sparx5_unregister_netdevs()
    sparx5_ptp_deinit()
    sparx5_frame_io_deinit()
    sparx5_stats_deinit()
    sparx5_mact_deinit()
    sparx5_vcap_deinit()
    sparx5_destroy_netdevs()

====================
Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-0-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:24 -08:00
Daniel Machon	1e540c4d8f	net: sparx5: replace sparx5_start() with sparx5_forwarding_init() With all subsystem initializations moved out, sparx5_start() only sets up forwarding (UPSIDs, CPU ports, masks, PGIDs, FCS, watermarks). Rename it to sparx5_forwarding_init() and make it void since it cannot fail. This removes sparx5_start() entirely. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-9-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:22 -08:00
Daniel Machon	8b1e4a6747	net: sparx5: move FDMA/XTR initialization out of sparx5_start() Move the Frame DMA and register-based extraction initialization out of sparx5_start() and into a new sparx5_frame_io_init() function, called from probe(). Also, add sparx5_frame_io_deinit() for the cleanup path. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-8-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:22 -08:00
Daniel Machon	0432c60112	net: sparx5: move PTP IRQ handling out of sparx5_start() Move the PTP IRQ request into sparx5_ptp_init() so all PTP setup is done in one place. Also move the sparx5_ptp_init() call to right before sparx5_register_netdevs() and add a cleanup_ptp label. Update remove() to disable the PTP IRQ and reorder ptp_deinit accordingly. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20260227-sparx5-init-deinit-v2-7-10ba54ccf005@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-02 18:46:22 -08:00
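
The sparx5 series above restructures probe() around goto-based cleanup labels, with remove() tearing down in the reverse of initialization order. The following is a minimal userspace sketch of that general pattern, not the driver's code: every name in it (demo_probe(), demo_remove(), demo_reset(), step_init(), the STEP_* constants, the trace buffer) is hypothetical and exists only for illustration.

```c
#include <string.h>

/* Hypothetical stand-ins for driver subsystems; not the sparx5 API. */
enum { STEP_VCAP, STEP_MACT, STEP_STATS };

static char trace[128];   /* records the order of init/deinit calls */
static int fail_at = -1;  /* step that should fail, or -1 for none */

static void log_step(const char *name)
{
	strcat(trace, name);
	strcat(trace, " ");
}

/* Clear the trace and choose which step (if any) fails next. */
static void demo_reset(int fail)
{
	trace[0] = '\0';
	fail_at = fail;
}

/* Pretend to initialize one subsystem; fail if it was selected. */
static int step_init(int step, const char *name)
{
	if (step == fail_at)
		return -1;
	log_step(name);
	return 0;
}

/* Each init step jumps to the label that undoes everything before it. */
static int demo_probe(void)
{
	int err;

	err = step_init(STEP_VCAP, "vcap_init");
	if (err)
		return err;	/* nothing to undo yet */

	err = step_init(STEP_MACT, "mact_init");
	if (err)
		goto cleanup_vcap;

	err = step_init(STEP_STATS, "stats_init");
	if (err)
		goto cleanup_mact;

	return 0;

	/* Labels fall through, so teardown runs in reverse init order. */
cleanup_mact:
	log_step("mact_deinit");
cleanup_vcap:
	log_step("vcap_deinit");
	return err;
}

/* Full teardown mirrors the error path: reverse of initialization. */
static void demo_remove(void)
{
	log_step("stats_deinit");
	log_step("mact_deinit");
	log_step("vcap_deinit");
}
```

Calling demo_reset(STEP_STATS) and then demo_probe() makes the third step fail, so only the two subsystems that were already set up get torn down, newest first. Because the labels fall through, adding a subsystem means adding one init call and one label, which is the property the series description relies on when it pairs each init function with a cleanup label.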

1427213 Commits