linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-17 01:23:58 -04:00

Author	SHA1	Message	Date
Greg Kroah-Hartman	38a01c9700	can: ems_usb: ems_usb_read_bulk_callback(): check the proper length of a message When looking at the data in a USB urb, the actual_length is the size of the buffer passed to the driver, not the transfer_buffer_length which is set by the driver as the max size of the buffer. When parsing the messages in ems_usb_read_bulk_callback() properly check the size both at the beginning of parsing the message to make sure it is big enough for the expected structure, and at the end of the message to make sure we don't overflow past the end of the buffer for the next message. Cc: Vincent Mailhol <mailhol@kernel.org> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: stable@kernel.org Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/2026022316-answering-strainer-a5db@gregkh Fixes: `702171adee` ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-03-02 11:03:42 +01:00
Ziyi Guo	968b098220	can: esd_usb: add endpoint type validation esd_usb_probe() constructs bulk pipes for two endpoints without verifying their transfer types: - usb_rcvbulkpipe(dev->udev, 1) for RX (version reply, async RX data) - usb_sndbulkpipe(dev->udev, 2) for TX (version query, CAN frames) A malformed USB device can present these endpoints with transfer types that differ from what the driver assumes, triggering the WARNING in usb_submit_urb(). Use usb_find_common_endpoints() to discover and validate the first bulk IN and bulk OUT endpoints at probe time, before any allocation. Found pipes are saved to struct esd_usb and code uses them directly instead of making pipes in place. Similar to - commit `136bed0bfd` ("can: mcba_usb: properly check endpoint type") which established the usb_find_common_endpoints() + stored pipes pattern for CAN USB drivers. Fixes: `96d8e90382` ("can: Add driver for esd CAN-USB/2 device") Suggested-by: Vincent Mailhol <mailhol@kernel.org> Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Reviewed-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20260213203927.599163-1-n7l8m4@u.northwestern.edu Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-03-02 11:03:32 +01:00
Alban Bedel	ab3f894de2	can: mcp251x: fix deadlock in error path of mcp251x_open The mcp251x_open() function call free_irq() in its error path with the mpc_lock mutex held. But if an interrupt already occurred the interrupt handler will be waiting for the mpc_lock and free_irq() will deadlock waiting for the handler to finish. This issue is similar to the one fixed in commit `7dd9c26bd6` ("can: mcp251x: fix deadlock if an interrupt occurs during mcp251x_open") but for the error path. To solve this issue move the call to free_irq() after the lock is released. Setting `priv->force_quit = 1` beforehand ensure that the IRQ handler will exit right away once it acquired the lock. Signed-off-by: Alban Bedel <alban.bedel@lht.dlh.de> Link: https://patch.msgid.link/20260209144706.2261954-1-alban.bedel@lht.dlh.de Fixes: `bf66f3736a` ("can: mcp251x: Move to threaded interrupts instead of workqueues.") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-03-02 10:24:41 +01:00
Oliver Hartkopp	c77bfbdd6a	can: dummy_can: dummy_can_init(): fix packet statistics The former implementation was only counting the tx_packets value but not the tx_bytes as the skb was dropped on driver layer. Enable CAN echo support (IFF_ECHO) in dummy_can_init(), which activates the code for setting and retrieving the echo SKB and counts the tx_bytes correctly. Fixes: `816cf430e8` ("can: add dummy_can driver") Cc: Vincent Mailhol <mailhol@kernel.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Reviewed-by: Vincent Mailhol <mailhol@kernel.org> Link: https://patch.msgid.link/20260126104540.21024-1-socketcan@hartkopp.net [mkl: make commit message imperative] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-03-02 10:24:41 +01:00
Raju Rangoju	9439a661c2	amd-xgbe: fix MAC_TCR_SS register width for 2.5G and 10M speeds Extend the MAC_TCR_SS (Speed Select) register field width from 2 bits to 3 bits to properly support all speed settings. The MAC_TCR register's SS field encoding requires 3 bits to represent all supported speeds: - 0x00: 10Gbps (XGMII) - 0x02: 2.5Gbps (GMII) / 100Mbps - 0x03: 1Gbps / 10Mbps - 0x06: 2.5Gbps (XGMII) - P100a only With only 2 bits, values 0x04-0x07 cannot be represented, which breaks 2.5G XGMII mode on newer platforms and causes incorrect speed select values to be programmed. Fixes: `07445f3c7c` ("amd-xgbe: Add support for 10 Mbps speed") Co-developed-by: Guruvendra Punugupati <Guruvendra.Punugupati@amd.com> Signed-off-by: Guruvendra Punugupati <Guruvendra.Punugupati@amd.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20260226170753.250312-1-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 14:22:34 -08:00
MD Danish Anwar	147792c395	net: ti: icssg-prueth: Fix ping failure after offload mode setup when link speed is not 1G When both eth interfaces with links up are added to a bridge or hsr interface, ping fails if the link speed is not 1Gbps (e.g., 100Mbps). The issue is seen because when switching to offload (bridge/hsr) mode, prueth_emac_restart() restarts the firmware and clears DRAM with memset_io(), setting all memory to 0. This includes PORT_LINK_SPEED_OFFSET which firmware reads for link speed. The value 0 corresponds to FW_LINK_SPEED_1G (0x00), so for 1Gbps links the default value is correct and ping works. For 100Mbps links, the firmware needs FW_LINK_SPEED_100M (0x01) but gets 0 instead, causing ping to fail. The function emac_adjust_link() is called to reconfigure, but it detects no state change (emac->link is still 1, speed/duplex match PHY) so new_state remains false and icssg_config_set_speed() is never called to correct the firmware speed value. The fix resets emac->link to 0 before calling emac_adjust_link() in prueth_emac_common_start(). This forces new_state=true, ensuring icssg_config_set_speed() is called to write the correct speed value to firmware memory. Fixes: `06feac1540` ("net: ti: icssg-prueth: Fix emac link speed handling") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20260226102356.2141871-1-danishanwar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 13:41:35 -08:00
Guenter Roeck	74badb9c20	dpaa2-switch: Fix interrupt storm after receiving bad if_id in IRQ handler Commit `31a7a0bbeb` ("dpaa2-switch: add bounds check for if_id in IRQ handler") introduces a range check for if_id to avoid an out-of-bounds access. If an out-of-bounds if_id is detected, the interrupt status is not cleared. This may result in an interrupt storm. Clear the interrupt status after detecting an out-of-bounds if_id to avoid the problem. Found by an experimental AI code review agent at Google. Fixes: `31a7a0bbeb` ("dpaa2-switch: add bounds check for if_id in IRQ handler") Cc: Junrui Luo <moonafterrain@outlook.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20260227055812.1777915-1-linux@roeck-us.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 09:01:41 -08:00
Jakub Kicinski	6df0022b6c	Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2026-02-19 (idpf, ice, i40e, ixgbevf, e1000e) For idpf: Li Li moves the check for software marker to occur after incrementing next to clean to avoid re-encountering the same packet. He also adds a couple of checks to prevent NULL pointer dereferences and NULLs rss_key, after free, in error path so that later checks are properly evaluated. Brian Vazquez adjusts IRQ naming to have correlation with netdev naming. Sreedevi removes validation of action type as part of ntuple rule deletion. For ice: Aaron Ma breaks RDMA initialization into two steps and adjusts calls so that VSIs are entirely configured before plugging. Michal Schmidt fixes initialization of loopback VSI to have proper resources allocated to allow for loopback testing to occur. For i40e: Thomas Gleixner fixes a leak of preempt count by replacing get_cpu() with smp_processor_id(). For ixgbevf: Jedrzej adds a check for mailbox version before attempting to call an associated link state call that is supported in that mailbox version. For e1000e: Vitaly clears power gating feature for Panther Lake systems to avoid packet issues. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: clear DPG_EN after reset to avoid autonomous power-gating e1000e: introduce new board type for Panther Lake PCH ixgbevf: fix link setup issue i40e: Fix preempt count leak in napi poll tracepoint ice: fix crash in ethtool offline loopback test ice: recap the VSI and QoS info after rebuild idpf: Fix flow rule delete failure due to invalid validation idpf: change IRQ naming to match netdev and ethtool queue numbering idpf: nullify pointers after they are freed idpf: skip deallocating txq group's txqs if it is NULL idpf: skip deallocating bufq_sets from rx_qgrp if it is NULL idpf: increment completion queue next_to_clean in sw marker wait routine ==================== Link: https://patch.msgid.link/20260225211546.1949260-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-28 08:43:56 -08:00
Long Li	dabffd0854	net: mana: Ring doorbell at 4 CQ wraparounds MANA hardware requires at least one doorbell ring every 8 wraparounds of the CQ. The driver rings the doorbell as a form of flow control to inform hardware that CQEs have been consumed. The NAPI poll functions mana_poll_tx_cq() and mana_poll_rx_cq() can poll up to CQE_POLLING_BUFFER (512) completions per call. If the CQ has fewer than 512 entries, a single poll call can process more than 4 wraparounds without ringing the doorbell. The doorbell threshold check also uses ">" instead of ">=", delaying the ring by one extra CQE beyond 4 wraparounds. Combined, these issues can cause the driver to exceed the 8-wraparound hardware limit, leading to missed completions and stalled queues. Fix this by capping the number of CQEs polled per call to 4 wraparounds of the CQ in both TX and RX paths. Also change the doorbell threshold from ">" to ">=" so the doorbell is rung as soon as 4 wraparounds are reached. Cc: stable@vger.kernel.org Fixes: `58a63729c9` ("net: mana: Fix doorbell out of order violation and avoid unnecessary doorbell rings") Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20260226192833.1050807-1-longli@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-27 19:29:38 -08:00
Valentin Spreckels	15fba71533	net: usb: r8152: add TRENDnet TUC-ET2G The TRENDnet TUC-ET2G is a RTL8156 based usb ethernet adapter. Add its vendor and product IDs. Signed-off-by: Valentin Spreckels <valentin@spreckels.dev> Link: https://patch.msgid.link/20260226195409.7891-2-valentin@spreckels.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-27 19:27:31 -08:00
Chintan Vankar	be11a53722	net: ethernet: ti: am65-cpsw-nuss/cpsw-ale: Fix multicast entry handling in ALE table In the current implementation, flushing multicast entries in MAC mode incorrectly deletes entries for all ports instead of only the target port, disrupting multicast traffic on other ports. The cause is adding multicast entries by setting only host port bit, and not setting the MAC port bits. Fix this by setting the MAC port's bit in the port mask while adding the multicast entry. Also fix the flush logic to preserve the host port bit during removal of MAC port and free ALE entries when mask contains only host port. Fixes: `5c50a856d5` ("drivers: net: ethernet: cpsw: add multicast address to ALE table") Signed-off-by: Chintan Vankar <c-vankar@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260224181359.2055322-1-c-vankar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-26 19:43:54 -08:00
Linus Torvalds	b9c8fc2cae	Merge tag 'net-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from IPsec, Bluetooth and netfilter Current release - regressions: - wifi: fix dev_alloc_name() return value check - rds: fix recursive lock in rds_tcp_conn_slots_available Current release - new code bugs: - vsock: lock down child_ns_mode as write-once Previous releases - regressions: - core: - do not pass flow_id to set_rps_cpu() - consume xmit errors of GSO frames - netconsole: avoid OOB reads, msg is not nul-terminated - netfilter: h323: fix OOB read in decode_choice() - tcp: re-enable acceptance of FIN packets when RWIN is 0 - udplite: fix null-ptr-deref in __udp_enqueue_schedule_skb(). - wifi: brcmfmac: fix potential kernel oops when probe fails - phy: register phy led_triggers during probe to avoid AB-BA deadlock - eth: - bnxt_en: fix deleting of Ntuple filters - wan: farsync: fix use-after-free bugs caused by unfinished tasklets - xscale: check for PTP support properly Previous releases - always broken: - tcp: fix potential race in tcp_v6_syn_recv_sock() - kcm: fix zero-frag skb in frag_list on partial sendmsg error - xfrm: - fix race condition in espintcp_close() - always flush state and policy upon NETDEV_UNREGISTER event - bluetooth: - purge error queues in socket destructors - fix response to L2CAP_ECRED_CONN_REQ - eth: - mlx5: - fix circular locking dependency in dump - fix "scheduling while atomic" in IPsec MAC address query - gve: fix incorrect buffer cleanup for QPL - team: avoid NETDEV_CHANGEMTU event when unregistering slave - usb: validate USB endpoints" * tag 'net-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) netfilter: nf_conntrack_h323: fix OOB read in decode_choice() dpaa2-switch: validate num_ifs to prevent out-of-bounds write net: consume xmit errors of GSO frames vsock: document write-once behavior of the child_ns_mode sysctl vsock: lock down child_ns_mode as write-once selftests/vsock: change tests to respect write-once child ns mode net/mlx5e: Fix "scheduling while atomic" in IPsec MAC address query net/mlx5: Fix missing devlink lock in SRIOV enable error path net/mlx5: E-switch, Clear legacy flag when moving to switchdev net/mlx5: LAG, disable MPESW in lag_disable_change() net/mlx5: DR, Fix circular locking dependency in dump selftests: team: Add a reference count leak test team: avoid NETDEV_CHANGEMTU event when unregistering slave net: mana: Fix double destroy_workqueue on service rescan PCI path MAINTAINERS: Update maintainer entry for QUALCOMM ETHQOS ETHERNET DRIVER dpll: zl3073x: Remove redundant cleanup in devm_dpll_init() selftests/net: packetdrill: Verify acceptance of FIN packets when RWIN is 0 tcp: re-enable acceptance of FIN packets when RWIN is 0 vsock: Use container_of() to get net namespace in sysctl handlers net: usb: kaweth: validate USB endpoints ...	2026-02-26 08:00:13 -08:00
Junrui Luo	8a5752c6dc	dpaa2-switch: validate num_ifs to prevent out-of-bounds write The driver obtains sw_attr.num_ifs from firmware via dpsw_get_attributes() but never validates it against DPSW_MAX_IF (64). This value controls iteration in dpaa2_switch_fdb_get_flood_cfg(), which writes port indices into the fixed-size cfg->if_id[DPSW_MAX_IF] array. When firmware reports num_ifs >= 64, the loop can write past the array bounds. Add a bound check for num_ifs in dpaa2_switch_init(). dpaa2_switch_fdb_get_flood_cfg() appends the control interface (port num_ifs) after all matched ports. When num_ifs == DPSW_MAX_IF and all ports match the flood filter, the loop fills all 64 slots and the control interface write overflows by one entry. The check uses >= because num_ifs == DPSW_MAX_IF is also functionally broken. build_if_id_bitmap() silently drops any ID >= 64: if (id[i] < DPSW_MAX_IF) bmap[id[i] / 64] \|= ... Fixes: `539dda3c5d` ("staging: dpaa2-switch: properly setup switching domains") Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/SYBPR01MB78812B47B7F0470B617C408AAF74A@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-26 12:37:21 +01:00
Jianbo Liu	859380694f	net/mlx5e: Fix "scheduling while atomic" in IPsec MAC address query Fix a "scheduling while atomic" bug in mlx5e_ipsec_init_macs() by replacing mlx5_query_mac_address() with ether_addr_copy() to get the local MAC address directly from netdev->dev_addr. The issue occurs because mlx5_query_mac_address() queries the hardware which involves mlx5_cmd_exec() that can sleep, but it is called from the mlx5e_ipsec_handle_event workqueue which runs in atomic context. The MAC address is already available in netdev->dev_addr, so no need to query hardware. This avoids the sleeping call and resolves the bug. Call trace: BUG: scheduling while atomic: kworker/u112:2/69344/0x00000200 __schedule+0x7ab/0xa20 schedule+0x1c/0xb0 schedule_timeout+0x6e/0xf0 __wait_for_common+0x91/0x1b0 cmd_exec+0xa85/0xff0 [mlx5_core] mlx5_cmd_exec+0x1f/0x50 [mlx5_core] mlx5_query_nic_vport_mac_address+0x7b/0xd0 [mlx5_core] mlx5_query_mac_address+0x19/0x30 [mlx5_core] mlx5e_ipsec_init_macs+0xc1/0x720 [mlx5_core] mlx5e_ipsec_build_accel_xfrm_attrs+0x422/0x670 [mlx5_core] mlx5e_ipsec_handle_event+0x2b9/0x460 [mlx5_core] process_one_work+0x178/0x2e0 worker_thread+0x2ea/0x430 Fixes: `cee137a634` ("net/mlx5e: Handle ESN update events") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260224114652.1787431-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 20:01:44 -08:00
Shay Drory	60253042c0	net/mlx5: Fix missing devlink lock in SRIOV enable error path The cited commit miss to add locking in the error path of mlx5_sriov_enable(). When pci_enable_sriov() fails, mlx5_device_disable_sriov() is called to clean up. This cleanup function now expects to be called with the devlink instance lock held. Add the missing devl_lock(devlink) and devl_unlock(devlink) Fixes: `84a433a40d` ("net/mlx5: Lock mlx5 devlink reload callbacks") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260224114652.1787431-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 20:01:44 -08:00
Shay Drory	d7073e8b97	net/mlx5: E-switch, Clear legacy flag when moving to switchdev The cited commit introduced MLX5_PRIV_FLAGS_SWITCH_LEGACY to identify when a transition to legacy mode is requested via devlink. However, the logic failed to clear this flag if the mode was subsequently changed back to MLX5_ESWITCH_OFFLOADS (switchdev). Consequently, if a user toggled from legacy to switchdev, the flag remained set, leaving the driver with wrong state indicating Fix this by explicitly clearing the MLX5_PRIV_FLAGS_SWITCH_LEGACY bit when the requested mode is MLX5_ESWITCH_OFFLOADS. Fixes: `2a4f56fbcc` ("net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260224114652.1787431-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 20:01:44 -08:00
Shay Drory	bd7b9f83fb	net/mlx5: LAG, disable MPESW in lag_disable_change() mlx5_lag_disable_change() unconditionally called mlx5_disable_lag() when LAG was active, which is incorrect for MLX5_LAG_MODE_MPESW. Hnece, call mlx5_disable_mpesw() when running in MPESW mode. Fixes: `a32327a3a0` ("net/mlx5: Lag, Control MultiPort E-Switch single FDB mode") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260224114652.1787431-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 20:01:44 -08:00
Shay Drory	2700b7e603	net/mlx5: DR, Fix circular locking dependency in dump Fix a circular locking dependency between dbg_mutex and the domain rx/tx mutexes that could lead to a deadlock. The dump path in dr_dump_domain_all() was acquiring locks in the order: dbg_mutex -> rx.mutex -> tx.mutex While the table/matcher creation paths acquire locks in the order: rx.mutex -> tx.mutex -> dbg_mutex This inverted lock ordering creates a circular dependency. Fix this by changing dr_dump_domain_all() to acquire the domain lock before dbg_mutex, matching the order used in mlx5dr_table_create() and mlx5dr_matcher_create(). Lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 6.19.0-rc6net_next_e817c4e #1 Not tainted ------------------------------------------------------ sos/30721 is trying to acquire lock: ffff888102df5900 (&dmn->info.rx.mutex){+.+.}-{4:4}, at: dr_dump_start+0x131/0x450 [mlx5_core] but task is already holding lock: ffff888102df5bc0 (&dmn->dump_info.dbg_mutex){+.+.}-{4:4}, at: dr_dump_start+0x10b/0x450 [mlx5_core] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&dmn->dump_info.dbg_mutex){+.+.}-{4:4}: __mutex_lock+0x91/0x1060 mlx5dr_matcher_create+0x377/0x5e0 [mlx5_core] mlx5_cmd_dr_create_flow_group+0x62/0xd0 [mlx5_core] mlx5_create_flow_group+0x113/0x1c0 [mlx5_core] mlx5_chains_create_prio+0x453/0x2290 [mlx5_core] mlx5_chains_get_table+0x2e2/0x980 [mlx5_core] esw_chains_create+0x1e6/0x3b0 [mlx5_core] esw_create_offloads_fdb_tables.cold+0x62/0x63f [mlx5_core] esw_offloads_enable+0x76f/0xd20 [mlx5_core] mlx5_eswitch_enable_locked+0x35a/0x500 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x561/0x950 [mlx5_core] devlink_nl_eswitch_set_doit+0x67/0xe0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x188/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x1ed/0x2c0 netlink_sendmsg+0x210/0x450 __sock_sendmsg+0x38/0x60 __sys_sendto+0x119/0x180 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x70/0xd00 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #1 (&dmn->info.tx.mutex){+.+.}-{4:4}: __mutex_lock+0x91/0x1060 mlx5dr_table_create+0x11d/0x530 [mlx5_core] mlx5_cmd_dr_create_flow_table+0x62/0x140 [mlx5_core] __mlx5_create_flow_table+0x46f/0x960 [mlx5_core] mlx5_create_flow_table+0x16/0x20 [mlx5_core] esw_create_offloads_fdb_tables+0x136/0x240 [mlx5_core] esw_offloads_enable+0x76f/0xd20 [mlx5_core] mlx5_eswitch_enable_locked+0x35a/0x500 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x561/0x950 [mlx5_core] devlink_nl_eswitch_set_doit+0x67/0xe0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x188/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x1ed/0x2c0 netlink_sendmsg+0x210/0x450 __sock_sendmsg+0x38/0x60 __sys_sendto+0x119/0x180 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x70/0xd00 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #0 (&dmn->info.rx.mutex){+.+.}-{4:4}: __lock_acquire+0x18b6/0x2eb0 lock_acquire+0xd3/0x2c0 __mutex_lock+0x91/0x1060 dr_dump_start+0x131/0x450 [mlx5_core] seq_read_iter+0xe3/0x410 seq_read+0xfb/0x130 full_proxy_read+0x53/0x80 vfs_read+0xba/0x330 ksys_read+0x65/0xe0 do_syscall_64+0x70/0xd00 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&dmn->dump_info.dbg_mutex); lock(&dmn->info.tx.mutex); lock(&dmn->dump_info.dbg_mutex); lock(&dmn->info.rx.mutex); * DEADLOCK * Fixes: `9222f0b27d` ("net/mlx5: DR, Add support for dumping steering info") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260224114652.1787431-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 20:01:43 -08:00
Jakub Kicinski	6668c6f2dd	Merge tag 'wireless-2026-02-25' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== A good number of fixes: - cfg80211: - cancel rfkill work appropriately - fix radiotap parsing to correctly reject field 18 - fix wext (yes...) off-by-one for IGTK key ID - mac80211: - fix for mesh NULL pointer dereference - fix for stack out-of-bounds (2 bytes) write on specific multi-link action frames - set default WMM parameters for all links - mwifiex: check dev_alloc_name() return value correctly - libertas: fix potential timer use-after-free - brcmfmac: fix crash on probe failure * tag 'wireless-2026-02-25' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mac80211: fix NULL pointer dereference in mesh_rx_csa_frame() wifi: mac80211: bounds-check link_id in ieee80211_ml_reconfiguration wifi: mac80211: set default WMM parameters on all links wifi: libertas: fix use-after-free in lbs_free_adapter() wifi: mwifiex: Fix dev_alloc_name() return value check wifi: brcmfmac: Fix potential kernel oops when probe fails wifi: radiotap: reject radiotap with unknown bits wifi: cfg80211: cancel rfkill_block work in wiphy_unregister() wifi: cfg80211: wext: fix IGTK key ID off-by-one ==================== Link: https://patch.msgid.link/20260225113159.360574-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 19:54:28 -08:00
Tetsuo Handa	bb4c698633	team: avoid NETDEV_CHANGEMTU event when unregistering slave syzbot is reporting unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 3 ref_tracker: netdev@ffff88807dcf8618 has 1/2 users at __netdev_tracker_alloc include/linux/netdevice.h:4400 [inline] netdev_hold include/linux/netdevice.h:4429 [inline] inetdev_init+0x201/0x4e0 net/ipv4/devinet.c:286 inetdev_event+0x251/0x1610 net/ipv4/devinet.c:1600 notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85 call_netdevice_notifiers_mtu net/core/dev.c:2318 [inline] netif_set_mtu_ext+0x5aa/0x800 net/core/dev.c:9886 netif_set_mtu+0xd7/0x1b0 net/core/dev.c:9907 dev_set_mtu+0x126/0x260 net/core/dev_api.c:248 team_port_del+0xb07/0xcb0 drivers/net/team/team_core.c:1333 team_del_slave drivers/net/team/team_core.c:1936 [inline] team_device_event+0x207/0x5b0 drivers/net/team/team_core.c:2929 notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85 call_netdevice_notifiers_extack net/core/dev.c:2281 [inline] call_netdevice_notifiers net/core/dev.c:2295 [inline] __dev_change_net_namespace+0xcb7/0x2050 net/core/dev.c:12592 do_setlink+0x2ce/0x4590 net/core/rtnetlink.c:3060 rtnl_changelink net/core/rtnetlink.c:3776 [inline] __rtnl_newlink net/core/rtnetlink.c:3935 [inline] rtnl_newlink+0x15a9/0x1be0 net/core/rtnetlink.c:4072 rtnetlink_rcv_msg+0x7d5/0xbe0 net/core/rtnetlink.c:6958 netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2550 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x80f/0x9b0 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1894 problem. Ido Schimmel found steps to reproduce ip link add name team1 type team ip link add name dummy1 mtu 1499 master team1 type dummy ip netns add ns1 ip link set dev dummy1 netns ns1 ip -n ns1 link del dev dummy1 and also found that the same issue was fixed in the bond driver in commit `f51048c3e0` ("bonding: avoid NETDEV_CHANGEMTU event when unregistering slave"). Let's do similar thing for the team driver, with commit `ad7c7b2172` ("net: hold netdev instance lock during sysfs operations") and commit `303a8487a6` ("net: s/__dev_set_mtu/__netif_set_mtu/") also applied. Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Suggested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Fixes: `3d249d4ca7` ("net: introduce ethernet teaming device") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260224125709.317574-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 19:17:05 -08:00
Dipayaan Roy	f975a09552	net: mana: Fix double destroy_workqueue on service rescan PCI path While testing corner cases in the driver, a use-after-free crash was found on the service rescan PCI path. When mana_serv_reset() calls mana_gd_suspend(), mana_gd_cleanup() destroys gc->service_wq. If the subsequent mana_gd_resume() fails with -ETIMEDOUT or -EPROTO, the code falls through to mana_serv_rescan() which triggers pci_stop_and_remove_bus_device(). This invokes the PCI .remove callback (mana_gd_remove), which calls mana_gd_cleanup() a second time, attempting to destroy the already- freed workqueue. Fix this by NULL-checking gc->service_wq in mana_gd_cleanup() and setting it to NULL after destruction. Call stack of issue for reference: [Sat Feb 21 18:53:48 2026] Call Trace: [Sat Feb 21 18:53:48 2026] <TASK> [Sat Feb 21 18:53:48 2026] mana_gd_cleanup+0x33/0x70 [mana] [Sat Feb 21 18:53:48 2026] mana_gd_remove+0x3a/0xc0 [mana] [Sat Feb 21 18:53:48 2026] pci_device_remove+0x41/0xb0 [Sat Feb 21 18:53:48 2026] device_remove+0x46/0x70 [Sat Feb 21 18:53:48 2026] device_release_driver_internal+0x1e3/0x250 [Sat Feb 21 18:53:48 2026] device_release_driver+0x12/0x20 [Sat Feb 21 18:53:48 2026] pci_stop_bus_device+0x6a/0x90 [Sat Feb 21 18:53:48 2026] pci_stop_and_remove_bus_device+0x13/0x30 [Sat Feb 21 18:53:48 2026] mana_do_service+0x180/0x290 [mana] [Sat Feb 21 18:53:48 2026] mana_serv_func+0x24/0x50 [mana] [Sat Feb 21 18:53:48 2026] process_one_work+0x190/0x3d0 [Sat Feb 21 18:53:48 2026] worker_thread+0x16e/0x2e0 [Sat Feb 21 18:53:48 2026] kthread+0xf7/0x130 [Sat Feb 21 18:53:48 2026] ? __pfx_worker_thread+0x10/0x10 [Sat Feb 21 18:53:48 2026] ? __pfx_kthread+0x10/0x10 [Sat Feb 21 18:53:48 2026] ret_from_fork+0x269/0x350 [Sat Feb 21 18:53:48 2026] ? __pfx_kthread+0x10/0x10 [Sat Feb 21 18:53:48 2026] ret_from_fork_asm+0x1a/0x30 [Sat Feb 21 18:53:48 2026] </TASK> Fixes: `505cc26bca` ("net: mana: Add support for auxiliary device servicing events") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/aZ2bzL64NagfyHpg@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 19:16:09 -08:00
Greg Kroah-Hartman	4b063c002c	net: usb: kaweth: validate USB endpoints The kaweth driver should validate that the device it is probing has the proper number and types of USB endpoints it is expecting before it binds to it. If a malicious device were to not have the same urbs the driver will crash later on when it blindly accesses these endpoints. Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Link: https://patch.msgid.link/2026022305-substance-virtual-c728@gregkh Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 18:58:05 -08:00
Greg Kroah-Hartman	c58b6c29a4	net: usb: kalmia: validate USB endpoints The kalmia driver should validate that the device it is probing has the proper number and types of USB endpoints it is expecting before it binds to it. If a malicious device were to not have the same urbs the driver will crash later on when it blindly accesses these endpoints. Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: `d40261236e` ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730") Link: https://patch.msgid.link/2026022326-shack-headstone-ef6f@gregkh Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 18:56:43 -08:00
Greg Kroah-Hartman	11de1d3ae5	net: usb: pegasus: validate USB endpoints The pegasus driver should validate that the device it is probing has the proper number and types of USB endpoints it is expecting before it binds to it. If a malicious device were to not have the same urbs the driver will crash later on when it blindly accesses these endpoints. Cc: Petko Manolov <petkan@nucleusys.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/2026022347-legibly-attest-cc5c@gregkh Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-25 18:52:49 -08:00
Vitaly Lifshits	0942fc6d32	e1000e: clear DPG_EN after reset to avoid autonomous power-gating Panther Lake systems introduced an autonomous power gating feature for the integrated Gigabit Ethernet in shutdown state (S5) state. As part of it, the reset value of DPG_EN bit was changed to 1. Clear this bit after performing hardware reset to avoid errors such as Tx/Rx hangs, or packet loss/corruption. Fixes: `0c9183ce61` ("e1000e: Add support for the next LOM generation") Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Vitaly Lifshits	5b644464ee	e1000e: introduce new board type for Panther Lake PCH Add new board type for Panther Lake devices for separating device-specific features and flows. Additionally, remove the deprecated device IDs 0x57B5 and 0x57B6, which are not used by any existing devices. Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Jedrzej Jagielski	feae40a6a1	ixgbevf: fix link setup issue It may happen that VF spawned for E610 adapter has problem with setting link up. This happens when ixgbevf supporting mailbox API 1.6 cooperates with PF driver which doesn't support this version of API, and hence doesn't support new approach for getting PF link data. In that case VF asks PF to provide link data but as PF doesn't support it, returns -EOPNOTSUPP what leads to early bail from link configuration sequence. Avoid such situation by using legacy VFLINKS approach whenever negotiated API version is less than 1.6. To reproduce the issue just create VF and set its link up - adapter must be any from the E610 family, ixgbevf must support API 1.6 or higher while ixgbevf must not. Fixes: `53f0eb62b4` ("ixgbevf: fix getting link speed data for E610 devices") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Thomas Gleixner	4b3d54a85b	i40e: Fix preempt count leak in napi poll tracepoint Using get_cpu() in the tracepoint assignment causes an obvious preempt count leak because nothing invokes put_cpu() to undo it: softirq: huh, entered softirq 3 NET_RX with preempt_count 00000100, exited with 00000101? This clearly has seen a lot of testing in the last 3+ years... Use smp_processor_id() instead. Fixes: `6d4d584a7e` ("i40e: Add i40e_napi_poll tracepoint") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Michal Schmidt	a9c354e656	ice: fix crash in ethtool offline loopback test Since the conversion of ice to page pool, the ethtool loopback test crashes: BUG: kernel NULL pointer dereference, address: 000000000000000c #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 1100f1067 P4D 0 Oops: Oops: 0002 [#1] SMP NOPTI CPU: 23 UID: 0 PID: 5904 Comm: ethtool Kdump: loaded Not tainted 6.19.0-0.rc7.260128g1f97d9dcf5364.49.eln154.x86_64 #1 PREEMPT(lazy) Hardware name: [...] RIP: 0010:ice_alloc_rx_bufs+0x1cd/0x310 [ice] Code: 83 6c 24 30 01 66 41 89 47 08 0f 84 c0 00 00 00 41 0f b7 dc 48 8b 44 24 18 48 c1 e3 04 41 bb 00 10 00 00 48 8d 2c 18 8b 04 24 <89> 45 0c 41 8b 4d 00 49 d3 e3 44 3b 5c 24 24 0f 83 ac fe ff ff 44 RSP: 0018:ff7894738aa1f768 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000700 RDI: 0000000000000000 RBP: 0000000000000000 R08: ff16dcae79880200 R09: 0000000000000019 R10: 0000000000000001 R11: 0000000000001000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ff16dcae6c670000 FS: 00007fcf428850c0(0000) GS:ff16dcb149710000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000000c CR3: 0000000121227005 CR4: 0000000000773ef0 PKRU: 55555554 Call Trace: <TASK> ice_vsi_cfg_rxq+0xca/0x460 [ice] ice_vsi_cfg_rxqs+0x54/0x70 [ice] ice_loopback_test+0xa9/0x520 [ice] ice_self_test+0x1b9/0x280 [ice] ethtool_self_test+0xe5/0x200 __dev_ethtool+0x1106/0x1a90 dev_ethtool+0xbe/0x1a0 dev_ioctl+0x258/0x4c0 sock_do_ioctl+0xe3/0x130 __x64_sys_ioctl+0xb9/0x100 do_syscall_64+0x7c/0x700 entry_SYSCALL_64_after_hwframe+0x76/0x7e [...] It crashes because we have not initialized libeth for the rx ring. Fix it by treating ICE_VSI_LB VSIs slightly more like normal PF VSIs and letting them have a q_vector. It's just a dummy, because the loopback test does not use interrupts, but it contains a napi struct that can be passed to libeth_rx_fq_create() called from ice_vsi_cfg_rxq() -> ice_rxq_pp_create(). Fixes: `93f53db9f9` ("ice: switch to Page Pool") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Aaron Ma	6aa07e23dd	ice: recap the VSI and QoS info after rebuild Fix IRDMA hardware initialization timeout (-110) after resume by separating VSI-dependent configuration from RDMA resource allocation, ensuring VSI is rebuilt before IRDMA accesses it. After resume from suspend, IRDMA hardware initialization fails: ice: IRDMA hardware initialization FAILED init_state=4 status=-110 Separate RDMA initialization into two phases: 1. ice_init_rdma() - Allocate resources only (no VSI/QoS access, no plug) 2. ice_rdma_finalize_setup() - Assign VSI/QoS info and plug device This allows: - ice_init_rdma() to stay in ice_resume() (mirrors ice_deinit_rdma() in ice_suspend()) - VSI assignment deferred until after ice_vsi_rebuild() completes - QoS info updated after ice_dcb_rebuild() completes - Device plugged only when control queues, VSI, and DCB are all ready Fixes: `bc69ad7486` ("ice: avoid IRQ collision to fix init failure on ACPI S3 resume") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Sreedevi Joshi	2c31557336	idpf: Fix flow rule delete failure due to invalid validation When deleting a flow rule using "ethtool -N <dev> delete <location>", idpf_sideband_action_ena() incorrectly validates fsp->ring_cookie even though ethtool doesn't populate this field for delete operations. The uninitialized ring_cookie may randomly match RX_CLS_FLOW_DISC or RX_CLS_FLOW_WAKE, causing validation to fail and preventing legitimate rule deletions. Remove the unnecessary sideband action enable check and ring_cookie validation during delete operations since action validation is not required when removing existing rules. Fixes: `ada3e24b84` ("idpf: add flow steering support") Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Brian Vazquez	1500a8662d	idpf: change IRQ naming to match netdev and ethtool queue numbering The code uses the vidx for the IRQ name but that doesn't match ethtool reporting nor netdev naming, this makes it hard to tune the device and associate queues with IRQs. Sequentially requesting irqs starting from '0' makes the output consistent. This commit changes the interrupt numbering but preserves the name format, maintaining ABI compatibility. Existing tools relying on the old numbering are already non-functional, as they lack a useful correlation to the interrupts. Before: ethtool -L eth1 tx 1 combined 3 grep . /proc/irq//idpf/../smp_affinity_list /proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167 /proc/irq/68/idpf-eth1-TxRx-1/../smp_affinity_list:0 /proc/irq/70/idpf-eth1-TxRx-3/../smp_affinity_list:1 /proc/irq/71/idpf-eth1-TxRx-4/../smp_affinity_list:2 /proc/irq/72/idpf-eth1-Tx-5/../smp_affinity_list:3 ethtool -S eth1 \| grep -v ': 0' NIC statistics: tx_q-0_pkts: 1002 tx_q-1_pkts: 2679 tx_q-2_pkts: 1113 tx_q-3_pkts: 1192 <----- tx_q-3 vs idpf-eth1-Tx-5 rx_q-0_pkts: 1143 rx_q-1_pkts: 3172 rx_q-2_pkts: 1074 After: ethtool -L eth1 tx 1 combined 3 grep . /proc/irq//idpf/../smp_affinity_list /proc/irq/67/idpf-Mailbox-0/../smp_affinity_list:0-55,112-167 /proc/irq/68/idpf-eth1-TxRx-0/../smp_affinity_list:0 /proc/irq/70/idpf-eth1-TxRx-1/../smp_affinity_list:1 /proc/irq/71/idpf-eth1-TxRx-2/../smp_affinity_list:2 /proc/irq/72/idpf-eth1-Tx-3/../smp_affinity_list:3 ethtool -S eth1 \| grep -v ': 0' NIC statistics: tx_q-0_pkts: 118 tx_q-1_pkts: 134 tx_q-2_pkts: 228 tx_q-3_pkts: 138 <--- tx_q-3 matches idpf-eth1-Tx-3 rx_q-0_pkts: 111 rx_q-1_pkts: 366 rx_q-2_pkts: 120 Fixes: `d4d5587182` ("idpf: initialize interrupts and enable vport") Signed-off-by: Brian Vazquez <brianvv@google.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Li Li	bc3b319773	idpf: nullify pointers after they are freed rss_data->rss_key needs to be nullified after it is freed. Checks like "if (!rss_data->rss_key)" in the code could fail if it is not nullified. Tested: built and booted the kernel. Fixes: `83f38f210b` ("idpf: Fix RSS LUT NULL pointer crash on early ethtool operations") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:57 -08:00
Li Li	e403bf4013	idpf: skip deallocating txq group's txqs if it is NULL In idpf_txq_group_alloc(), if any txq group's txqs failed to allocate memory: for (j = 0; j < tx_qgrp->num_txq; j++) { tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]), GFP_KERNEL); if (!tx_qgrp->txqs[j]) goto err_alloc; } It would cause a NULL ptr kernel panic in idpf_txq_group_rel(): for (j = 0; j < txq_grp->num_txq; j++) { if (flow_sch_en) { kfree(txq_grp->txqs[j]->refillq); txq_grp->txqs[j]->refillq = NULL; } kfree(txq_grp->txqs[j]); txq_grp->txqs[j] = NULL; } [ 6.532461] BUG: kernel NULL pointer dereference, address: 0000000000000058 ... [ 6.534433] RIP: 0010:idpf_txq_group_rel+0xc9/0x110 ... [ 6.538513] Call Trace: [ 6.538639] <TASK> [ 6.538760] idpf_vport_queues_alloc+0x75/0x550 [ 6.538978] idpf_vport_open+0x4d/0x3f0 [ 6.539164] idpf_open+0x71/0xb0 [ 6.539324] __dev_open+0x142/0x260 [ 6.539506] netif_open+0x2f/0xe0 [ 6.539670] dev_open+0x3d/0x70 [ 6.539827] bond_enslave+0x5ed/0xf50 [ 6.540005] ? rcutree_enqueue+0x1f/0xb0 [ 6.540193] ? call_rcu+0xde/0x2a0 [ 6.540375] ? barn_get_empty_sheaf+0x5c/0x80 [ 6.540594] ? __kfree_rcu_sheaf+0xb6/0x1a0 [ 6.540793] ? nla_put_ifalias+0x3d/0x90 [ 6.540981] ? kvfree_call_rcu+0xb5/0x3b0 [ 6.541173] ? kvfree_call_rcu+0xb5/0x3b0 [ 6.541365] do_set_master+0x114/0x160 [ 6.541547] do_setlink+0x412/0xfb0 [ 6.541717] ? security_sock_rcv_skb+0x2a/0x50 [ 6.541931] ? sk_filter_trim_cap+0x7c/0x320 [ 6.542136] ? skb_queue_tail+0x20/0x50 [ 6.542322] ? __nla_validate_parse+0x92/0xe50 ro[o t t o6 .d5e4f2a5u4l0t]- ? security_capable+0x35/0x60 [ 6.542792] rtnl_newlink+0x95c/0xa00 [ 6.542972] ? __rtnl_unlock+0x37/0x70 [ 6.543152] ? netdev_run_todo+0x63/0x530 [ 6.543343] ? allocate_slab+0x280/0x870 [ 6.543531] ? security_capable+0x35/0x60 [ 6.543722] rtnetlink_rcv_msg+0x2e6/0x340 [ 6.543918] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 6.544138] netlink_rcv_skb+0x16a/0x1a0 [ 6.544328] netlink_unicast+0x20a/0x320 [ 6.544516] netlink_sendmsg+0x304/0x3b0 [ 6.544748] __sock_sendmsg+0x89/0xb0 [ 6.544928] ____sys_sendmsg+0x167/0x1c0 [ 6.545116] ? ____sys_recvmsg+0xed/0x150 [ 6.545308] ___sys_sendmsg+0xdd/0x120 [ 6.545489] ? ___sys_recvmsg+0x124/0x1e0 [ 6.545680] ? rcutree_enqueue+0x1f/0xb0 [ 6.545867] ? rcutree_enqueue+0x1f/0xb0 [ 6.546055] ? call_rcu+0xde/0x2a0 [ 6.546222] ? evict+0x286/0x2d0 [ 6.546389] ? rcutree_enqueue+0x1f/0xb0 [ 6.546577] ? kmem_cache_free+0x2c/0x350 [ 6.546784] __x64_sys_sendmsg+0x72/0xc0 [ 6.546972] do_syscall_64+0x6f/0x890 [ 6.547150] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 6.547393] RIP: 0033:0x7fc1a3347bd0 ... [ 6.551375] RIP: 0010:idpf_txq_group_rel+0xc9/0x110 ... [ 6.578856] Rebooting in 10 seconds.. We should skip deallocating txqs[j] if it is NULL in the first place. Tested: with this patch, the kernel panic no longer appears. Fixes: `1c325aac10` ("idpf: configure resources for TX queues") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:56 -08:00
Li Li	d11e5da2d6	idpf: skip deallocating bufq_sets from rx_qgrp if it is NULL In idpf_rxq_group_alloc(), if rx_qgrp->splitq.bufq_sets failed to get allocated: rx_qgrp->splitq.bufq_sets = kcalloc(vport->num_bufqs_per_qgrp, sizeof(struct idpf_bufq_set), GFP_KERNEL); if (!rx_qgrp->splitq.bufq_sets) { err = -ENOMEM; goto err_alloc; } idpf_rxq_group_rel() would attempt to deallocate it in idpf_rxq_sw_queue_rel(), causing a kernel panic: ``` [ 7.967242] early-network-sshd-n-rexd[3148]: knetbase: Info: [ 8.127804] BUG: kernel NULL pointer dereference, address: 00000000000000c0 ... [ 8.129779] RIP: 0010:idpf_rxq_group_rel+0x101/0x170 ... [ 8.133854] Call Trace: [ 8.133980] <TASK> [ 8.134092] idpf_vport_queues_alloc+0x286/0x500 [ 8.134313] idpf_vport_open+0x4d/0x3f0 [ 8.134498] idpf_open+0x71/0xb0 [ 8.134668] __dev_open+0x142/0x260 [ 8.134840] netif_open+0x2f/0xe0 [ 8.135004] dev_open+0x3d/0x70 [ 8.135166] bond_enslave+0x5ed/0xf50 [ 8.135345] ? nla_put_ifalias+0x3d/0x90 [ 8.135533] ? kvfree_call_rcu+0xb5/0x3b0 [ 8.135725] ? kvfree_call_rcu+0xb5/0x3b0 [ 8.135916] do_set_master+0x114/0x160 [ 8.136098] do_setlink+0x412/0xfb0 [ 8.136269] ? security_sock_rcv_skb+0x2a/0x50 [ 8.136509] ? sk_filter_trim_cap+0x7c/0x320 [ 8.136714] ? skb_queue_tail+0x20/0x50 [ 8.136899] ? __nla_validate_parse+0x92/0xe50 [ 8.137112] ? security_capable+0x35/0x60 [ 8.137304] rtnl_newlink+0x95c/0xa00 [ 8.137483] ? __rtnl_unlock+0x37/0x70 [ 8.137664] ? netdev_run_todo+0x63/0x530 [ 8.137855] ? allocate_slab+0x280/0x870 [ 8.138044] ? security_capable+0x35/0x60 [ 8.138235] rtnetlink_rcv_msg+0x2e6/0x340 [ 8.138431] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 8.138650] netlink_rcv_skb+0x16a/0x1a0 [ 8.138840] netlink_unicast+0x20a/0x320 [ 8.139028] netlink_sendmsg+0x304/0x3b0 [ 8.139217] __sock_sendmsg+0x89/0xb0 [ 8.139399] ____sys_sendmsg+0x167/0x1c0 [ 8.139588] ? ____sys_recvmsg+0xed/0x150 [ 8.139780] ___sys_sendmsg+0xdd/0x120 [ 8.139960] ? ___sys_recvmsg+0x124/0x1e0 [ 8.140152] ? rcutree_enqueue+0x1f/0xb0 [ 8.140341] ? rcutree_enqueue+0x1f/0xb0 [ 8.140528] ? call_rcu+0xde/0x2a0 [ 8.140695] ? evict+0x286/0x2d0 [ 8.140856] ? rcutree_enqueue+0x1f/0xb0 [ 8.141043] ? kmem_cache_free+0x2c/0x350 [ 8.141236] __x64_sys_sendmsg+0x72/0xc0 [ 8.141424] do_syscall_64+0x6f/0x890 [ 8.141603] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 8.141841] RIP: 0033:0x7f2799d21bd0 ... [ 8.149905] Kernel panic - not syncing: Fatal exception [ 8.175940] Kernel Offset: 0xf800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 8.176425] Rebooting in 10 seconds.. ``` Tested: With this patch, the kernel panic no longer appears. Fixes: `95af467d9a` ("idpf: configure resources for RX queues") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:56 -08:00
Li Li	712896ac4b	idpf: increment completion queue next_to_clean in sw marker wait routine Currently, in idpf_wait_for_sw_marker_completion(), when an IDPF_TXD_COMPLT_SW_MARKER packet is found, the routine breaks out of the for loop and does not increment the next_to_clean counter. This causes the subsequent NAPI polls to run into the same IDPF_TXD_COMPLT_SW_MARKER packet again and print out the following: [ 23.261341] idpf 0000:05:00.0 eth1: Unknown TX completion type: 5 Instead, we should increment next_to_clean regardless when an IDPF_TXD_COMPLT_SW_MARKER packet is found. Tested: with the patch applied, we do not see the errors above from NAPI polls anymore. Fixes: `9d39447051` ("idpf: remove SW marker handling from NAPI") Signed-off-by: Li Li <boolli@google.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2026-02-25 11:43:56 -08:00
Russell King (Oracle)	2f61f38a21	net: stmmac: fix timestamping configuration after suspend/resume When stmmac_init_timestamping() is called, it clears the receive and transmit path booleans that allow timestamps to be read. These are never re-initialised until after userspace requests timestamping features to be enabled. However, our copy of the timestamp configuration is not cleared, which means we return the old configuration to userspace when requested. This is inconsistent. Fix this by clearing the timestamp configuration. Fixes: `d6228b7cdd` ("net: stmmac: implement the SIOCGHWTSTAMP ioctl") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/E1vuUu4-0000000Afea-0j9B@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-24 17:46:15 -08:00
Andrew Lunn	c8dbdc6e38	net: phy: register phy led_triggers during probe to avoid AB-BA deadlock There is an AB-BA deadlock when both LEDS_TRIGGER_NETDEV and LED_TRIGGER_PHY are enabled: [ 1362.049207] [<8054e4b8>] led_trigger_register+0x5c/0x1fc <-- Trying to get lock "triggers_list_lock" via down_write(&triggers_list_lock); [ 1362.054536] [<80662830>] phy_led_triggers_register+0xd0/0x234 [ 1362.060329] [<8065e200>] phy_attach_direct+0x33c/0x40c [ 1362.065489] [<80651fc4>] phylink_fwnode_phy_connect+0x15c/0x23c [ 1362.071480] [<8066ee18>] mtk_open+0x7c/0xba0 [ 1362.075849] [<806d714c>] __dev_open+0x280/0x2b0 [ 1362.080384] [<806d7668>] __dev_change_flags+0x244/0x24c [ 1362.085598] [<806d7698>] dev_change_flags+0x28/0x78 [ 1362.090528] [<807150e4>] dev_ioctl+0x4c0/0x654 <-- Hold lock "rtnl_mutex" by calling rtnl_lock(); [ 1362.094985] [<80694360>] sock_ioctl+0x2f4/0x4e0 [ 1362.099567] [<802e9c4c>] sys_ioctl+0x32c/0xd8c [ 1362.104022] [<80014504>] syscall_common+0x34/0x58 Here LED_TRIGGER_PHY is registering LED triggers during phy_attach while holding RTNL and then taking triggers_list_lock. [ 1362.191101] [<806c2640>] register_netdevice_notifier+0x60/0x168 <-- Trying to get lock "rtnl_mutex" via rtnl_lock(); [ 1362.197073] [<805504ac>] netdev_trig_activate+0x194/0x1e4 [ 1362.202490] [<8054e28c>] led_trigger_set+0x1d4/0x360 <-- Hold lock "triggers_list_lock" by down_read(&triggers_list_lock); [ 1362.207511] [<8054eb38>] led_trigger_write+0xd8/0x14c [ 1362.212566] [<80381d98>] sysfs_kf_bin_write+0x80/0xbc [ 1362.217688] [<8037fcd8>] kernfs_fop_write_iter+0x17c/0x28c [ 1362.223174] [<802cbd70>] vfs_write+0x21c/0x3c4 [ 1362.227712] [<802cc0c4>] ksys_write+0x78/0x12c [ 1362.232164] [<80014504>] syscall_common+0x34/0x58 Here LEDS_TRIGGER_NETDEV is being enabled on an LED. It first takes triggers_list_lock and then RTNL. A classical AB-BA deadlock. phy_led_triggers_registers() does not require the RTNL, it does not make any calls into the network stack which require protection. There is also no requirement the PHY has been attached to a MAC, the triggers only make use of phydev state. This allows the call to phy_led_triggers_registers() to be placed elsewhere. PHY probe() and release() don't hold RTNL, so solving the AB-BA deadlock. Reported-by: Shiji Yang <yangshiji66@outlook.com> Closes: https://lore.kernel.org/all/OS7PR01MB13602B128BA1AD3FA38B6D1FFBC69A@OS7PR01MB13602.jpnprd01.prod.outlook.com/ Fixes: `06f502f57d` ("leds: trigger: Introduce a NETDEV trigger") Cc: stable@vger.kernel.org Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Shiji Yang <yangshiji66@outlook.com> Link: https://patch.msgid.link/20260222152601.1978655-1-andrew@lunn.ch Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-24 12:39:54 +01:00
Ziyi Guo	3d7e6ce34f	net: usb: pegasus: enable basic endpoint checking pegasus_probe() fills URBs with hardcoded endpoint pipes without verifying the endpoint descriptors: - usb_rcvbulkpipe(dev, 1) for RX data - usb_sndbulkpipe(dev, 2) for TX data - usb_rcvintpipe(dev, 3) for status interrupts A malformed USB device can present these endpoints with transfer types that differ from what the driver assumes. Add a pegasus_usb_ep enum for endpoint numbers, replacing magic constants throughout. Add usb_check_bulk_endpoints() and usb_check_int_endpoints() calls before any resource allocation to verify endpoint types before use, rejecting devices with mismatched descriptors at probe time, and avoid triggering assertion. Similar fix to - commit `90b7f29617` ("net: usb: rtl8150: enable basic endpoint checking") - commit `9e7021d2ae` ("net: usb: catc: enable basic endpoint checking") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260222050633.410165-1-n7l8m4@u.northwestern.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-24 11:51:51 +01:00
Jakub Kicinski	82aec772fc	netconsole: avoid OOB reads, msg is not nul-terminated msg passed to netconsole from the console subsystem is not guaranteed to be nul-terminated. Before recent commit `7eab73b186` ("netconsole: convert to NBCON console infrastructure") the message would be placed in printk_shared_pbufs, a static global buffer, so KASAN had harder time catching OOB accesses. Now we see: printk: console [netcon_ext0] enabled BUG: KASAN: slab-out-of-bounds in string+0x1f7/0x240 Read of size 1 at addr ffff88813b6d4c00 by task pr/netcon_ext0/594 CPU: 65 UID: 0 PID: 594 Comm: pr/netcon_ext0 Not tainted 6.19.0-11754-g4246fd6547c9 Call Trace: kasan_report+0xe4/0x120 string+0x1f7/0x240 vsnprintf+0x655/0xba0 scnprintf+0xba/0x120 netconsole_write+0x3fe/0xa10 nbcon_emit_next_record+0x46e/0x860 nbcon_kthread_func+0x623/0x750 Allocated by task 1: nbcon_alloc+0x1ea/0x450 register_console+0x26b/0xe10 init_netconsole+0xbb0/0xda0 The buggy address belongs to the object at ffff88813b6d4000 which belongs to the cache kmalloc-4k of size 4096 The buggy address is located 0 bytes to the right of allocated 3072-byte region [ffff88813b6d4000, ffff88813b6d4c00) Fixes: `c62c0a17f9` ("netconsole: Append kernel version to message") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260219195021.2099699-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-24 10:46:29 +01:00
Duoming Zhou	bae8a5d2e7	net: wan: farsync: Fix use-after-free bugs caused by unfinished tasklets When the FarSync T-series card is being detached, the fst_card_info is deallocated in fst_remove_one(). However, the fst_tx_task or fst_int_task may still be running or pending, leading to use-after-free bugs when the already freed fst_card_info is accessed in fst_process_tx_work_q() or fst_process_int_work_q(). A typical race condition is depicted below: CPU 0 (cleanup) \| CPU 1 (tasklet) \| fst_start_xmit() fst_remove_one() \| tasklet_schedule() unregister_hdlc_device()\| \| fst_process_tx_work_q() //handler kfree(card) //free \| do_bottom_half_tx() \| card-> //use The following KASAN trace was captured: ================================================================== BUG: KASAN: slab-use-after-free in do_bottom_half_tx+0xb88/0xd00 Read of size 4 at addr ffff88800aad101c by task ksoftirqd/3/32 ... Call Trace: <IRQ> dump_stack_lvl+0x55/0x70 print_report+0xcb/0x5d0 ? do_bottom_half_tx+0xb88/0xd00 kasan_report+0xb8/0xf0 ? do_bottom_half_tx+0xb88/0xd00 do_bottom_half_tx+0xb88/0xd00 ? _raw_spin_lock_irqsave+0x85/0xe0 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? __pfx___hrtimer_run_queues+0x10/0x10 fst_process_tx_work_q+0x67/0x90 tasklet_action_common+0x1fa/0x720 ? hrtimer_interrupt+0x31f/0x780 handle_softirqs+0x176/0x530 __irq_exit_rcu+0xab/0xe0 sysvec_apic_timer_interrupt+0x70/0x80 ... Allocated by task 41 on cpu 3 at 72.330843s: kasan_save_stack+0x24/0x50 kasan_save_track+0x17/0x60 __kasan_kmalloc+0x7f/0x90 fst_add_one+0x1a5/0x1cd0 local_pci_probe+0xdd/0x190 pci_device_probe+0x341/0x480 really_probe+0x1c6/0x6a0 __driver_probe_device+0x248/0x310 driver_probe_device+0x48/0x210 __device_attach_driver+0x160/0x320 bus_for_each_drv+0x101/0x190 __device_attach+0x198/0x3a0 device_initial_probe+0x78/0xa0 pci_bus_add_device+0x81/0xc0 pci_bus_add_devices+0x7e/0x190 enable_slot+0x9b9/0x1130 acpiphp_check_bridge.part.0+0x2e1/0x460 acpiphp_hotplug_notify+0x36c/0x3c0 acpi_device_hotplug+0x203/0xb10 acpi_hotplug_work_fn+0x59/0x80 ... Freed by task 41 on cpu 1 at 75.138639s: kasan_save_stack+0x24/0x50 kasan_save_track+0x17/0x60 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x43/0x70 kfree+0x135/0x410 fst_remove_one+0x2ca/0x540 pci_device_remove+0xa6/0x1d0 device_release_driver_internal+0x364/0x530 pci_stop_bus_device+0x105/0x150 pci_stop_and_remove_bus_device+0xd/0x20 disable_slot+0x116/0x260 acpiphp_disable_and_eject_slot+0x4b/0x190 acpiphp_hotplug_notify+0x230/0x3c0 acpi_device_hotplug+0x203/0xb10 acpi_hotplug_work_fn+0x59/0x80 ... The buggy address belongs to the object at ffff88800aad1000 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 28 bytes inside of freed 1024-byte region The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xaad0 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x100000000000040(head\|node=0\|zone=1) page_type: f5(slab) raw: 0100000000000040 ffff888007042dc0 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000 head: 0100000000000040 ffff888007042dc0 dead000000000122 0000000000000000 head: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000 head: 0100000000000003 ffffea00002ab401 00000000ffffffff 00000000ffffffff head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88800aad0f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800aad0f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88800aad1000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88800aad1080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88800aad1100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fix this by ensuring that both fst_tx_task and fst_int_task are properly canceled before the fst_card_info is released. Add tasklet_kill() in fst_remove_one() to synchronize with any pending or running tasklets. Since unregister_hdlc_device() stops data transmission and reception, and fst_disable_intr() prevents further interrupts, it is appropriate to place tasklet_kill() after these calls. The bugs were identified through static analysis. To reproduce the issue and validate the fix, a FarSync T-series card was simulated in QEMU and delays(e.g., mdelay()) were introduced within the tasklet handler to increase the likelihood of triggering the race condition. Fixes: `2f623aaf9f` ("net: farsync: Fix kmemleak when rmmods farsync") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20260219124637.72578-1-duoming@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-24 10:31:52 +01:00
Ankit Garg	fb868db5f4	gve: fix incorrect buffer cleanup in gve_tx_clean_pending_packets for QPL In DQ-QPL mode, gve_tx_clean_pending_packets() incorrectly uses the RDA buffer cleanup path. It iterates num_bufs times and attempts to unmap entries in the dma array. This leads to two issues: 1. The dma array shares storage with tx_qpl_buf_ids (union). Interpreting buffer IDs as DMA addresses results in attempting to unmap incorrect memory locations. 2. num_bufs in QPL mode (counting 2K chunks) can significantly exceed the size of the dma array, causing out-of-bounds access warnings (trace below is how we noticed this issue). UBSAN: array-index-out-of-bounds in drivers/net/ethernet/drivers/net/ethernet/google/gve/gve_tx_dqo.c:178:5 index 18 is out of range for type 'dma_addr_t[18]' (aka 'unsigned long long[18]') Workqueue: gve gve_service_task [gve] Call Trace: <TASK> dump_stack_lvl+0x33/0xa0 __ubsan_handle_out_of_bounds+0xdc/0x110 gve_tx_stop_ring_dqo+0x182/0x200 [gve] gve_close+0x1be/0x450 [gve] gve_reset+0x99/0x120 [gve] gve_service_task+0x61/0x100 [gve] process_scheduled_works+0x1e9/0x380 Fix this by properly checking for QPL mode and delegating to gve_free_tx_qpl_bufs() to reclaim the buffers. Cc: stable@vger.kernel.org Fixes: `a6fb8d5a8b` ("gve: Tx path for DQO-QPL") Signed-off-by: Ankit Garg <nktgrg@google.com> Reviewed-by: Jordan Rhee <jordanrhee@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260220215324.1631350-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-23 17:22:48 -08:00
Daniel Hodges	03cc8f90d0	wifi: libertas: fix use-after-free in lbs_free_adapter() The lbs_free_adapter() function uses timer_delete() (non-synchronous) for both command_timer and tx_lockup_timer before the structure is freed. This is incorrect because timer_delete() does not wait for any running timer callback to complete. If a timer callback is executing when lbs_free_adapter() is called, the callback will access freed memory since lbs_cfg_free() frees the containing structure immediately after lbs_free_adapter() returns. Both timer callbacks (lbs_cmd_timeout_handler and lbs_tx_lockup_handler) access priv->driver_lock, priv->cur_cmd, priv->dev, and other fields, which would all be use-after-free violations. Use timer_delete_sync() instead to ensure any running timer callback has completed before returning. This bug was introduced in commit `8f641d93c3` ("libertas: detect TX lockups and reset hardware") where del_timer() was used instead of del_timer_sync() in the cleanup path. The command_timer has had the same issue since the driver was first written. Fixes: `8f641d93c3` ("libertas: detect TX lockups and reset hardware") Fixes: `954ee164f4` ("[PATCH] libertas: reorganize and simplify init sequence") Cc: stable@vger.kernel.org Signed-off-by: Daniel Hodges <git@danielhodges.dev> Link: https://patch.msgid.link/20260206195356.15647-1-git@danielhodges.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-02-23 09:28:14 +01:00
Chen-Yu Tsai	25723454f6	wifi: mwifiex: Fix dev_alloc_name() return value check dev_alloc_name() returns the allocated ID on success, which could be over 0. Fix the return value check to check for negative error codes. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/aYmsQfujoAe5qO02@stanley.mountain/ Fixes: `7bab5bdb81` ("wifi: mwifiex: Allocate dev name earlier for interface workqueue name") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20260210100337.1131279-1-wenst@chromium.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-02-23 09:25:48 +01:00
Marek Szyprowski	243307a0d1	wifi: brcmfmac: Fix potential kernel oops when probe fails When probe of the sdio brcmfmac device fails for some reasons (i.e. missing firmware), the sdiodev->bus is set to error instead of NULL, thus the cleanup later in brcmf_sdio_remove() tries to free resources via invalid bus pointer. This happens because sdiodev->bus is set 2 times: first in brcmf_sdio_probe() and second time in brcmf_sdiod_probe(). Fix this by chaning the brcmf_sdio_probe() function to return the error code and set sdio->bus only there. Fixes: `0ff0843310` ("wifi: brcmfmac: Add optional lpo clock enable support") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Arend van Spriel<arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20260203102133.1478331-1-m.szyprowski@samsung.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-02-23 09:24:30 +01:00
Kees Cook	189f164e57	Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-22 08:26:33 -08:00
Linus Torvalds	32a92f8c89	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 20:03:00 -08:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/$alloc_objs(.*$, GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	7a70c15bd1	kmalloc_obj: Clean up after treewide replacements Coccinelle doesn't handle re-indenting line escapes. Fix the 2 places where these got misaligned. Remove 2 now-redundant type casts, found with: $ git grep -P 'struct (\S+).\)\sk\S+alloc_(objs?\|flex)\(struct \1' Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:52 -08:00

1 2 3 4 5 ...

137847 Commits