linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-07-22 01:08:13 -04:00

Author	SHA1	Message	Date
Ruoyu Wang	1cb8553c02	bnxt_en: Handle partially initialized auxiliary devices bnxt_aux_devices_init() calls auxiliary_device_init() before all fields used by bnxt_aux_dev_release() are initialized. After auxiliary_device_init() succeeds, later errors must unwind with auxiliary_device_uninit(), which invokes the release callback. The release callback assumes that aux_priv->id, aux_priv->edev, edev->net and edev->ulp_tbl are all populated. If allocation fails after auxiliary_device_init(), the release path can otherwise dereference or clear partially initialized state. Allocate and attach the bnxt_en_dev and ULP table before calling auxiliary_device_init(), so the release callback only sees a fully initialized auxiliary private object. If auxiliary_device_init() itself fails, free those allocations directly because device_initialize() has not run and the release callback will not be invoked. This issue was found by a static analysis checker and confirmed by manual source review. Fixes: `194fad5b27` ("bnxt_en: Refactor bnxt_rdma_aux_device_init/uninit functions") Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20260711163716.3996929-1-ruoyuw560@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-17 12:43:57 +02:00
Zhiping Zhang	df6134b527	net/mlx5: free mlx5_st_idx_data on final dealloc Workloads that repeatedly allocate and release mkeys carrying TPH steering-tag hints (e.g. churning RDMA MRs) leak one struct mlx5_st_idx_data per cycle; kmemleak flags it as unreferenced and the kmalloc slab grows over time. When the last reference to an ST table entry is dropped, mlx5_st_dealloc_index() removed the entry from idx_xa but the backing mlx5_st_idx_data allocation was never freed. Free idx_data after the xa_erase() so the lifetime of the bookkeeping struct matches the lifetime of the ST entry it tracks. Cc: stable@vger.kernel.org Fixes: `888a7776f4` ("net/mlx5: Add support for device steering tag") Reviewed-by: Michael Gur <michaelgur@nvidia.com> Signed-off-by: Zhiping Zhang <zhipingz@meta.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260702222507.1234467-1-zhipingz@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-17 11:50:02 +02:00
Stéphane Grosjean	d83762005c	can: peak: Modification of references to email accounts being deleted Following the sale of PEAK-System France by HMS-Networks, this update is intended to change all my @hms-networks.com email addresses to my new @peak-system.fr address. Signed-off-by: Stéphane Grosjean <s.grosjean@peak-system.fr> Link: https://patch.msgid.link/20260410124251.40506-1-stephane.grosjean@free.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-07-16 09:12:24 +02:00
Fan Wu	c43122fef3	can: esd_usb: kill anchored URBs before freeing netdevs esd_usb_disconnect() frees each CAN netdev with free_candev() inside its per-netdev loop and only calls unlink_all_urbs(dev) afterwards. The per-netdev private data (struct esd_usb_net_priv) is embedded in the net_device allocation returned by alloc_candev(), so once free_candev() has run, dev->nets[i] points to freed memory. unlink_all_urbs() then dereferences the freed dev->nets[i] to kill the per-netdev TX anchor (usb_kill_anchored_urbs(&priv->tx_submitted)), clear active_tx_jobs, and reset priv->tx_contexts[]. Reorder the teardown so the anchored URBs are killed before the netdevs are freed, matching other CAN/USB drivers in the same directory such as ems_usb, usb_8dev and mcba_usb, which unregister, then unlink, then free: unregister the netdevs first (which stops their TX queues), call unlink_all_urbs(dev) once, then free the netdevs. This issue was found by an in-house static analysis tool. Fixes: `96d8e90382` ("can: Add driver for esd CAN-USB/2 device") Cc: stable@vger.kernel.org Assisted-by: Codex:gpt-5.5 Signed-off-by: Fan Wu <fanwu01@zju.edu.cn> Link: https://patch.msgid.link/20260709164159.497640-1-fanwu01@zju.edu.cn Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-07-14 10:59:53 +02:00
Alexander Hölzl	79adf48fb0	can: vxcan: Kconfig: fix description stating no local echo provided The Kconfig description of the vxcan kernel module erroneously states that the vxcan interface does not provide a local echo of sent CAN frames. However this behavior changed in commit `259bdba27e` ("vxcan: enable local echo for sent CAN frames") and vxcan interfaces now provide a local echo. Change the description of the vxcan module in the Kconfig to reflect this change. Signed-off-by: Alexander Hölzl <alexander.hoelzl@gmx.net> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20260619090035.17769-1-alexander.hoelzl@gmx.net [mkl: rephrase patch description] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-07-14 10:52:44 +02:00
James Raphael Tiovalen	7410d11460	macsec: fix promiscuity refcount leak in macsec_dev_open() When a MACsec interface with IFF_PROMISC set is brought up on top of a device that has hardware offload enabled, macsec_dev_open() first calls dev_set_promiscuity(real_dev, 1) and then propagates the open to the offload device. If that propagation fails, the error path jumps to the clear_allmulti label, which only reverts allmulti and the unicast address. The promiscuity taken on the lower device is never dropped, so real_dev is left permanently stuck in promiscuous mode. Its promiscuity count can no longer be balanced from software. Add a clear_promisc label that drops the promiscuity reference and route the two offload failure paths to it. The dev_set_promiscuity() failure itself still jumps to clear_allmulti, since on that failure the count was not incremented. Fixes: `3cf3227a21` ("net: macsec: hardware offloading infrastructure") Cc: stable@vger.kernel.org Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260705113629.187490-1-jamestiotio@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-11 12:52:15 +02:00
Paolo Abeni	a0d82fb850	Merge tag 'wireless-2026-07-09' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Too many robustness fixes to list. Mostly for - slight out-of-bounds reads of SKBs, - leaks on error conditions, and - malformed netlink input rejection. * tag 'wireless-2026-07-09' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (46 commits) wifi: cfg80211: bound element ID read when checking non-inheritance wifi: brcmfmac: cyw: fix heap overflow on a short auth frame wifi: brcmfmac: initialize SDIO data work before cleanup wifi: cfg80211: validate assoc response length before status and IE access wifi: cfg80211: validate rx/tx MLME callback frame lengths before access wifi: mac80211: ibss: wait for in-flight TX on disconnect wifi: mac80211: recalculate rx_nss on IBSS peer capability update wifi: cfg80211: use wiphy work for socket owner autodisconnect wifi: mac80211: fix memory leak in ieee80211_register_hw() wifi: mac80211: free AP_VLAN bc_buf SKBs outside IRQ lock wifi: mac80211: validate deauth frame length before reason access wifi: mac80211: avoid non-S1G AID fallback for S1G assoc wifi: cfg80211: reject empty PMSR peer lists wifi: cfg80211: reject unsupported PMSR FTM location requests wifi: cfg80211: validate PMSR FTM preamble range wifi: cfg80211: validate PMSR measurement type data wifi: nl80211: constrain MBSSID TX link ID range wifi: nl80211: validate nested MBSSID IE blobs wifi: ieee80211: validate MLE common info length wifi: cfg80211: derive S1G beacon TSF from S1G fields ... ==================== Link: https://patch.msgid.link/20260709115038.243870-3-johannes@sipsolutions.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-10 16:27:45 +02:00
Norbert Szetei	ec4215683e	ppp: defer channel free to an RCU grace period to fix pppol2tp RX UAF pppol2tp_recv() runs in the L2TP UDP-encap softirq RX path: l2tp_udp_encap_recv() -> l2tp_recv_common() -> pppol2tp_recv() -> ppp_input(&po->chan) It runs under rcu_read_lock() holding only an l2tp_session reference and takes NO reference on the internal PPP channel (struct channel, chan->ppp) that ppp_input() dereferences. The pppox socket is SOCK_RCU_FREE, so 'po' and the embedded ppp_channel are RCU-safe. But the internal struct channel is a separate allocation that ppp_release_channel() frees with a plain kfree(): close(data socket) -> pppol2tp_release() -> pppox_unbind_sock() -> ppp_unregister_channel() -> ppp_release_channel() -> kfree(pch) For a channel that is bound (PPPIOCGCHAN) but not attached to a ppp unit (no PPPIOCCONNECT, pch->ppp == NULL) and not bridged, teardown skips both ppp_disconnect_channel()'s synchronize_net() and ppp_unbridge_channels()'s synchronize_rcu(), so the kfree() has no grace period. rcu_read_lock() in pppol2tp_recv() does not protect against a plain kfree(), so an in-flight ppp_input() on one CPU can dereference the channel just freed by close() on another CPU. The bug is reachable by an unprivileged user. Defer the channel free to an RCU callback via call_rcu() so the grace period fences any in-flight ppp_input(). The disconnect and unbridge teardown paths already fence with synchronize_net()/synchronize_rcu(); call_rcu() does the same here without stalling the close() path. Fixes: `ee40fb2e1e` ("l2tp: protect sock pointer of struct pppol2tp_session with RCU") Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Norbert Szetei <norbert@doyensec.com> Reviewed-by: Qingfang Deng <qingfang.deng@linux.dev> Link: https://patch.msgid.link/E793FCF2-58DE-4387-A983-C7B4BC3158BD@doyensec.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-10 13:31:47 +02:00
Linus Torvalds	2c7c88a412	Merge tag 'net-7.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, Bluetooth and batman-adv. Current release - regressions: - bluetooth: fix using chan->conn as indication to no remote netdev Current release - new code bugs: - netfilter: cap to maximum number of expectation per master on updates Previous releases - regressions: - bluetooth: - fix UAF of hci_conn_params in add_device_complete - fix null ptr deref in hci_abort_conn() - igmp: remove multicast group from hash table on device destruction - batman-adv: prevent TVLV OOB check overflow - eth: mlx5/mlx5e: - fix off-by-one in single-FDB error rollback - skip peer flow cleanup when LAG seq is unavailable - fix crashes in dynamic per-channel stats and HV VHCA agent - eth: mana: Sync page pool RX frags for CPU Previous releases - always broken: - netfilter: - mark malformed IPv6 extension headers for hotdrop - terminate table name before find_table_lock() - ipvs: use parsed transport offset in TCP state lookup - sched: act_pedit: fix TOCTOU heap OOB write in tc offload - ethtool: rss: fix hfunc and input_xfrm parsing on big endian - ipv4/ipv6: fix UAF and memory leak in IGMP/MLD - tls: consume empty data records in tls_sw_read_sock() - eth: - octeontx2-af: fix VF bringup affecting PF promiscuous state - gue: validate REMCSUM private option length" * tag 'net-7.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits) macsec: don't read an unset MAC header in macsec_encrypt() dibs: loopback: validate offset and size in move_data() octeontx2-af: fix VF bringup affecting PF promiscuous state ethtool: rss: Fix hfunc and input_xfrm parsing on big endian net/mlx5: Fix L3 tunnel entropy refcount leak net: macb: drop in-flight Tx SKBs on close net: mana: Sync page pool RX frags for CPU net: mana: Validate the packet length reported by the NIC selftests/net: fix EVP_MD_CTX leak in tcp_mmap ipvs: ensure inner headers in ICMP errors are in headroom ipvs: use parsed transport offset in SCTP state lookup ipvs: use parsed transport offset in TCP state lookup ipvs: pass parsed transport offset to state handlers netfilter: handle unreadable frags netfilter: flowtable: support IPIP tunnel with direct xmit netfilter: flowtable: IPIP tunnel hardware offload is not yet support netfilter: flowtable: use dst in this direction when pushing IPIP header netfilter: ipset: allocate the proper memory for the generic hash structure netfilter: ipset: cleanup the add/del backlog when resize failed netfilter: ipset: exclude gc when resize is in progress ...	2026-07-09 08:26:51 -07:00
Daehyeon Ko	f5089008f9	macsec: don't read an unset MAC header in macsec_encrypt() macsec_encrypt() reads the Ethernet header via eth_hdr(skb) (skb->head + skb->mac_header) to memmove() the 12 source/destination MAC bytes forward and make room for the SecTAG. On the AF_PACKET SOCK_RAW + PACKET_QDISC_BYPASS transmit path the skb reaches the macsec ndo_start_xmit() with the MAC header unset, so eth_hdr(skb) resolves to skb->head + (u16)~0 and the read is out of bounds: a 12-byte heap over-read that is also emitted on the wire as the frame's outer source/destination MAC. KASAN reports a slab-out-of-bounds read in macsec_start_xmit() on 6.0; on current mainline a CONFIG_DEBUG_NET build flags it as an unset mac header in skb_mac_header(). On the TX path the L2 header is at skb->data, so use skb_eth_hdr(), added by commit `96cc4b6958` ("macvlan: do not assume mac_header is set in macvlan_broadcast()") for exactly this purpose. Fixes: `c09440f7dc` ("macsec: introduce IEEE 802.1AE driver") Cc: stable@vger.kernel.org Signed-off-by: Daehyeon Ko <4ncienth@gmail.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260703083634.2035145-1-4ncienth@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-09 13:05:04 +02:00
Harman Kalra	fabb881df3	octeontx2-af: fix VF bringup affecting PF promiscuous state Mbox handling of nix_set_rx_mode for a VF with promiscuous and all_multi flags set to false causes deletion of the PF's promiscuous and allmulti MCAM rules. This occurs because the APIs that enable/disable these rules operate only on the PF, even when the mbox request is made via a VF interface. Guard both rvu_npc_enable_allmulti_entry() and rvu_npc_enable_promisc_entry() disable paths with an is_vf() check so that a VF bringing up or tearing down its interface cannot inadvertently clear the PF's MCAM rules. Fixes: `967db3529e` ("octeontx2-af: add support for multicast/promisc packet replication feature") Signed-off-by: Harman Kalra <hkalra@marvell.com> Signed-off-by: Nitin Shetty J <nshettyj@marvell.com> Link: https://patch.msgid.link/20260702045616.3002773-2-nshettyj@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-09 12:00:04 +02:00
Li RongQing	c914307e1d	net/mlx5: Fix L3 tunnel entropy refcount leak mlx5_tun_entropy_refcount_inc() counts both VXLAN and L2-to-L3 tunnel reformat entries as entropy-enabling users. The matching decrement path only handled VXLAN, leaving L2-to-L3 tunnel entries counted after release. Handle MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL in mlx5_tun_entropy_refcount_dec() as well so the enabling entry refcount remains balanced. Fixes: `f828ca6a2f` ("net/mlx5e: Add support for hw encapsulation of MPLS over UDP") Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260703141423.1723-1-lirongqing@baidu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-09 11:21:12 +02:00
Théo Lebrun	27f575836c	net: macb: drop in-flight Tx SKBs on close The MACB driver has since forever leaked the outgoing SKBs that have not yet been marked as completed. They live in queue->tx_skb which gets freed without remorse nor checking. macb_free_consistent() gets called in a few codepaths, but only close will trigger the added expressions. In macb_open() and macb_alloc_consistent() failure cases, queues' tx_skb just got allocated and are empty. Fixes: `89e5785fc8` ("[PATCH] Atmel MACB ethernet driver") Cc: stable@vger.kernel.org Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com> Link: https://patch.msgid.link/20260702-macb-drop-tx-v4-1-1c833eebdbc8@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-09 10:48:15 +02:00
Dexuan Cui	c72a0f09c5	net: mana: Sync page pool RX frags for CPU MANA allocates RX buffers from page pool fragments when frag_count is greater than 1. In that case the buffers remain DMA mapped by page pool and the RX completion path does not call dma_unmap_single(). As a result, the implicit sync-for-CPU normally performed by dma_unmap_single() is missing before the packet data is passed to the networking stack. This breaks RX on configurations which require explicit DMA syncing, for example when booted with swiotlb=force. Fix this by recording the page pool page and DMA sync offset when the RX buffer is allocated, and syncing the received packet range for CPU access before handing the RX buffer to the stack. Fixes: `730ff06d3f` ("net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.") Cc: stable@vger.kernel.org Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Link: https://patch.msgid.link/20260702041237.617719-3-decui@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-09 10:36:14 +02:00
Dexuan Cui	2e2a83b499	net: mana: Validate the packet length reported by the NIC Validate the packet length reported in the RX CQE before passing it to skb processing. The CQE is supplied by the NIC device and should not be blindly trusted. Cc: stable@vger.kernel.org Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Fixes: `ca9c54d2d6` ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Link: https://patch.msgid.link/20260702041237.617719-2-decui@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-09 10:35:54 +02:00
Suman Ghosh	235acadd31	octeontx2-pf: check DMAC extraction support before filtering Currently, configuring a VF MAC address via the PF (e.g., 'ip link set <pf> vf 0 mac <mac>') blindly attempts to install a DMAC-based hardware filter. However, the hardware parser profile might not support DMAC extraction. Check if the hardware parsing profile supports DMAC extraction before adding the filter. Additionally, emit a warning message to inform the operator if the MAC filter installation fails due to missing DMAC extraction support. Update config->mac only after hardware programming succeeds in otx2_set_vf_mac(). Fixes: `f0c2982aaf` ("octeontx2-pf: Add support for SR-IOV management functions") Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Nitin Shetty J <nshettyj@marvell.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20260702033451.2969880-1-nshettyj@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-08 12:01:13 +02:00
Rosen Penev	1a3267a8c9	net: mdio: select REGMAP_MMIO instead of depending on it REGMAP_MMIO is a hidden (non-user-visible) tristate symbol. Using depends on it is incorrect because there is no way for the user to enable it directly. Change to select, which is the convention used by every other driver in the tree that needs REGMAP_MMIO. Fixes: `8057cbb833` ("net: mdio: mscc-miim: Add depend of REGMAP_MMIO on MDIO_MSCC_MIIM") Assisted-by: opencode:big-pickle Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260702032653.1580616-1-rosenp@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-08 10:04:01 +02:00
Enrico Pozzobon	60444706aa	net: usb: lan78xx: disable VLAN filter in promiscuous mode The hardware VLAN filter (RFE_CTL_VLAN_FILTER_) drops VLAN-tagged frames whose VID has not been registered via lan78xx_vlan_rx_add_vid(). It is left enabled in promiscuous mode, so packet capture (e.g. tcpdump or Wireshark) does not see tagged frames for unregistered VIDs. Clear the filter while the interface is promiscuous and restore it from NETIF_F_HW_VLAN_CTAG_FILTER otherwise. Enforce the same condition in lan78xx_set_features() so netdev_update_features() cannot re-enable the filter while promiscuous. Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Enrico Pozzobon <enrico.pozzobon@dissecto.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260701-lan78xx-vlan-promisc-v3-1-232266d32743@dissecto.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-07 13:07:51 +02:00
Yuho Choi	5c0e3ba4f5	net/liquidio: drop cached VF pci_dev LUT The PF SR-IOV enable path caches VF pci_dev pointers in dpiring_to_vfpcidev_lut[] by iterating with pci_get_device(). Those entries do not own a reference, because the iterator drops the previous device reference on each step. The cached pointer is then dereferenced later when handling OCTEON_VF_FLR_REQUEST. Replace the cached VF mapping with runtime lookup on the mailbox DPI ring: derive the VF index from q_no, resolve the VF via exported PCI IOV helpers, validate it with the PF pointer and VF ID, then issue pcie_flr() and drop the reference with pci_dev_put(). Remove the unused VF lookup table initialization and cleanup. Fixes: `ca6139ffc6` ("liquidio CN23XX: sysfs VF config support") Fixes: `8c978d0592` ("liquidio CN23XX: Mailbox support") Signed-off-by: Yuho Choi <dbgh9129@gmail.com> Link: https://patch.msgid.link/20260701040847.1897845-1-dbgh9129@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-07 11:03:35 +02:00
Dong Yibo	d9d6d67f4c	net: rnpgbe: fix mailbox endianness and remove pointer casts The rnpgbe mailbox exchanges data through 32-bit MMIO registers in little-endian wire format. The original code had two problems: 1. FW structs (with __le16/__le32 fields) were cast to (u32 *) before reaching the mailbox transport, hiding the endian annotations from sparse. 2. No cpu_to_le32()/le32_to_cpu() conversion was done between CPU-endian MMIO values and the little-endian payload, causing data corruption on big-endian systems. Fix by adding the missing byte-order conversions in the transport layer and introducing union wrappers (mbx_fw_cmd_req_u, mbx_fw_cmd_reply_u) that overlay each FW struct with a __le32 dwords[] array. Callers fill named fields using cpu_to_le16/32(), then pass dwords[] to the transport, which now takes explicit __le32 * instead of u32 *. This eliminates all pointer casts on the mailbox data path and lets sparse verify the conversions. Fixes: `4543534c3e` ("net: rnpgbe: Add basic mbx ops support") Signed-off-by: Dong Yibo <dong100@mucse.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260701032208.1843156-2-dong100@mucse.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-07 10:45:42 +02:00
Maoyi Xie	240c8d2c71	wifi: brcmfmac: cyw: fix heap overflow on a short auth frame brcmf_notify_auth_frame_rx() takes the frame length from the firmware event and copies the frame body with the management header offset subtracted: u32 mgmt_frame_len = e->datalen - sizeof(struct brcmf_rx_mgmt_data); ... memcpy(&mgmt_frame->u, frame, mgmt_frame_len - offsetof(struct ieee80211_mgmt, u)); The only length check is e->datalen >= sizeof(*rxframe), so mgmt_frame_len can be anything from 0 up. offsetof(struct ieee80211_mgmt, u) is 24. When mgmt_frame_len is below that, the subtraction wraps as an unsigned value to a huge length. The memcpy then runs far past the kzalloc'd buffer. A malicious or malfunctioning AP can make the frame short during the external SAE auth exchange, so this is a remotely triggered heap overflow. Reject frames shorter than the management header offset before the copy. Fixes: `66f909308a` ("wifi: brcmfmac: cyw: support external SAE authentication in station mode") Link: https://lore.kernel.org/r/178214417708.2368577.16740907093694208834@maoyixie.com Cc: stable@vger.kernel.org Co-developed-by: Kaixuan Li <kaixuan.li@ntu.edu.sg> Signed-off-by: Kaixuan Li <kaixuan.li@ntu.edu.sg> Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20260627131313.3878893-1-maoyixie.tju@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-07 10:22:35 +02:00
Runyu Xiao	2a665946e0	wifi: brcmfmac: initialize SDIO data work before cleanup brcmf_sdio_probe() stores the newly allocated bus in sdiodev->bus before allocating the ordered workqueue. If that allocation fails, the function jumps to fail and calls brcmf_sdio_remove(). brcmf_sdio_remove() unconditionally cancels bus->datawork. Initialize the work item before the first failure path that can reach brcmf_sdio_remove(), so the cleanup path always observes a valid work object. This issue was found by our static analysis tool and then confirmed by manual review of the probe error path and the remove-time work drain. The problem pattern is an early setup failure that reaches a cleanup helper which cancels an embedded work item before its initializer has run. A QEMU PoC forced alloc_ordered_workqueue() to fail at the same point in brcmf_sdio_probe(), before INIT_WORK(&bus->datawork) is reached. The resulting fail path calls brcmf_sdio_remove(), and DEBUG_OBJECTS reports the invalid work drain with brcmf_sdio_probe() and brcmf_sdio_remove() in the stack. Fixes: `9982464379` ("brcmfmac: make sdio suspend wait for threads to freeze") Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20260619064401.1048976-1-runyu.xiao@seu.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-07 10:10:40 +02:00
Pengpeng Hou	8ecdeb8b8a	wifi: rsi: validate beacon length before fixed buffer copy rsi_prepare_beacon() copies the mac80211 beacon frame after FRAME_DESC_SZ into a management skb whose usable tailroom may be smaller than MAX_MGMT_PKT_SIZE after alignment. Validate the beacon length against the actual tailroom before the copy and skb_put(). Leave ownership of the management skb with the caller on error, matching the existing rsi_send_beacon() cleanup path. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260705084824.68105-1-pengpeng@iscas.ac.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:09 +02:00
Pengpeng Hou	74ed3669f2	wifi: libipw: fix key index receive bound checks libipw_rx() reads skb->data[hdrlen + 3] to extract the WEP key index in both the software-decrypt key selection path and the hardware-decrypted IV/ICV strip path. In both places the existing guard only checks skb->len >= hdrlen + 3, which proves bytes up to hdrlen + 2 but not the byte at hdrlen + 3. Require hdrlen + 4 bytes before reading that item in both paths. This is a local source-boundary check only; it does not change the key index semantics. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260705083519.23567-1-pengpeng@iscas.ac.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:09 +02:00
Pengpeng Hou	d06a3e60c8	wifi: rsi: bound background scan probe request copy rsi_send_bgscan_probe_req() allocates room for struct rsi_bgscan_probe plus MAX_BGSCAN_PROBE_REQ_LEN bytes, but copies the entire mac80211-generated probe request skb after the fixed header. The probe request length depends on scan IEs and is not checked against the fixed firmware buffer. Reject generated probe requests that do not fit the firmware command buffer before copying them into the skb. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260704011231.45593-1-pengpeng@iscas.ac.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:08 +02:00
Pengpeng Hou	13ff543e0b	wifi: libertas: reject short monitor TX frames In monitor mode, lbs_hard_start_xmit() casts skb->data to a radiotap TX header, skips that header, and then copies the 802.11 destination address from offset 4 in the remaining frame. The generic length check only rejects zero-length and oversized skbs, so a short monitor frame can be read past the end of the skb data. Require enough bytes for the radiotap TX header and the destination address field before using the monitor-mode header layout. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260704011140.37639-1-pengpeng@iscas.ac.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:08 +02:00
Corentin Labbe	0a2581cbae	wifi: ralink: RT2X00: init EEPROM properly I have a hostapd setup with a 01:00.0 Network controller: Ralink corp. RT2790 Wireless 802.11n 1T/2R PCIe. The setup works fine on 6.18.26-gentoo. It breaks on 6.18.33-gentoo (and is still broken on 6.18.37). I found a hint in dmesg: On 6.18.26-gentoo I see: May 31 15:48:45 trash01 kernel: ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 0003 detected On 6.18.33-gentoo I see: May 31 15:22:57 trash01 kernel: ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 0006 detected The RF chipset seems to be misdetected. The problem was that the EEPROM was badly initialized. The origin was probably some PCI change, but unfortunately I could not reboot the board with this card often enough to bisect it. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Acked-by: Stanislaw Gruszka <stf_xl@wp.pl> Link: https://patch.msgid.link/20260703134932.3786771-1-clabbe@baylibre.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:08 +02:00
Pengpeng Hou	843fe9bc58	wifi: rsi: avoid reading TKIP MIC keys for non-TKIP ciphers rsi_hal_load_key() copies tx_mic_key and rx_mic_key from data[16] and data[24] whenever key data is present. Those offsets are only part of the 32-byte TKIP key layout. Shorter keys used by other ciphers, such as CCMP, do not provide those bytes, so the unconditional copies can read past the supplied key buffer. Only copy the MIC keys for TKIP, and reject malformed TKIP keys that are shorter than the expected 32-byte layout. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260701053414.34015-1-pengpeng@iscas.ac.cn [drop useless length check] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:08 +02:00
Xiang Mei	ebd6d37fa9	wifi: p54: validate RX frame length in p54_rx_eeprom_readback() p54_rx_eeprom_readback() copies the requested EEPROM slice out of a device-supplied readback frame without checking that the skb actually holds that many bytes. Commit `da1b9a55ff` ("wifi: p54: prevent buffer-overflow in p54_rx_eeprom_readback()") closed the destination overflow by copying a fixed priv->eeprom_slice_size (and rejecting a mismatched advertised len), but the source side is still unbounded: nothing verifies the frame is long enough to supply that many bytes. A malicious USB device can send a short frame whose advertised len matches priv->eeprom_slice_size while the payload is truncated. The equality check passes and memcpy() reads past the end of the skb, leaking adjacent heap: BUG: KASAN: slab-out-of-bounds in p54_rx (drivers/net/wireless/intersil/p54/txrx.c:507) Read of size 1016 at addr ffff88800f077114 by task swapper/0/0 Call Trace: <IRQ> ... __asan_memcpy (mm/kasan/shadow.c:105) p54_rx (drivers/net/wireless/intersil/p54/txrx.c:507) p54u_rx_cb (drivers/net/wireless/intersil/p54/p54usb.c:163) __usb_hcd_giveback_urb (drivers/usb/core/hcd.c:1657) dummy_timer (drivers/usb/gadget/udc/dummy_hcd.c:2005) ... </IRQ> The buggy address belongs to the object at ffff88800f0770c0 which belongs to the cache skbuff_small_head of size 704 The buggy address is located 84 bytes inside of allocated 704-byte region [ffff88800f0770c0, ffff88800f077380) Check that the slice fits in the skb before copying. Fixes: `7cb770729b` ("p54: move eeprom code into common library") Reported-by: Weiming Shi <bestswngs@gmail.com> Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Xiang Mei <xmei5@asu.edu> Acked-by: Christian Lamparter <chunkeey@gmail.com> Link: https://patch.msgid.link/20260628000510.4152481-1-xmei5@asu.edu Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:07 +02:00
Yousef Alhouseen	23b493d9dc	wifi: mac80211_hwsim: avoid treating MCS as legacy rate index Injected HT and VHT rates store an MCS value in rates[0].idx rather than an index into the legacy bitrate table. hwsim nevertheless passes these rates to ieee80211_get_tx_rate() while generating monitor frames and timestamps. A crafted injected frame can therefore read beyond the bitrate table. If the resulting bitrate is zero, mac80211_hwsim_write_tsf() also divides by zero, as observed by syzbot. Use ieee80211_get_tx_rate() only for legacy rates. The existing fallback continues to supply a conservative bitrate where hwsim does not yet calculate MCS rates. Reported-by: syzbot+21629c14aa749636db9d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=21629c14aa749636db9d Signed-off-by: Yousef Alhouseen <alhouseenyousef@gmail.com> Link: https://patch.msgid.link/20260628002537.23550-1-alhouseenyousef@gmail.com [drop wrong Fixes tag] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:07 +02:00
Dawei Feng	63c2391dee	wifi: libertas: fix memory leak in helper_firmware_cb() helper_firmware_cb() neglects to free the single-stage firmware image after a successful async load, leading to a memory leak in the USB firmware-download path. Fix this memory leak by calling release_firmware() immediately after lbs_fw_loaded() returns. The bug was first flagged by an experimental analysis tool we are developing for kernel memory-management bugs while analyzing v6.13-rc1. The tool is still under development and is not yet publicly available. Manual inspection confirms that the bug is still present in the current wireless tree. An x86_64 allyesconfig build showed no new warnings. As we do not have compatible Libertas USB hardware for exercising this firmware-download path, no runtime testing was able to be performed. Fixes: `1dfba3060f` ("libertas: move firmware lifetime handling to firmware.c") Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn> Link: https://patch.msgid.link/20260624085343.575508-1-dawei.feng@seu.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:07 +02:00
Maoyi Xie	aa6dcd5c8d	wifi: libertas_tf: fix use-after-free in lbtf_free_adapter() lbtf_free_adapter() calls timer_delete(&priv->command_timer), which does not wait for a running command_timer_fn() callback. lbtf_free_adapter() runs on the teardown path right before ieee80211_free_hw() frees priv, both in lbtf_remove_card() and in the probe error path. command_timer is armed by mod_timer() in lbtf_cmd() whenever a firmware command is sent. command_timer_fn() dereferences priv. If a command times out as the device is removed, command_timer_fn() runs concurrently with teardown and dereferences priv after it has been freed. This is the same use-after-free that commit `03cc8f90d0` ("wifi: libertas: fix use-after-free in lbs_free_adapter()") fixed in the sibling libertas driver. The libertas_tf variant has the identical pattern and was left unchanged. Use timer_delete_sync() so any in-flight callback completes before priv is freed. Fixes: `06b16ae531` ("libertas_tf: main.c, data paths and mac80211 handlers") Cc: stable@vger.kernel.org Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com> Link: https://patch.msgid.link/178211481807.2212567.8773346114561900100@maoyixie.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:07 +02:00
Bryam Vargas	10a2b430f8	wifi: mac80211_hwsim: clamp virtio RX length before skb_put hwsim_virtio_rx_work() passes the virtqueue used-ring length reported by the device straight to skb_put() on a fixed-size receive skb. A backend reporting a length larger than the skb tailroom drives skb_put() past the buffer end and hits skb_over_panic() -- a host-triggerable guest panic (denial of service). Clamp the length to the skb's available room before skb_put(). A conforming device never reports more than the posted buffer size, so valid frames are unaffected; a truncated over-report then fails the length/header checks in hwsim_virtio_handle_cmd() and is dropped, so truncating rather than dropping here cannot be turned into a parsing problem. Fixes: `5d44fe7c98` ("mac80211_hwsim: add frame transmission support over virtio") Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me> Link: https://patch.msgid.link/20260620-b4-disp-474bee37-v1-1-1a4d37f3e2d4@proton.me Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:06 +02:00
Abdun Nihaal	0d388f6203	wifi: ipw2100: fix potential memory leak in ipw2100_pci_init_one() The memory allocated in the ipw2100_alloc_device() function is not freed in some of the error paths in ipw2100_pci_init_one(). Fix that by converting the direct return into a goto to the error path return. The error path when pci_enable_device() fails cannot jump to fail, since at this point priv is not set, so perform error handling inline. Fixes: `2c86c27501` ("Add ipw2100 wireless driver.") Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in> Link: https://patch.msgid.link/20260620065242.93798-1-nihaal@cse.iitm.ac.in Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 14:11:06 +02:00
Eric Dumazet	9e05e91a9a	amt: fix size calculation in amt_get_size() amt_get_size() incorrectly used sizeof(struct iphdr) for the sizes of IFLA_AMT_DISCOVERY_IP, IFLA_AMT_REMOTE_IP, and IFLA_AMT_LOCAL_IP. These attributes contain IPv4 addresses (__be32), not full IP headers. Replace sizeof(struct iphdr) with sizeof(__be32) to avoid over-allocating netlink message space. Fixes: `b9022b53ad` ("amt: add control plane of amt interface") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260701122329.3562825-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-06 13:01:24 +02:00
Xiang Mei	f0f1887a9e	net: qualcomm: rmnet: validate MAP frame length before ingress parsing When ingress deaggregation is disabled, rmnet_map_ingress_handler() passes the skb straight to __rmnet_map_ingress_handler(), skipping the length validation that rmnet_map_deaggregate() performs on the aggregated path. The parser then dereferences the MAP header and csum header/trailer based on the on-wire pkt_len without checking skb->len, so a short frame is read out of bounds: BUG: KASAN: slab-out-of-bounds in rmnet_map_checksum_downlink_packet Read of size 1 at addr ffff88801118ed00 by task exploit/147 Call Trace: ... rmnet_map_checksum_downlink_packet (drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c:413) __rmnet_map_ingress_handler (drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c:96) rmnet_rx_handler (drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c:129) __netif_receive_skb_core.constprop.0 (net/core/dev.c:6089) netif_receive_skb (net/core/dev.c:6460) tun_get_user (drivers/net/tun.c:1955) tun_chr_write_iter (drivers/net/tun.c:2001) vfs_write (fs/read_write.c:688) ksys_write (fs/read_write.c:740) do_syscall_64 (arch/x86/entry/syscall_64.c:94) ... Factor that validation out of rmnet_map_deaggregate() into rmnet_map_validate_packet_len() and run it on the no-aggregation path too. The MAP header is bounds-checked first, since this path can receive a frame shorter than the header. Fixes: `ceed73a2cf` ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Reported-by: Weiming Shi <bestswngs@gmail.com> Suggested-by: Subash Abhinov Kasiviswanathan <subash.a.kasiviswanathan@oss.qualcomm.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Reviewed-by: Subash Abhinov Kasiviswanathan <subash.a.kasiviswanathan@oss.qualcomm.com> Link: https://patch.msgid.link/20260630174110.2003121-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-06 12:47:02 +02:00
Shigeru Yoshida	a0a558ca7e	qede: fix off-by-one in BD ring consumption on build_skb failure qede_rx_build_skb() and qede_tpa_rx_build_skb() do not check for a NULL return from qede_build_skb(). When it returns NULL under memory pressure, the functions still consume a BD from the ring before returning NULL. The callers then recycle additional BDs, resulting in one extra BD being consumed (off-by-one). This desynchronizes the BD ring, which can corrupt DMA page reference counts and lead to SLUB freelist corruption. Commit `4e910dbe36` ("qede: confirm skb is allocated before using") added a NULL check inside qede_build_skb() to prevent a NULL pointer dereference, but did not address the missing NULL checks in the callers, making this off-by-one reachable. Fix this by adding NULL checks for the return value of qede_build_skb() in both qede_rx_build_skb() and qede_tpa_rx_build_skb(), returning NULL immediately before any BD ring manipulation. Fixes: `8a8633978b` ("qede: Add build_skb() support.") Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Link: https://patch.msgid.link/20260630164623.3152625-1-syoshida@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-06 12:03:10 +02:00
Runyu Xiao	536fb3d739	wifi: rt2x00: avoid full teardown before work setup in probe rt2x00lib_probe_dev() uses the full rt2x00lib_remove_dev() teardown for all probe failures. However, drv_data allocation and workqueue allocation can fail before intf_work, autowakeup_work and sleep_work have been initialized. Do not enter the full remove path until the probe has reached the point where those work items are set up. Return directly for drv_data allocation failure, and use a small early cleanup path for workqueue allocation failure. This issue was found by our static analysis tool and then confirmed by manual review of rt2x00lib_probe_dev() and rt2x00lib_remove_dev(). The early probe exits should not call a common teardown path that assumes the later work setup has already completed. A QEMU PoC forced alloc_ordered_workqueue() to fail before the work initializers are reached. The resulting fail path entered rt2x00lib_remove_dev(), and DEBUG_OBJECTS reported invalid work drains with rt2x00lib_probe_dev() and rt2x00lib_remove_dev() in the stack. Fixes: `1ebbc48520` ("rt2x00: Introduce concept of driver data in struct rt2x00_dev.") Fixes: `0439f5367c` ("rt2x00: Move TX/RX work into dedicated workqueue") Cc: stable@vger.kernel.org Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Link: https://patch.msgid.link/20260619073104.1809161-1-runyu.xiao@seu.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 10:09:29 +02:00
Rafael Beims	d78a407bad	wifi: mwifiex: fix permanently busy scans after multiple roam iterations In order for the firmware to sleep, the driver has to confirm a previously received sleep request. The normal sequence of events goes like this: EVENT_SLEEP -> adapter->ps_state = PS_STATE_PRE_SLEEP -> sleep-confirm -> SLEEP -> EVENT_AWAKE -> AWAKE. Before sending the sleep-confirm command, the driver must make sure there are no commands either running or waiting to be completed. mwifiex_ret_802_11_associate() unconditionally sets ps_state = PS_STATE_AWAKE when it processes the association command response, outside of the normal powersave management flow. If EVENT_SLEEP arrives while the association command is in flight, ps_state is PRE_SLEEP when the association command response is parsed, and the forced AWAKE overwrites it. The deferred sleep-confirm is never sent. A subsequent scan_start command is correctly acknowledged, but the firmware doesn't generate scan_result events. The scan request never finishes, and additional requests from userspace fail with -EBUSY. After testing on both IW412 and W8997, I could only trigger the bug on the IW412 and observed the firmwares behave differently. On the IW412 the firmware still sends EVENT_SLEEP while the authentication / association process is ongoing. A W8997 under the same conditions seems to suppress power-save for the duration of the association, so PRE_SLEEP never coincided with the association response even after extended periods of testing using the loops described below (>12hours). On the IW412, the delay between commands that triggers an EVENT_SLEEP was empirically determined to be ~20ms. This delay can naturally occur when the driver is outputting debugging information (debug_mask = 0x00000037), in which situation the busy scans issue is repeatable while running "test 1)" as described below.
If the delay between commands is less than ~20ms, the firmware stays awake and the issue was not reproducible running the same test. The host_mlme=false path also behaves differently. In this case, the entire authentication / association transaction is executed by one command (HostCmd_CMD_802_11_ASSOCIATE), and the firmware doesn't emit EVENT_SLEEP while the command is running. Remove the assignment so the ps_state is only manipulated in the paths that are related to powersave event handling and on the main workqueue for correct sleep confirmation. The following loop tests were performed (with debugging output enabled): 1) force roaming between two AP's, one 5GHz and one 2.4GHz, same SSID. Use wpa_cli to trigger the roaming behavior, sleep 2s between iterations. 2) force a disconnection to AP 1 and a connection to AP 2, test scan. Use wpa_cli to trigger the connection changes, sleep 2s between iterations. Each test ran in each device for at least 3 hours. Fixes: `5e6e3a92b9` ("wireless: mwifiex: initial commit for Marvell mwifiex driver") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Rafael Beims <rafael.beims@toradex.com> Reviewed-by: Jeff Chen <jeff.chen_1@nxp.com> Link: https://patch.msgid.link/20260612122547.1586872-2-rafael@beims.me Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 10:09:01 +02:00
Rafael Beims	a707e4127c	wifi: mwifiex: fix roaming to different channel in host_mlme mode When host MLME is enabled, mwifiex_cfg80211_authenticate() transmits the authentication frame on a remain-on-channel (ROC) reservation so that the frame is sent on the target BSS's channel. The ROC is only configured when priv->auth_flag is zero. priv->auth_flag is set to HOST_MLME_AUTH_PENDING when the auth frame is queued and advances to HOST_MLME_AUTH_DONE once authentication completes. It is only cleared back to zero on a disconnect, deauth or timeout path; nothing clears it when an association succeeds. It therefore stays at HOST_MLME_AUTH_DONE for the whole connected session. When the station later roams to a BSS on a different channel, the next authentication finds auth_flag != 0, skips the ROC setup, and the auth frame is transmitted on the currently-associated channel instead of the target's channel. Authentication times out on the new AP and the device stays connected to the original AP. Gate the ROC setup on HOST_MLME_AUTH_PENDING instead of on auth_flag being completely clear. This re-arms the remain-on-channel for every new authentication attempt, while still suppressing a redundant ROC during the multi-frame SAE exchange, where auth_flag stays PENDING between the commit and confirm frames. This change was tested in 3 different devices: Verdin AM62 (IW412 SD-UART) - (16.92.21.p142) Verdin iMX8MM (W8997 SD-SD) - (16.68.1.p197) Verdin iMX8MP (W8997 SD-UART) - (16.92.21.p137) The following loop tests were performed: 1) force roaming between two AP's, one 5GHz and one 2.4GHz, same SSID. Use wpa_cli to trigger the roaming behavior, sleep 2s between iterations. 2) force a disconnection to AP 1 and a connection to AP 2, test scan. Use wpa_cli to trigger the connection changes, sleep 2s between iterations. Each test ran in each device for at least 3 hours.
Fixes: `36995892c2` ("wifi: mwifiex: add host mlme for client mode") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Rafael Beims <rafael.beims@toradex.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20260610150021.1018611-1-rafael@beims.me Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-07-06 10:06:40 +02:00
Jens Emil Schulz Østergaard	d7a8d500d7	net: microchip: vcap: fix races on the shared Super VCAP block The VCAP instances on a chip are not independent, yet they are locked independently. On sparx5 and lan969x the IS0 and IS2 instances are backed by the same Super VCAP hardware block and share its cache and command registers: every access drives the shared VCAP_SUPER_CTRL register and moves data through the shared cache registers. Accessing one instance therefore races with accessing another. The per-instance admin->lock cannot prevent this, as each instance takes a different lock. The locking issue is mostly disguised by the fact that the core usage of the vcap api runs under rtnl. However, the full rule dump in debugfs decodes rules straight from hardware (a READ command followed by a cache read) and runs outside rtnl, so it races a concurrent tc-flower rule write to another Super VCAP instance. Besides corrupting the dump, the read repopulates the shared cache between the writers cache fill and its write command, so the writer commits the wrong data and corrupts the hardware entry. Introduce vcap_lock() and vcap_unlock() helpers and route every rule lock site in the VCAP API and its debugfs code through them. Replace the per-instance admin->lock with a single mutex in struct vcap_control that serializes access to all instances. The helpers reach it through a new admin->vctrl back-pointer, and the clients initialise and destroy the control lock instead of a per-instance one. No path holds more than one instance lock, so collapsing them onto a single mutex cannot self-deadlock. Fixes: `71c9de9952` ("net: microchip: sparx5: Add VCAP locking to protect rules") Signed-off-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Link: https://patch.msgid.link/20260630-microchip_fix_vcap_locking-v1-1-f60a4596734d@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-05 09:18:47 +02:00
Feng Liu	5a799714e8	net/mlx5e: Fix publication race for priv->channel_stats[] mlx5e_channel_stats_alloc() publishes a new entry to priv->channel_stats[] and then increments priv->stats_nch as a publication token, but neither store carries any memory barrier: priv->channel_stats[ix] = kvzalloc_node(...); if (!priv->channel_stats[ix]) return -ENOMEM; priv->stats_nch++; Concurrent readers compute the loop bound from priv->stats_nch and then dereference priv->channel_stats[i] using plain accesses, e.g. for (i = 0; i < priv->stats_nch; i++) { struct mlx5e_channel_stats *cs = priv->channel_stats[i]; ... cs->rq.packets ... } On weakly-ordered architectures (ARM, PowerPC, RISC-V) the writes to channel_stats[ix] and stats_nch may become visible to other CPUs out of program order. A reader can observe stats_nch == N while still seeing channel_stats[N-1] == NULL, leading to a NULL pointer dereference in the channel_stats loop. This has been observed in production on BlueField-3 DPUs (arm64), where ovs-vswitchd queries netdev statistics over netlink during NIC bringup, racing mlx5e_open_channel() -> mlx5e_channel_stats_alloc() on another CPU: Unable to handle kernel NULL pointer dereference at virtual address 0x840 Hardware name: BlueField-3 DPU pc : mlx5e_fold_sw_stats64+0x30/0x180 [mlx5_core] Call trace: mlx5e_fold_sw_stats64+0x30/0x180 [mlx5_core] dev_get_stats+0x50/0xc0 ovs_vport_get_stats+0x38/0xac [openvswitch] ovs_vport_cmd_fill_info+0x194/0x290 [openvswitch] ovs_vport_cmd_get+0xbc/0x10c [openvswitch] genl_family_rcv_msg_doit+0xd0/0x160 genl_rcv_msg+0xec/0x1f0 netlink_rcv_skb+0x64/0x130 genl_rcv+0x40/0x60 netlink_unicast+0x2fc/0x370 netlink_sendmsg+0x1dc/0x454 ... __arm64_sys_sendmsg+0x2c/0x40 Add mlx5e_stats_nch_write() and mlx5e_stats_nch_read() helpers in en.h that wrap the smp_store_release()/smp_load_acquire() pair on stats_nch. The release/acquire pair establishes the contract: stats_nch == N => channel_stats[0..N-1] are visible and non-NULL. 
Publish the stats_nch increment via mlx5e_stats_nch_write() in the writer (mlx5e_channel_stats_alloc()), and read stats_nch via mlx5e_stats_nch_read() in all readers: mlx5e RX/TX queue stats, mlx5e_get_base_stats(), ethtool channels stats, IPoIB stats, the sw_stats fold and the HV VHCA stats agent. Fixes: `fa691d0c9c` ("net/mlx5e: Allocate per-channel stats dynamically at first usage") Signed-off-by: Feng Liu <feliu@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260630115151.729219-4-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 18:50:31 +02:00
Feng Liu	89b25b5f46	net/mlx5e: Fix HV VHCA stats agent registration race mlx5e_hv_vhca_stats_create() registers the stats agent through mlx5_hv_vhca_agent_create(). The helper publishes the agent in hv_vhca->agents[type] under agents_lock and immediately schedules an asynchronous control invalidation on the HV VHCA workqueue before returning to mlx5e. The asynchronous invalidation invokes the control agent's invalidate callback, which reads the hypervisor control block and forwards the command to mlx5e_hv_vhca_stats_control(). That callback may either: - call cancel_delayed_work_sync(&priv->stats_agent.work), or - call queue_delayed_work(priv->wq, &sagent->work, sagent->delay). However, the delayed_work and priv->stats_agent.agent are only initialized after mlx5_hv_vhca_agent_create() returns to mlx5e: agent = mlx5_hv_vhca_agent_create(...); /* publish + invalidate */ ... priv->stats_agent.agent = agent; /* too late */ INIT_DELAYED_WORK(&priv->stats_agent.work, ...); /* too late */ If the asynchronous control path runs before the two assignments above, it can: - Operate on an uninitialized delayed_work whose timer.function is NULL. queue_delayed_work() calls add_timer() unconditionally, so when the timer expires the timer softirq invokes a NULL function pointer. - Re-initialize the timer later through INIT_DELAYED_WORK() while the timer is already enqueued in the timer wheel, corrupting the hlist (entry.pprev cleared while the previous bucket node still points at this entry). - When the worker eventually runs, mlx5e_hv_vhca_stats_work() reads sagent->agent (NULL) and dereferences it inside mlx5_hv_vhca_agent_write(). Fix this by: - Initializing priv->stats_agent.work before invoking mlx5_hv_vhca_agent_create(), so the work is always in a valid state when the control callback observes it. - Adding a struct mlx5_hv_vhca_agent **ctx_update out-parameter to mlx5_hv_vhca_agent_create().
The helper writes the agent pointer to ctx_update before publishing into hv_vhca->agents[] and triggering the agents_update flow, so any callback subsequently invoked from that flow already sees a valid priv->stats_agent.agent. This avoids having the control callback participate in agent initialization. While at it, access priv->stats_agent.agent with READ_ONCE()/WRITE_ONCE() for the cross-CPU access with the worker, and clear priv->stats_agent.buf on the agent_create() failure path. Fixes: `cef35af34d` ("net/mlx5e: Add mlx5e HV VHCA stats agent") Signed-off-by: Feng Liu <feliu@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260630115151.729219-3-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 18:50:31 +02:00
Feng Liu	25f6b929c7	net/mlx5e: Fix HV VHCA stats zero-sized buffer allocation mlx5e_hv_vhca_stats_create() is called from mlx5e_nic_enable(), before mlx5e_open(). At that point priv->stats_nch is still zero, because it is only ever incremented in mlx5e_channel_stats_alloc(), which is reached only from mlx5e_open_channel(). mlx5e_hv_vhca_stats_buf_size() therefore returns 0, and kvzalloc(0, GFP_KERNEL) returns ZERO_SIZE_PTR ((void *)16) rather than NULL. The "if (!buf)" guard does not catch this, and mlx5e_hv_vhca_stats_create() completes "successfully" with priv->stats_agent.buf set to ZERO_SIZE_PTR. Once channels are opened (priv->stats_nch > 0) and the hypervisor enables stats reporting, mlx5e_hv_vhca_stats_work() recomputes buf_len using the new non-zero stats_nch and calls memset(buf, 0, buf_len) on ZERO_SIZE_PTR, faulting at address 0x10. Allocate the buffer based on priv->max_nch, which is set in mlx5e_priv_init() and is the upper bound on stats_nch: - Add a separate helper mlx5e_hv_vhca_stats_buf_max_size() that returns sizeof(per_ring_stats) * max(max_nch, stats_nch), and use it for the kvzalloc() in mlx5e_hv_vhca_stats_create(). - Keep mlx5e_hv_vhca_stats_buf_size() (which returns based on stats_nch) for the worker's active payload size, so the wire format (block->rings = stats_nch) and the amount of data filled by mlx5e_hv_vhca_fill_stats() are unchanged. The max(max_nch, stats_nch) guard handles the rare case where mlx5e_attach_netdev() recomputes max_nch downward across a detach/resume cycle while priv->stats_nch persists (mlx5e_detach_netdev does not call mlx5e_priv_cleanup, so stats_nch is only reset when the netdev is destroyed). Without the guard, the worker could compute buf_len from stats_nch and overrun the smaller buffer allocated based on the reduced max_nch. Allocating a non-zero buffer also makes the kvzalloc() failure path in mlx5e_hv_vhca_stats_create() reachable for the first time: it returns early without (re)creating the agent.
Clear priv->stats_agent.{agent,buf} in mlx5e_hv_vhca_stats_destroy() after freeing them, so that if a later create() bails out on this path, a subsequent teardown does not double-free the stale agent/buffer left from a previous enable/disable cycle. This mirrors the existing mlx5e pattern of preallocating arrays of size max_nch (e.g. priv->channel_stats) and lazily populating entries up to stats_nch on demand. Fixes: `fa691d0c9c` ("net/mlx5e: Allocate per-channel stats dynamically at first usage") Signed-off-by: Feng Liu <feliu@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260630115151.729219-2-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 18:50:31 +02:00
Shay Drory	7bed4af0ce	net/mlx5e: TC, skip peer flow cleanup when LAG seq is unavailable mlx5_lag_get_dev_seq() returns an error when the peer isn't in the LAG or when no device is marked as master. The result is a bad memory access and a kernel crash [1]. Hence, skip the peer when lookup fails. Note: In case there are peer flows, they are cleaned before LAG cleared the master mark. [1] RIP: 0010:mlx5e_tc_del_fdb_peers_flow+0x3d/0x350 [mlx5_core] Call Trace: <TASK> mlx5e_tc_clean_fdb_peer_flows+0xc1/0x130 [mlx5_core] mlx5_esw_offloads_unpair+0x3a/0x400 [mlx5_core] mlx5_esw_offloads_devcom_event+0xee/0x360 [mlx5_core] mlx5_devcom_send_event+0x7a/0x140 [mlx5_core] mlx5_esw_offloads_devcom_cleanup+0x2f/0x90 [mlx5_core] mlx5e_tc_esw_cleanup+0x28/0xf0 [mlx5_core] mlx5e_rep_tc_cleanup+0x19/0x30 [mlx5_core] mlx5e_cleanup_uplink_rep_tx+0x36/0x40 [mlx5_core] mlx5e_cleanup_rep_tx+0x55/0x60 [mlx5_core] mlx5e_detach_netdev+0x96/0xf0 [mlx5_core] mlx5e_netdev_change_profile+0x5b/0x120 [mlx5_core] mlx5e_netdev_attach_nic_profile+0x1b/0x30 [mlx5_core] mlx5e_vport_rep_unload+0xdd/0x110 [mlx5_core] __esw_offloads_unload_rep+0x81/0xb0 [mlx5_core] mlx5_eswitch_unregister_vport_reps+0x1d7/0x220 [mlx5_core] mlx5e_rep_remove+0x22/0x30 [mlx5_core] device_release_driver_internal+0x194/0x1f0 bus_remove_device+0xe8/0x1b0 device_del+0x159/0x3c0 mlx5_rescan_drivers_locked+0xbc/0x2d0 [mlx5_core] mlx5_unregister_device+0x54/0x80 [mlx5_core] mlx5_uninit_one+0x73/0x130 [mlx5_core] remove_one+0x78/0xe0 [mlx5_core] pci_device_remove+0x39/0xa0 Fixes: `971b28accc` ("net/mlx5: LAG, replace mlx5_get_dev_index with LAG sequence number") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260630112917.698313-4-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 18:39:02 +02:00
Shay Drory	d4b85f9a66	net/mlx5: LAG, MPESW, Fix missing complete() on devcom error mlx5_mpesw_work() returned without calling complete() when mlx5_lag_get_devcom_comp() returned NULL. A caller that queued the work and waited on mpesww->comp would block indefinitely. Funnel the early-return path through a new "complete" label so the waiter is always woken. Fixes: `b430c1b4f6` ("net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260630112917.698313-3-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 18:39:02 +02:00
Shay Drory	0f0e4ae697	net/mlx5: LAG, Fix off-by-one in single-FDB error rollback On failure at index i, the reverse cleanup loop in mlx5_lag_create_single_fdb() starts from i, so the failed index itself is rolled back. That can operate on uninitialized state or double-tear-down a rule the add_one path already self-rolled-back. Start the rollback from i - 1 so only successfully-installed entries are undone. Fixes: `ddbb5ddc43` ("net/mlx5: LAG, Refactor lag logic") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260630112917.698313-2-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 18:39:02 +02:00
Linus Torvalds	d2c9a99135	Merge tag 'device-id-rework' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull mod_devicetable.h header split from Uwe Kleine-König: "Split <linux/mod_devicetable.h> in per subsystem headers <linux/mod_devicetable.h> is included transitively in nearly every driver in an x86_64 allmodconfig build of v7.1: $ find drivers -name \*.o -not -name \*.mod.o | wc -l 21330 $ find drivers -name \*.o.cmd -not -name \*.mod.o.cmd | xargs grep -l mod_devicetable.h | wc -l 17038 The result of this mixture of different and unrelated subsystem details is that even when touching an obscure device id struct most of the kernel needs to be recompiled. Given that each driver typically only needs one or two of these structures, splitting into per subsystem headers and only including what is really needed reduces the amount of needed recompilation. This split is implemented in the first commit and then after some preparatory work in the following commits, the last two replace includes of <linux/mod_devicetable.h> by the actually needed more specific headers. There are still a few instances left, but the ones with high impact (that is in headers that are used a lot) and the easy ones (.c files) are handled.
These remaining includes will be addressed during the next merge window" * tag 'device-id-rework' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: Replace <linux/mod_devicetable.h> by more specific <linux/device-id/*.h> (c files) Replace <linux/mod_devicetable.h> by more specific <linux/device-id/*.h> (headers) parisc: #include <linux/compiler.h> for unlikely() in <asm/ptrace.h> media: em28xx: Add include for struct usb_device_id LoongArch: KVM: Add include defining struct cpu_feature ALSA: hda/core: Add include defining struct hda_device_id usb: dwc2: Add include defining struct pci_device_id platform/x86: int3472: Add include defining struct dmi_system_id platform/x86: x86-android-tablets: Add include defining struct dmi_system_id i2c: Let i2c-core.h include <linux/i2c.h> of: Explicitly include <linux/types.h> and <linux/err.h> platform/x86: msi-ec: Ensure dmi_system_id is defined usb: serial: Include <linux/usb.h> in <linux/usb/serial.h> driver core: platform: Include header for struct platform_device_id driver: core: Include headers for acpi_device_id and of_device_id for struct device_driver media: ti: vpe: #include <linux/platform_device.h> explicitly mod_devicetable.h: Split into per subsystem headers	2026-07-02 20:54:26 -10:00
Dawei Feng	62e7df6d04	octeontx2-pf: fix SQB pointer leak on init failure otx2_init_hw_resources() initializes SQ aura and pool resources before several later setup steps. On failure, err_free_sq_ptrs only frees SQB pages, leaving the per-SQ sqb_ptrs arrays behind. Use otx2_free_sq_res() for the SQ unwind path and let it free sqb_ptrs even when sq->sqe has not been allocated yet. The bug was first flagged by an experimental analysis tool we are developing for kernel memory-management bugs while analyzing v6.13-rc1. The tool is still under development and is not yet publicly available. Manual inspection confirms that the bug is still present in v7.1.1. An x86_64 allyesconfig build showed no new warnings. As we do not have an OcteonTX2 PF device and the corresponding AF mailbox setup to test with, no runtime testing was able to be performed. Fixes: `caa2da34fd` ("octeontx2-pf: Initialize and config queues") Cc: stable@vger.kernel.org Reviewed-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn> Link: https://patch.msgid.link/20260630071625.349996-1-dawei.feng@seu.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 08:35:25 +02:00
Xiang Mei	03f384bc0c	net: usb: net1080: validate packet_len before pad-byte access in rx_fixup For an even packet_len, net1080_rx_fixup() reads the pad byte at skb->data[packet_len] before the skb->len != packet_len check further down, and packet_len is only bounded against NC_MAX_PACKET. A malicious NetChip 1080 device can send a short frame advertising a large even packet_len (e.g. 0x4000), so the pad-byte read lands past the end of the skb: BUG: KASAN: slab-out-of-bounds in net1080_rx_fixup Read of size 1 at addr ffff8880106c83c6 by task ksoftirqd/0/14 ... net1080_rx_fixup (drivers/net/usb/net1080.c:384) usbnet_bh (drivers/net/usb/usbnet.c:1589) process_one_work (kernel/workqueue.c:3322) bh_worker (kernel/workqueue.c:3708) tasklet_action (kernel/softirq.c:965) handle_softirqs (kernel/softirq.c:622) ... Reject the frame when packet_len >= skb->len before reading. Fixes: `904813cd8a` ("[PATCH] USB: usbnet (4/9) module for net1080 cables") Reported-by: Weiming Shi <bestswngs@gmail.com> Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260630045121.1565324-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-07-03 08:23:23 +02:00

140566 Commits