linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 11:21:26 -04:00

Author	SHA1	Message	Date
Eric Dumazet	c6bebaa744	ipv4: igmp: annotate data-races in igmp_heard_query() Multiple cpus can run igmp_heard_query() concurrently. Add missing READ_ONCE()/WRITE_ONCE() over following in_dev fields. - mr_qrv - mr_qi - mr_qri - mr_v1_seen - mr_v2_seen Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: syzbot+ae9a171f239b14485310@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/69f38675.050a0220.3cbe47.0002.GAE@google.com Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260430164836.872079-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-01 17:11:42 -07:00
Kai Zen	4b9e327991	net: rtnetlink: zero ifla_vf_broadcast to avoid stack infoleak in rtnl_fill_vfinfo rtnl_fill_vfinfo() declares struct ifla_vf_broadcast on the stack without initialisation: struct ifla_vf_broadcast vf_broadcast; The struct contains a single fixed 32-byte field: /* include/uapi/linux/if_link.h / struct ifla_vf_broadcast { __u8 broadcast[32]; }; The function then copies dev->broadcast into it using dev->addr_len as the length: memcpy(vf_broadcast.broadcast, dev->broadcast, dev->addr_len); On Ethernet devices (the overwhelming majority of SR-IOV NICs) dev->addr_len is 6, so only the first 6 bytes of broadcast[] are written. The remaining 26 bytes retain whatever was previously on the kernel stack. The full struct is then handed to userspace via: nla_put(skb, IFLA_VF_BROADCAST, sizeof(vf_broadcast), &vf_broadcast) leaking up to 26 bytes of uninitialised kernel stack per VF per RTM_GETLINK request, repeatable. The other vf_ structs in the same function are explicitly zeroed for exactly this reason - see the memset() calls for ivi, vf_vlan_info, node_guid and port_guid a few lines above. vf_broadcast was simply missed when it was added. Reachability: any unprivileged local process can open AF_NETLINK / NETLINK_ROUTE without capabilities and send RTM_GETLINK with an IFLA_EXT_MASK attribute carrying RTEXT_FILTER_VF. The kernel walks each VF and emits IFLA_VF_BROADCAST, leaking 26 bytes of stack per VF per request. Stack residue at this call site can include return addresses and transient sensitive data; KASAN with stack instrumentation, or KMSAN, will flag the nla_put() when reproduced. Zero the on-stack struct before the partial memcpy, matching the existing pattern used for the other vf_* structs in the same function. Fixes: `75345f888f` ("ipoib: show VF broadcast address") Cc: stable@vger.kernel.org Signed-off-by: Kai Zen <kai.aizen.dev@gmail.com> Link: https://patch.msgid.link/3c506e8f936e52b57620269b55c348af05d413a2.1777557228.git.kai.aizen.dev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-01 17:04:28 -07:00
Eric Dumazet	4f34002e2e	ipmr: prevent info-leak in pmr_cache_report() Yiming Qian reported: <quote> ipmr_cache_report()` allocates a report skb with `alloc_skb(128, GFP_ATOMIC)` and appends a `struct igmphdr` using `skb_put()`. In the non-`IGMPMSG_WHOLEPKT` path it initializes only: - `igmp->type` - `igmp->code` but does not initialize: - `igmp->csum` - `igmp->group` Later, `igmpmsg_netlink_event()` copies the bytes after `sizeof(struct igmpmsg)` into the `IPMRA_CREPORT_PKT` netlink attribute and emits `RTM_NEWCACHEREPORT` on `RTNLGRP_IPV4_MROUTE_R`. As a result, 6 bytes of stale heap data from the skb head are disclosed to userspace. </quote> Let's use skb_put_zero() instead of skb_put() to fix this bug. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: Yiming Qian <yimingqian591@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260430070611.4004529-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-01 17:01:40 -07:00
Aleksander Jan Bajkowski	f93836b236	net: usb: r8152: add TRENDnet TUC-ET2G v2.0 The TRENDnet TUC-ET2G V2.0 is an RTL8156B based 2.5G Ethernet controller. Add the vendor and product ID values to the driver. This makes Ethernet work with the adapter. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Birger Koblitz <mail@birger-koblitz.de> Link: https://patch.msgid.link/20260430213435.21821-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-01 16:50:49 -07:00
Jakub Kicinski	2ab02ac411	Merge tag 'nf-26-05-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains Netfilter fixes for net: 1) Replace skb_try_make_writable() by skb_ensure_writable() in nft_fwd_netdev and the flowtable to deal with uncloned packets having their network header in paged fragments. 2) Drop packet if output device does not exist and ensure sufficient headroom in nft_fwd_netdev before transmitting the skb. 3) Use the existing dup recursion counter in nft_fwd_netdev for the neigh_xmit variant, from Weiming Shi. 4) Add .check_hooks interface to x_tables to detach the control plane hook check based on the match/target configuration. Then, update nft_compat to use .check_hooks from .validate path, this fixes a lack of hook validation for several match/targets. 5) Fix incorrect .usersize in xt_CT, from Florian Westphal. 6) Fix a memleak with netdev tables in dormant state, from Florian Westphal. 7) Several patches to check if the packet is a fragment, then skip layer 4 inspection, for x_tables and nf_tables; as well as common nf_socket infrastructure. The xt_hashlimit match drops fragments to stay consistent with the existing approach when failing to parse the layer 4 protocol header. 8) Ensure sufficient headroom in the flowtable before transmitting the skb. 9) Fix the flowtable inline vlan approach for double-tagged vlan: Reverse the iteration over .encap[] since it represents the encapsulation as seen from the ingress path. Postpone pushing layer 2 header so output device is available to calculate needed headroom. Finally, add and use nf_flow_vlan_push() to fix it. 10) Fix flowtable inline pppoe with GSO packets. Moreover, use FLOW_OFFLOAD_XMIT_DIRECT to fill up destination hardware address since neighbour cache does not exist in pppoe. 11) Use skb_pull_rcsum() to decapsulate vlan and pppoe headers, for double-tagged vlan in particular this should provide some benefits in certain scenarios. More notes regarding 9-11): - sashiko is also signalling to use it for IPIP headers, but that needs more adjustments such setting skb->protocol after removing the IPIP header, will follow up in a separated patch. - I plan to submit selftests to cover double-tagged-vlan. As for pppoe, it should be possible but that would mandate a few userspace dependencies. This has been semi-automatically tested by me and reporters describing broken double-vlan-tagged and pppoe currently in the flowtable. * tag 'nf-26-05-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: flowtable: use skb_pull_rcsum() to pop vlan/pppoe header netfilter: flowtable: fix inline pppoe encapsulation in xmit path netfilter: flowtable: fix inline vlan encapsulation in xmit path netfilter: flowtable: ensure sufficient headroom in xmit path netfilter: xtables: fix L4 header parsing for non-first fragments netfilter: nf_tables: skip L4 header parsing for non-first fragments netfilter: nf_socket: skip socket lookup for non-first fragments netfilter: nf_tables: fix netdev hook allocation memleak with dormant tables netfilter: xt_CT: fix usersize for v1 and v2 revision netfilter: nft_compat: run xt_check_hooks_{match,target}() from .validate netfilter: x_tables: add .check_hooks to matches and targets netfilter: nft_fwd_netdev: use recursion counter in neigh egress path netfilter: nft_fwd_netdev: add device and headroom validate with neigh forwarding netfilter: replace skb_try_make_writable() by skb_ensure_writable() ==================== Link: https://patch.msgid.link/20260501122237.296262-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-01 16:45:42 -07:00
Pablo Neira Ayuso	baa3c65435	netfilter: flowtable: use skb_pull_rcsum() to pop vlan/pppoe header This adjusts the checksum, if required, after pulling the layer 2 header, either the pppoe header or the inner vlan header in the double-tagged vlan packets. Fixes: `4cd91f7c29` ("netfilter: flowtable: add vlan support") Fixes: `72efd585f7` ("netfilter: flowtable: add pppoe support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-05-01 12:39:23 +02:00
Jakub Kicinski	85da3965df	Merge branch 'octeontx2-af-npc-cn20k-mcam-fixes' Ratheesh Kannoth says: ==================== octeontx2-af: npc: cn20k: MCAM fixes This series tightens Marvell OcteonTX2 AF NPC support for CN20K silicon around MCAM key typing, optional debugfs setup, defrag allocation rollback, defrag entry relocation bookkeeping, logical MCAM clear and programming, default-rule index handling with explicit teardown, and NIXLF reserved-slot lookup when default rules are missing. Patches 1 through 3 focus on AF error handling: propagate npc_mcam_idx_2_key_type() failures through cn20k MCAM enable, config, copy, and read paths; treat cn20k NPC debugfs nodes as optional so probe does not fail when debugfs is unavailable; and fix defrag MCAM allocation rollback so allocation errno is not overwritten during subbank index resolution. Patch 4 fixes npc_defrag_move_vdx_to_free(): when an MCAM line is moved to a new physical index, move entry2target_pffunc[] association to the new slot, clear the old slot, and retarget the matching mcam_rules entry so software state matches hardware after defrag. Patches 5 through 7 refine cn20k MCAM programming: clear entries using the logical MCAM index and resolved key width, fix bank/CFG sequencing in npc_cn20k_config_mcam_entry(), and read action metadata from the correct bank in npc_cn20k_read_mcam_entry(). Patches 8 through 10 complete default-rule lifecycle handling: initialize default-rule index outputs eagerly, tear down reserved default MCAM rules explicitly (coordinated with npc_mcam_free_all_entries()), and reject USHRT_MAX sentinel indices from npc_get_nixlf_mcam_index() on cn20k. ==================== Link: https://patch.msgid.link/20260429022722.1110289-1-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:19 -07:00
Ratheesh Kannoth	bc968f61bf	octeontx2-af: npc: cn20k: Reject missing default-rule MCAM indices When cn20k default L2 rules are not installed, npc_cn20k_dft_rules_idx_get() leaves broadcast, multicast, promiscuous, and unicast slots at USHRT_MAX. npc_get_nixlf_mcam_index() previously returned that sentinel as a valid MCAM index, so callers could program hardware with an invalid index. Return -EINVAL from the cn20k branches of npc_get_nixlf_mcam_index() when the requested slot is still USHRT_MAX. Harden cn20k NPC MCAM entry helpers to reject out-of-range indices before touching hardware. Drop the early bounds check in npc_enable_mcam_entry() for cn20k so invalid indices are validated inside npc_cn20k_enable_mcam_entry() instead of being silently ignored. In rvu_npc_update_flowkey_alg_idx(), treat negative MCAM indices like out-of-range values, and only update RSS actions for promiscuous and all-multi paths when the resolved index is non-negative. Cc: Suman Ghosh <sumang@marvell.com> Fixes: `6d1e70282f` ("octeontx2-af: npc: cn20k: Use common APIs") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-11-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:17 -07:00
Ratheesh Kannoth	013717353c	octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free npc_cn20k_dft_rules_free() used the NPC MCAM mbox "free all" path, which does not match how cn20k tracks default-rule MCAM slots indexes. Resolve the default-rule indices, then for each valid slot clear the bitmap entry, drop the PF/VF map, disable the MCAM line, clear the target function, and npc_cn20k_idx_free(). Remove any matching software mcam_rules nodes. On hard failure from idx_free, WARN and stop so the box stays up for analysis. In npc_mcam_free_all_entries(), prefetch the same default-rule indices and, on cn20k, skip bitmap clear and idx_free when the scanned entry is one of those reserved defaults (they are released by npc_cn20k_dft_rules_free). Fixes: `09d3b7a140` ("octeontx2-af: npc: cn20k: Allocate default MCAM indexes") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-10-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:17 -07:00
Ratheesh Kannoth	afb474bd4f	octeontx2-af: npc: cn20k: Initialize default-rule index outputs up front npc_cn20k_dft_rules_idx_get() wrote USHRT_MAX into individual outputs only on some error paths (lbk promisc lookup, VF ucast lookup, and the PF rule walk), which could leave other caller slots stale across retries. Set every non-NULL bcast/mcast/promisc/ucast pointer to USHRT_MAX once at entry, then drop the duplicate assignments on failure. Successful lookups still overwrite the relevant slot before returning. Fixes: `09d3b7a140` ("octeontx2-af: npc: cn20k: Allocate default MCAM indexes") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-9-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:17 -07:00
Ratheesh Kannoth	f6803eb070	octeontx2-af: npc: cn20k: Fix MCAM actions read npc_cn20k_read_mcam_entry() always reloaded action and vtag_action from bank 0 after programming the CAM words. Use the bank returned by npc_get_bank() for the ACTION reads as well, and read those registers once up front so both X2 and X4 paths share the same metadata. Return directly from the X2 keyword path now that the action fields are already populated. Cc: Suman Ghosh <sumang@marvell.com> Fixes: `6d1e70282f` ("octeontx2-af: npc: cn20k: Use common APIs") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-8-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:16 -07:00
Ratheesh Kannoth	2b6d6bb728	octeontx2-af: npc: cn20k: Fix bank value For X4 keys its loop reused the bank parameter as the loop counter, so bank no longer reflected the caller's bank after the loop and the control flow was hard to follow. Program NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT directly in npc_cn20k_config_mcam_entry(): one CFG write for X2 using the computed bank, and one CFG write per bank inside the X4 action loop. Enable the entry at the end with npc_cn20k_enable_mcam_entry(..., true) instead of embedding the enable bit in bank_cfg via the removed helper. Cc: Suman Ghosh <sumang@marvell.com> Fixes: `4e527f1e5c` ("octeontx2-af: npc: cn20k: Add new mailboxes for CN20K silicon") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-7-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:16 -07:00
Ratheesh Kannoth	d2dabf0963	octeontx2-af: npc: cn20k: Clear MCAM entries by index and key width Replace the old four-argument CN20K MCAM clear with a per-bank static helper and npc_cn20k_clear_mcam_entry() that takes a logical MCAM index, resolves the key width via npc_mcam_idx_2_key_type(), and clears either one bank (X2) or every bank (X4). Call it from npc_clear_mcam_entry() on cn20k and log when key-type lookup fails. Use the per-bank helper from npc_cn20k_config_mcam_entry() for pre-program clears. For loopback VFs, use the promisc MCAM index as ucast_idx when copying RSS action for promisc, matching cn20k default-rule layout. Cc: Suman Ghosh <sumang@marvell.com> Fixes: `6d1e70282f` ("octeontx2-af: npc: cn20k: Use common APIs") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-6-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:16 -07:00
Ratheesh Kannoth	d7e5940c4c	octeontx2-af: npc: cn20k: Fix target map and rule npc_defrag_move_vdx_to_free() disables, copies, and enables the MCAM entry at a new index but previously left entry2target_pffunc[] and the mcam_rules list still keyed to the old index. Copy the target PF association to the new slot, clear the old one, and retarget the rule entry so software state matches the relocated hardware context. Fixes: `645c6e3c19` ("octeontx2-af: npc: cn20k: virtual index support") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-5-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:16 -07:00
Ratheesh Kannoth	adb5ff41ef	octeontx2-af: npc: cn20k: Propagate errors in defrag MCAM alloc rollback npc_defrag_alloc_free_slots() allocates MCAM indexes in up to two passes on bank0 then bank1. On failure it rolls back by freeing entries already placed in save[]. __npc_subbank_alloc() can return a negative errno while only part of the indexes are valid. The rollback loop used rc for npc_mcam_idx_2_subbank_idx() as well, so a successful lookup stored zero in rc and a later __npc_subbank_free() failure could still end with return 0 when the allocation path had also left rc at zero (for example shortfall after zero return values from the alloc helpers). Jump to the rollback path immediately when either __npc_subbank_alloc() call fails, preserving its errno. If both calls succeed but the total allocated count is still less than cnt, set rc to -ENOSPC before rollback. Use a separate err variable for npc_mcam_idx_2_subbank_idx() so a successful lookup no longer clears a non-zero rc from the allocation phase. Cc: Dan Carpenter <error27@gmail.com> Fixes: `645c6e3c19` ("octeontx2-af: npc: cn20k: virtual index support") Link: https://lore.kernel.org/netdev/adjNJEpILRZATB2N@stanley.mountain/ Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-4-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:16 -07:00
Ratheesh Kannoth	1100af13fd	octeontx2-af: npc: cn20k: Drop debugfs_create_file() error checks in init debugfs is not intended to be checked for allocation failures the way other kernel APIs are: callers should not fail probe or subsystem init because a debugfs node could not be created, including when debugfs is disabled in Kconfig. Replacing NULL checks with IS_ERR() checks is similarly wrong for optional debugfs. Remove dentry checks and -EFAULT returns from npc_cn20k_debugfs_init(). See: https://staticthinking.wordpress.com/2023/07/24/ debugfs-functions-are-not-supposed-to-be-checked/ Cc: Dan Carpenter <error27@gmail.com> Fixes: `528530dff5` ("octeontx2-af: npc: cn20k: add debugfs support") Link: https://lore.kernel.org/netdev/adjNGPWKMOk3KgWL@stanley.mountain/ Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-3-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:15 -07:00
Ratheesh Kannoth	aaadccde31	octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k npc_mcam_idx_2_key_type() can fail; callers used to ignore it and still used kw_type when enabling, configuring, copying, and reading MCAM entries. That could program or decode hardware with an undefined key type. Return -EINVAL when key-type lookup fails. Return -EINVAL from npc_cn20k_copy_mcam_entry() when src and dest key types differ instead of failing silently. Change npc_cn20k_{enable,config,copy,read}_mcam_entry() to return int on success or error. Thread those errors through the cn20k MCAM write and read mbox handlers, the cn20k baseline steer read path, NPC defrag move (disable/copy/enable with dev_err and -EFAULT), and the DMAC update path in rvu_npc_fs.c. Make npc_copy_mcam_entry() return int so the cn20k branch can return npc_cn20k_copy_mcam_entry() without a void/int mismatch, and fail NPC_MCAM_SHIFT_ENTRY when copy fails. Cc: Suman Ghosh <sumang@marvell.com> Cc: Dan Carpenter <error27@gmail.com> Fixes: `6d1e70282f` ("octeontx2-af: npc: cn20k: Use common APIs") Link: https://lore.kernel.org/netdev/adiQJvuKlEhq2ILx@stanley.mountain/ Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260429022722.1110289-2-rkannoth@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:50:15 -07:00
Lorenzo Bianconi	75df490c9e	net: airoha: Move entries to queue head in case of DMA mapping failure in airoha_dev_xmit() In order to respect the original descriptor order and avoid any potential IOMMU fault or memory corruption, move pending queue entries to the head of hw queue tx_list if the DMA mapping of current inflight packet fails in airoha_dev_xmit routine. Fixes: `3f47e67dff` ("net: airoha: Add the capability to consume out-of-order DMA tx descriptors") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260429-airoha-xmit-unmap-error-path-v2-1-32e43b7c6d25@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:08:48 -07:00
Jiawen Wu	7a33345153	net: libwx: use request_irq for VF misc interrupt Currently, request_threaded_irq() is used with a primary handler but a NULL threaded handler, while also setting the IRQF_ONESHOT flag. This specific combination triggers a WARNING since the commit `aef30c8d56` ("genirq: Warn about using IRQF_ONESHOT without a threaded handler"). WARNING: kernel/irq/manage.c:1502 at __setup_irq+0x4fa/0x760 Fix the issue by switching to request_irq(), which is the appropriate interface or a non-threaded interrupt handler, and removing the unnecessary IRQF_ONESHOT flag. Fixes: `eb4898fde1` ("net: libwx: add wangxun vf common api") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/786DDC7D5CCA6D0A+20260429083743.88961-2-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:07:21 -07:00
Jiawen Wu	694de316f6	net: libwx: fix VF illegal register access Register WX_CFG_PORT_ST is a PF restricted register. When a VF is initialized, attempting to read this register triggers an illegal register access, which lead to a system hang. When the device is VF, the bus function ID can be obtained directly from the PCI_FUNC(pdev->devfn). Fixes: `a04ea57aae` ("net: libwx: fix device bus LAN ID") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/4D1F4452D21DE107+20260429083743.88961-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 18:07:21 -07:00
Wei Fang	26ebd12e67	net: enetc: fix VSI mailbox timeout handling and DMA lifecycle In the current VSI mailbox implementation, the VSI allocates a DMA buffer to store the message sent to the PSI. When the PSI receives the message request from the VSI, the hardware copies the message data from this DMA buffer to PSI's DMA buffer for processing. When enetc_msg_vsi_send() times out, two scenarios can occur: 1) Use-after-free: If the hardware hasn't completed message copying when the VSI frees the buffer, the hardware may subsequently copy the data from freed memory to PSI's DMA buffer. 2) Message race: If PSI hasn't processed the previous message when the next message is sent, the VSI may receive the previous message's reply, leading to incorrect handling. To address these issues, implement the following changes: - Check the mailbox busy status before sending a new message. If the mailbox is in busy state, it indicates the previous message is still being processed, so return an error immediately. - Add the 'msg' field to struct enetc_si to preserve the DMA buffer information. The caller of enetc_msg_vsi_send() no longer frees the DMA buffer. Instead, defer freeing until it is safe to do so (when mailbox is not busy on next send). - Add cleanup in enetc_vf_remove() to free the last message buffer. This ensures the DMA buffer remains valid during message copying and prevents message reply mismatches. Fixes: `beb74ac878` ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260429081930.3259824-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 17:35:56 -07:00
Daniel Borkmann	3744b0964d	ipv6: Implement limits on extension header parsing ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim, protocol_deliver_rcu}() iterate over IPv6 extension headers until they find a non-extension-header protocol or run out of packet data. The loops have no iteration counter, relying solely on the packet length to bound them. For a crafted packet with 8-byte extension headers filling a 64KB jumbogram, this means a worst case of up to ~8k iterations with a skb_header_pointer call each. ipv6_skip_exthdr(), for example, is used where it parses the inner quoted packet inside an incoming ICMPv6 error: - icmpv6_rcv - checksum validation - case ICMPV6_DEST_UNREACH - icmpv6_notify - pskb_may_pull() <- pull inner IPv6 header - ipv6_skip_exthdr() <- iterates here - pskb_may_pull() - ipprot->err_handler() <- sk lookup The per-iteration cost of ipv6_skip_exthdr itself is generally light, but skb_header_pointer becomes more costly on reassembled packets: the first ~1232 bytes of the inner packet are in the skb's linear area, but the remaining ~63KB are in the frag_list where skb_copy_bits is needed to read data. Initially, the idea was to add a configurable limit via a new sysctl knob with default 8, in line with knobs from commit `47d3d7ac65` ("ipv6: Implement limits on Hop-by-Hop and Destination options"), but two reasons eventually argued against it: - It adds to UAPI that needs to be maintained forever, and upcoming work is restricting extension header ordering anyway, leaving little reason for another sysctl knob - exthdrs_core.c is always built-in even when CONFIG_IPV6=n, where struct net has no .ipv6 member, so the read site would need an ifdef'd fallback to a constant anyway Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four extension header walking functions are now bound by this limit. Note that the check in ip6_protocol_deliver_rcu() happens right before the goto resubmit, such that we don't have to have a test for ipv6_ext_hdr() in the fast-path. There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce IPv6 extension headers ordering and occurrence. The latter also discusses security implications. As per RFC8200 section 4.1, the occurrence rules for extension headers provide a practical upper bound which is 8. In order to be conservative, let's define IP6_MAX_EXT_HDRS_CNT as 12 to leave enough room for quirky setups. In the unlikely event that this is still not enough, then we might need to reconsider a sysctl. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260429154648.809751-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 17:21:45 -07:00
Robert Marko	e027c218c4	net: phy: micrel: fix LAN8814 QSGMII soft reset LAN8814 QSGMII soft reset was moved into the probe function to avoid triggering it for each of 4 PHY-s in the package. However, that broke QSGMII link between the MAC and PHY on most LAN8814 PHY-s, specificaly for us on the Microchip LAN969x switch. Reading the QSGMII status registers it was visible that lanes were only partially synced. It looks like the reset timing is crucial, so lets move the reset back into the .config_init function but guard it with phy_package_init_once() to avoid it being triggered on each of 4 PHY-s in the package. Change the probe function to use phy_package_probe_once() for coma and PtP setup. Fixes: `96a9178a29` ("net: phy: micrel: lan8814 fix reset of the QSGMII interface") Signed-off-by: Robert Marko <robert.marko@sartura.hr> Link: https://patch.msgid.link/20260428134138.1741253-1-robert.marko@sartura.hr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 16:49:23 -07:00
Pablo Neira Ayuso	69c54f80f4	netfilter: flowtable: fix inline pppoe encapsulation in xmit path Address two issues in the inline pppoe encapsulation: - Add needs_gso_segment flag to segment PPPoE packets in software given that there is no GSO support for this. - Use FLOW_OFFLOAD_XMIT_DIRECT since neighbour cache is not available in point-to-point device, use the hardware address that is obtained via flowtable path discovery (ie. fill_forward_path). Fixes: `18d27bed08` ("netfilter: flowtable: inline pppoe encapsulation in xmit path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-05-01 01:24:01 +02:00
Pablo Neira Ayuso	a177ae30f7	netfilter: flowtable: fix inline vlan encapsulation in xmit path Several issues in the inline vlan support: - The layer 2 encapsulation representation in the tuple takes encap[0] as the outer header and encap[1] as the inner header as seen from the ingress path. Reverse the encap loop to push first the inner then the outer vlan header. - Postpone pushing the layer 2 header once destination device is known. This allows to calculate the needed hearoom via LL_RESERVED_SPACE to accommodate the layer 2 headers. - Add and use nf_flow_vlan_push() as suggested by Eric Woudstra, this is a simplified version of skb_vlan_push() for egress path only. Fixes: `c653d5a78f` ("netfilter: flowtable: inline vlan encapsulation in xmit path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-05-01 01:23:47 +02:00
Jakub Kicinski	a54c9a13cf	Merge branch 'net-mctp-test-minor-kunit-test-fixes' Jeremy Kerr says: ==================== net: mctp: test: minor kunit test fixes This series provides two fixes in the MCTP kunit tests - one exposed by ktr, and one found while debugging the former on different VM configs. ==================== Link: https://patch.msgid.link/20260429-dev-mctp-test-fixes-v1-0-1127b7425809@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 13:36:49 -07:00
Jeremy Kerr	7687297106	net: mctp: test: Use dev_direct_xmit for TX to our test device In our test cases, we typically feed a packet sequence into the routing code, then inspect the device's TXed skbs to assert specific behaviours. Using dev_queue_xmit() for our TX path introduces a fair bit of complexity between the test packet sequence and the test device's ndo_start_xmit callback; which may mean that the skbs have not hit the device at the point we're inspecting the TXed skb list. Use dev_direct_xmit instead, as we want a direct a path as possible here, and the test dev does not need any queueing, scheduling or flow control. Fixes: `6ab578739a` ("net: mctp: test: move TX packetqueue from dst to dev") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202604281320.525eee17-lkp@intel.com Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260429-dev-mctp-test-fixes-v1-2-1127b7425809@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 13:36:47 -07:00
Jeremy Kerr	18ed60e33e	net: mctp: test: use a zeroed struct sockaddr_mctp Invalid sockaddr padding will cause bind() to fail; ensure we have a zeroed address in the testcase. Fixes: `0d8647bc74` ("net: mctp: don't require a route for null-EID ingress") Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260429-dev-mctp-test-fixes-v1-1-1127b7425809@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-30 13:36:47 -07:00
Pablo Neira Ayuso	ef4f741e86	netfilter: flowtable: ensure sufficient headroom in xmit path Check for headroom and call skb_expand_head() like in the IP output path to ensure there is sufficient headroom for the mac header when forwarding this packet as suggested by sashiko. Fixes: `b5964aac51` ("netfilter: flowtable: consolidate xmit path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-30 17:59:01 +02:00
Fernando Fernandez Mancera	952e121c96	netfilter: xtables: fix L4 header parsing for non-first fragments Multiple targets and matches relies on L4 header to operate. For fragmented packets, every fragment carries the transport protocol identifier, but only the first fragment contains the L4 header. As the 'raw' table can be configured to run at priority -450 (before defragmentation at -400), the target/match can be reached before reassembly. In this case, non-first fragments have their payload incorrectly parsed as a TCP/UDP header. This would be of course a misconfiguration scenario. In most of the cases this just lead to a unreliable behavior for fragmented traffic. Add a fragment check to ensure target/match only evaluates unfragmented packets or the first fragment in the stream. Fixes: `902d6a4c2a` ("netfilter: nf_defrag: Skip defrag if NOTRACK is set") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-30 17:59:01 +02:00
Fernando Fernandez Mancera	009d203e56	netfilter: nf_tables: skip L4 header parsing for non-first fragments The tproxy, osf and exthdr (SCTP) expressions rely on the presence of transport layer headers to perform socket lookups, fingerprint matching, or chunk extraction. For fragmented packets, while the IP protocol remains constant across all fragments, only the first fragment contains the actual L4 header. The expressions could be attached to a chain with a priority lower than -400, bypassing defragmentation. Or could be used in stateless environments where defragmentation is not happening at all. This could result in garbage data being used for the matching. Add a check for pkt->fragoff so only unfragmented packets or the first fragment is processed. Fixes: `133dc203d7` ("netfilter: nft_exthdr: Support SCTP chunks") Fixes: `4ed8eb6570` ("netfilter: nf_tables: Add native tproxy support") Fixes: `b96af92d6e` ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-30 17:59:01 +02:00
Fernando Fernandez Mancera	0bf00859d7	netfilter: nf_socket: skip socket lookup for non-first fragments Both nft_socket and xt_socket relies on L4 headers to perform socket lookup in the slow path. For fragmented packets, while the IP protocol remains constant across all fragments, only the first fragment contains the actual L4 header. As the expression/match could be attached to a chain with a priority lower than -400, it could bypass defragmentation. Add a check for fragmentation in the lookup functions directly so the problem is handled for both nft_socket and xt_socket at the same time. In addition, future users of the functions would not need to care about this. Fixes: `902d6a4c2a` ("netfilter: nf_defrag: Skip defrag if NOTRACK is set") Fixes: `554ced0a6e` ("netfilter: nf_tables: add support for native socket matching") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-30 17:59:01 +02:00
Linus Torvalds	08d0d34666	Merge tag 'net-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter. Current release - regressions: - ipmr: free mr_table after RCU grace period. Previous releases - regressions: - core: add net_iov_init() and use it to initialize ->page_type - sched: taprio: fix NULL pointer dereference in class dump - netfilter: nf_tables: - use list_del_rcu for netlink hooks - fix strict mode inbound policy matching - tcp: make probe0 timer handle expired user timeout - vrf: fix a potential NPD when removing a port from a VRF - eth: ice: - fix NULL pointer dereference in ice_reset_all_vfs() - fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw Previous releases - always broken: - page_pool: fix memory-provider leak in error path - sched: sch_cake: annotate data-races in cake_dump_stats() - mptcp: fix scheduling with atomic in timestamp sockopt - psp: check for device unregister when creating assoc - tls: fix strparser anchor skb leak on offload RX setup failure - eth: - stmmac: prevent NULL deref when RX memory exhausted - airoha: do not read uninitialized fragment address - rtl8150: fix use-after-free in rtl8150_start_xmit() Misc: - add Ido Schimmel as IPv4/IPv6 maintainer - add David Heidelberg as NFC subsystem maintainer" * tag 'net-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) net/sched: cls_flower: revert unintended changes sfc: fix error code in efx_devlink_info_running_versions() net: tls: fix strparser anchor skb leak on offload RX setup failure ice: add dpll peer notification for paired SMA and U.FL pins ice: fix missing dpll notifications for SW pins dpll: export __dpll_pin_change_ntf() for use under dpll_lock ice: fix SMA and U.FL pin state changes affecting paired pin ice: fix missing SMA pin initialization in DPLL subsystem ice: fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw ice: fix NULL pointer dereference in ice_reset_all_vfs() iavf: add VIRTCHNL_OP_ADD_VLAN to success completion handler iavf: wait for PF confirmation before removing VLAN filters iavf: stop removing VLAN filters from PF on interface down iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDING page_pool: fix memory-provider leak in page_pool_create_percpu() error path bonding: 3ad: implement proper RCU rules for port->aggregator net: airoha: Do not return err in ndo_stop() callback hv_sock: fix ARM64 support MAINTAINERS: update the IPv4/IPv6 entry and add Ido Schimmel selftests: drv-net: clarify linters and frameworks in README ...	2026-04-30 08:45:43 -07:00
Linus Torvalds	6cd70263a6	Merge tag 'ata-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: - Fix a reference leak on device_register() failure in pata_parport * tag 'ata-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: pata_parport: switch to dynamic root device	2026-04-30 08:35:36 -07:00
Linus Torvalds	2aa0a36917	Merge tag 'sound-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A bunch of small fixes. One minor fix is found in the core side for data race in PCM OSS layer, while remaining changes are various device-specific fixes and quirks. - Core: PCM OSS data race fix - HD-audio: Fixes for TAS2781, CS35L56, and Realtek/Conexant quirks; avoidance of a WARN_ON for HDMI channel mapping - USB-audio: Improvements in UAC3 parsing robustness (leaks, size checks) and fixes for potential endless loops - ASoC: Driver-specific fixes for CS35L56, Intel bytcr_wm5102, Spacemit, AW88395, and others, plus a new quirk for Steam Deck OLED - Misc: A UAF fix in aloop driver, division by zero fix in ua101 driver and leak fixes in caiaq driver" * tag 'sound-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits) ALSA: hda/tas2781: Fix incorrect bit update for non-book-zero or book 0 pages >1 ALSA: hda: cs35l56: Fix uninitialized value in cs35l56_hda_read_acpi() ALSA: hda/conexant: Fix missing error check for jack detection ALSA: hda: Avoid WARN_ON() for HDMI chmap slot checks ALSA: usb-audio: Fix quirk entry placement for PreSonus AudioBox USB ASoC: spacemit: adjust FIFO trigger threshold to half FIFO size ASoC: spacemit: move hw constraints from hw_params to startup ASoC: codecs: ab8500: Fix casting of private data ASoC: cs35l56: Fix illegal writes to OTP_MEM registers ASoC: Intel: bytcr_wm5102: Fix MCLK leak on platform_clock_control error ALSA: usb-audio: Avoid potential endless loop in convert_chmap_v3() ALSA: usb-audio: Fix potential leak of pd at parsing UAC3 streams ALSA: caiaq: Don't abort when no input device is available ALSA: caiaq: Fix potentially leftover ep1_in_urb at error path ASoC: aw88395: Fix kernel panic caused by invalid GPIO error pointer ALSA: caiaq: fix usb_dev refcount leak on probe failure sound: ua101: fix division by zero at probe ALSA: usb-audio: apply quirk for Playstation PDP Riffmaster ALSA: hda: Remove duplicate cmedia entries in codecs Makefile ALSA: hda/realtek: Add micmute LED quirk for Acer Aspire A315-44P ...	2026-04-30 08:29:56 -07:00
Paolo Abeni	1e01abec85	net/sched: cls_flower: revert unintended changes While applying the blamed commit `4ca07b9239` ("net: mctp i2c: check length before marking flow active"), I unintentionally included unrelated and unacceptable changes. Revert them. Fixes: `4ca07b9239` ("net: mctp i2c: check length before marking flow active") Reported-by: Jeremy Kerr <jk@codeconstruct.com.au> Closes: https://lore.kernel.org/netdev/bd8704fe0bd53e278add5cde4873256656623e2e.camel@codeconstruct.com.au/ Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/043026a53ff84da88b17648c4b0d17f0331749cb.1777447863.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 13:47:01 +02:00
Dan Carpenter	051ffb001b	sfc: fix error code in efx_devlink_info_running_versions() Return -EIO if efx_mcdi_rpc() doesn't return enough space. Fixes: `14743ddd24` ("sfc: add devlink info support for ef100") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/afGpsbLRHL4_H0KS@stanley.mountain Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 13:44:30 +02:00
Jakub Kicinski	58689498ca	net: tls: fix strparser anchor skb leak on offload RX setup failure When tls_set_device_offload_rx() fails at tls_dev_add(), the error path calls tls_sw_free_resources_rx() to clean up the SW context that was initialized by tls_set_sw_offload(). This function calls tls_sw_release_resources_rx() (which stops the strparser via tls_strp_stop()) and tls_sw_free_ctx_rx() (which kfrees the context), but never frees the anchor skb that was allocated by alloc_skb(0) in tls_strp_init(). Note that tls_sw_free_resources_rx() is exclusively used for this "failed to start offload" code path, there's no other caller. The leak did not exist before commit `84c61fe1a7` ("tls: rx: do not use the standard strparser"), because the standard strparser doesn't try to pre-allocate an skb. The normal close path in tls_sk_proto_close() handles cleanup by calling tls_sw_strparser_done() (which calls tls_strp_done()) after dropping the socket lock, because tls_strp_done() does cancel_work_sync() and the strparser work handler takes the socket lock. Fixes: `84c61fe1a7` ("tls: rx: do not use the standard strparser") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20260428231559.1358502-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 13:38:29 +02:00
Paolo Abeni	47888597a3	Merge branch 'intel-wired-lan-update-2026-04-27-ice-iavf' Jacob Keller says: ==================== Intel Wired LAN Update 2026-04-27 (ice, iavf) Petr Oros from RedHat has accumulated a number of fixes for the Intel ice and iavf drivers, bundled together in this series. First, a series of 4 fixes to resolve issues with the iavf driver logic for handling VLAN filters. This includes keeping VLAN filters while the interface is brought down, waiting for confirmation on filter deletion before deleting filters from the driver tracking structures, and handling the VIRTCHNL_OP_ADD_VLAN for the old v1 VLAN_ADD command. A fix for a crash in ice_reset_all_vfs(), properly checking for errors when ice_vf_rebuild_vsi() fails. A fix for a possible infinite recursion in ice_cfg_tx_topo() that occurs when trying to apply invalid Tx topology configuration. A fix to initialize the SMA pins in the DPLL subsystem properly. A fix to change the SMA and U.FL pin state for paired pins, ensuring that all flows changing one pin will also update its shared pin appropriately. A preparatory patch to export __dpll_pin_change_ntf() so that drivers can notify pin changes while already holding the dpll_lock. A fix to ensure DPLL notifications are sent for the software-controlled pins which wrap the physical CGU input/output pins. A fix to add DPLL notifications for peer pins when changing the SMA or U.FL pins, ensuring DPLL subsystem is notified about the paired connected pins. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> ==================== Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-0-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:43 +02:00
Petr Oros	9e5dead140	ice: add dpll peer notification for paired SMA and U.FL pins SMA and U.FL pins share physical signal paths in pairs (SMA1/U.FL1 and SMA2/U.FL2). When one pin's state changes via a PCA9575 GPIO write, the paired pin's state also changes, but no notification is sent for the peer pin. Userspace consumers monitoring the peer via dpll netlink subscribe never learn about the update. Add ice_dpll_sw_pin_notify_peer() which sends a change notification for the paired SW pin. Call it from ice_dpll_pin_sma_direction_set(), ice_dpll_sma_pin_state_set(), and ice_dpll_ufl_pin_state_set() after pf->dplls.lock is released. Use __dpll_pin_change_ntf() because dpll_lock is still held by the dpll netlink layer (dpll_pin_pre_doit). Fixes: `2dd5d03c77` ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-11-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:39 +02:00
Petr Oros	1a41b58fd4	ice: fix missing dpll notifications for SW pins The SMA/U.FL pin redesign (commit `2dd5d03c77` ("ice: redesign dpll sma/u.fl pins control")) introduced software-controlled pins that wrap backing CGU input/output pins, but never updated the notification and data paths to propagate pin events to these SW wrappers. The periodic work sends dpll_pin_change_ntf() only for direct CGU input pins. SW pins that wrap these inputs never receive change or phase offset notifications, so userspace consumers such as synce4l monitoring SMA pins via dpll netlink never learn about state transitions or phase offset updates. Similarly, ice_dpll_phase_offset_get() reads the SW pin's own phase_offset field which is never updated; the PPS monitor writes to the backing CGU input's field instead. Fix by introducing ice_dpll_pin_ntf(), a wrapper around dpll_pin_change_ntf() that also notifies any registered SMA/U.FL pin whose backing CGU input matches. Replace all direct dpll_pin_change_ntf() calls in the periodic notification paths with this wrapper. Fix ice_dpll_phase_offset_get() to return the backing CGU input's phase_offset for input-direction SW pins. Fixes: `2dd5d03c77` ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-10-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:39 +02:00
Ivan Vecera	620055cb10	dpll: export __dpll_pin_change_ntf() for use under dpll_lock Export __dpll_pin_change_ntf() so that drivers can send pin change notifications from within pin callbacks, which are already called under dpll_lock. Using dpll_pin_change_ntf() in that context would deadlock. Add lockdep_assert_held() to catch misuse without the lock held. Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-9-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:39 +02:00
Petr Oros	6f9d8393c9	ice: fix SMA and U.FL pin state changes affecting paired pin SMA and U.FL pins share physical signal paths in pairs (SMA1/U.FL1 and SMA2/U.FL2) controlled by the PCA9575 GPIO expander. Each pair can only have one active pin at a time: SMA1 output and U.FL1 output share the same CGU output, SMA2 input and U.FL2 input share the same CGU input. The PCA9575 register bits determine which connector in each pair owns the signal path. The driver does not account for this pairing in two places: ice_dpll_ufl_pin_state_set() modifies PCA9575 bits and disables the backing CGU pin without checking whether the U.FL pin is currently active. Disconnecting an already inactive U.FL pin flips bits that the paired SMA pin relies on, breaking its connection. ice_dpll_sma_direction_set() does not propagate direction changes to the paired U.FL pin. For SMA2/U.FL2 the ICE_SMA2_UFL2_RX_DIS bit is never managed, so U.FL2 stays disconnected after SMA2 switches to output. For both pairs the backing CGU pin of the U.FL side is never enabled when a direction change activates it, so userspace sees the pin as disconnected even though the routing is correct. Fix by guarding the U.FL disconnect path against inactive pins and by updating the paired U.FL pin fully on SMA direction changes: manage ICE_SMA2_UFL2_RX_DIS for the SMA2/U.FL2 pair and enable the backing CGU pin whenever the peer becomes active. Fixes: `2dd5d03c77` ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-8-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:39 +02:00
Petr Oros	56a643aed0	ice: fix missing SMA pin initialization in DPLL subsystem The DPLL SMA/U.FL pin redesign introduced ice_dpll_sw_pin_frequency_get() which gates frequency reporting on the pin's active flag. This flag is determined by ice_dpll_sw_pins_update() from the PCA9575 GPIO expander state. Before the redesign, SMA pins were exposed as direct HW input/output pins and ice_dpll_frequency_get() returned the CGU frequency unconditionally — the PCA9575 state was never consulted. The PCA9575 powers on with all outputs high, setting ICE_SMA1_DIR_EN, ICE_SMA1_TX_EN, ICE_SMA2_DIR_EN and ICE_SMA2_TX_EN. Nothing in the driver writes the register during initialization, so ice_dpll_sw_pins_update() sees all pins as inactive and ice_dpll_sw_pin_frequency_get() permanently returns 0 Hz for every SW pin. Fix this by writing a default SMA configuration in ice_dpll_init_info_sw_pins(): clear all SMA bits, then set SMA1 and SMA2 as active inputs (DIR_EN=0) with U.FL1 output and U.FL2 input disabled. Each SMA/U.FL pair shares a physical signal path so only one pin per pair can be active at a time. U.FL pins still report frequency 0 after this fix: U.FL1 (output-only) is disabled by ICE_SMA1_TX_EN which keeps the TX output buffer off, and U.FL2 (input-only) is disabled by ICE_SMA2_UFL2_RX_DIS. They can be activated by changing the corresponding SMA pin direction via dpll netlink. Fixes: `2dd5d03c77` ("ice: redesign dpll sma/u.fl pins control") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-7-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:38 +02:00
Petr Oros	70ad216411	ice: fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw On certain E810 configurations where firmware supports Tx scheduler topology switching (tx_sched_topo_comp_mode_en), ice_cfg_tx_topo() may need to apply a new 5-layer or 9-layer topology from the DDP package. If the AQ command to set the topology fails (e.g. due to invalid DDP data or firmware limitations), the global configuration lock must still be cleared via a CORER reset. Commit `86aae43f21` ("ice: don't leave device non-functional if Tx scheduler config fails") correctly fixed this by refactoring ice_cfg_tx_topo() to always trigger CORER after acquiring the global lock and re-initialize hardware via ice_init_hw() afterwards. However, commit `8a37f9e2ff` ("ice: move ice_deinit_dev() to the end of deinit paths") later moved ice_init_dev_hw() into ice_init_hw(), breaking the reinit path introduced by `86aae43f21`. This creates an infinite recursive call chain: ice_init_hw() ice_init_dev_hw() ice_cfg_tx_topo() # topology change needed ice_deinit_hw() ice_init_hw() # reinit after CORER ice_init_dev_hw() # recurse ice_cfg_tx_topo() ... # stack overflow Fix by moving ice_init_dev_hw() back out of ice_init_hw() and calling it explicitly from ice_probe() and ice_devlink_reinit_up(). The third caller, ice_cfg_tx_topo(), intentionally does not need ice_init_dev_hw() during its reinit, it only needs the core HW reinitialization. This breaks the recursion cleanly without adding flags or guards. The deinit ordering changes from commit `8a37f9e2ff` ("ice: move ice_deinit_dev() to the end of deinit paths") which fixed slow rmmod are preserved, only the init-side placement of ice_init_dev_hw() is reverted. Fixes: `8a37f9e2ff` ("ice: move ice_deinit_dev() to the end of deinit paths") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Alexander Nowlin <alexander.nowlin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-6-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:38 +02:00
Petr Oros	54ef024879	ice: fix NULL pointer dereference in ice_reset_all_vfs() ice_reset_all_vfs() ignores the return value of ice_vf_rebuild_vsi(). When the VSI rebuild fails (e.g. during NVM firmware update via nvmupdate64e), ice_vsi_rebuild() tears down the VSI on its error path, leaving txq_map and rxq_map as NULL. The subsequent unconditional call to ice_vf_post_vsi_rebuild() leads to a NULL pointer dereference in ice_ena_vf_q_mappings() when it accesses vsi->txq_map[0]. The single-VF reset path in ice_reset_vf() already handles this correctly by checking the return value of ice_vf_reconfig_vsi() and skipping ice_vf_post_vsi_rebuild() on failure. Apply the same pattern to ice_reset_all_vfs(): check the return value of ice_vf_rebuild_vsi() and skip ice_vf_post_vsi_rebuild() and ice_eswitch_attach_vf() on failure. The VF is left safely disabled (ICE_VF_STATE_INIT not set, VFGEN_RSTAT not set to VFACTIVE) and can be recovered via a VFLR triggered by a PCI reset of the VF (sysfs reset or driver rebind). Note that this patch does not prevent the VF VSI rebuild from failing during NVM update — the underlying cause is firmware being in a transitional state while the EMP reset is processed, which can cause Admin Queue commands (ice_add_vsi, ice_cfg_vsi_lan) to fail. This patch only prevents the subsequent NULL pointer dereference that crashes the kernel when the rebuild does fail. crash> bt PID: 50795 TASK: ff34c9ee708dc680 CPU: 1 COMMAND: "kworker/u512:5" #0 [ff72159bcfe5bb50] machine_kexec at ffffffffaa8850ee #1 [ff72159bcfe5bba8] __crash_kexec at ffffffffaaa15fba #2 [ff72159bcfe5bc68] crash_kexec at ffffffffaaa16540 #3 [ff72159bcfe5bc70] oops_end at ffffffffaa837eda #4 [ff72159bcfe5bc90] page_fault_oops at ffffffffaa893997 #5 [ff72159bcfe5bce8] exc_page_fault at ffffffffab528595 #6 [ff72159bcfe5bd10] asm_exc_page_fault at ffffffffab600bb2 [exception RIP: ice_ena_vf_q_mappings+0x79] RIP: ffffffffc0a85b29 RSP: ff72159bcfe5bdc8 RFLAGS: 00010206 RAX: 00000000000f0000 RBX: ff34c9efc9c00000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000010 RDI: ff34c9efc9c00000 RBP: ff34c9efc27d4828 R8: 0000000000000093 R9: 0000000000000040 R10: ff34c9efc27d4828 R11: 0000000000000040 R12: 0000000000100000 R13: 0000000000000010 R14: R15: ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ff72159bcfe5bdf8] ice_sriov_post_vsi_rebuild at ffffffffc0a85e2e [ice] #8 [ff72159bcfe5be08] ice_reset_all_vfs at ffffffffc0a920b4 [ice] #9 [ff72159bcfe5be48] ice_service_task at ffffffffc0a31519 [ice] #10 [ff72159bcfe5be88] process_one_work at ffffffffaa93dca4 #11 [ff72159bcfe5bec8] worker_thread at ffffffffaa93e9de #12 [ff72159bcfe5bf18] kthread at ffffffffaa946663 #13 [ff72159bcfe5bf50] ret_from_fork at ffffffffaa8086b9 The panic occurs attempting to dereference the NULL pointer in RDX at ice_sriov.c:294, which loads vsi->txq_map (offset 0x4b8 in ice_vsi). The faulting VSI is an allocated slab object but not fully initialized after a failed ice_vsi_rebuild(): crash> struct ice_vsi 0xff34c9efc27d4828 netdev = 0x0, rx_rings = 0x0, tx_rings = 0x0, q_vectors = 0x0, txq_map = 0x0, rxq_map = 0x0, alloc_txq = 0x10, num_txq = 0x10, alloc_rxq = 0x10, num_rxq = 0x10, The nvmupdate64e process was performing NVM firmware update: crash> bt 0xff34c9edd1a30000 PID: 49858 TASK: ff34c9edd1a30000 CPU: 1 COMMAND: "nvmupdate64e" #0 [ff72159bcd617618] __schedule at ffffffffab5333f8 #4 [ff72159bcd617750] ice_sq_send_cmd at ffffffffc0a35347 [ice] #5 [ff72159bcd6177a8] ice_sq_send_cmd_retry at ffffffffc0a35b47 [ice] #6 [ff72159bcd617810] ice_aq_send_cmd at ffffffffc0a38018 [ice] #7 [ff72159bcd617848] ice_aq_read_nvm at ffffffffc0a40254 [ice] #8 [ff72159bcd6178b8] ice_read_flat_nvm at ffffffffc0a4034c [ice] #9 [ff72159bcd617918] ice_devlink_nvm_snapshot at ffffffffc0a6ffa5 [ice] dmesg: ice 0000:13:00.0: firmware recommends not updating fw.mgmt, as it may result in a downgrade. continuing anyways ice 0000:13:00.1: ice_init_nvm failed -5 ice 0000:13:00.1: Rebuild failed, unload and reload driver Fixes: `12bb018c53` ("ice: Refactor VF reset") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-5-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:38 +02:00
Petr Oros	34d33313b5	iavf: add VIRTCHNL_OP_ADD_VLAN to success completion handler The V1 ADD_VLAN opcode had no success handler; filters sent via V1 stayed in ADDING state permanently. Add a fallthrough case so V1 filters also transition ADDING -> ACTIVE on PF confirmation. Critically, add an `if (v_retval) break` guard: the error switch in iavf_virtchnl_completion() does NOT return after handling errors, it falls through to the success switch. Without this guard, a PF-rejected ADD would incorrectly mark ADDING filters as ACTIVE, creating a driver/HW mismatch where the driver believes the filter is installed but the PF never accepted it. For V2, this is harmless: iavf_vlan_add_reject() in the error block already kfree'd all ADDING filters, so the success handler finds nothing to transition. Fixes: `968996c070` ("iavf: Fix VLAN_V2 addition/rejection") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-4-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:38 +02:00
Petr Oros	bbcbe4ed70	iavf: wait for PF confirmation before removing VLAN filters The VLAN filter DELETE path was asymmetric with the ADD path: ADD waits for PF confirmation (ADD -> ADDING -> ACTIVE), but DELETE immediately frees the filter struct after sending the DEL message without waiting for the PF response. This is problematic because: - If the PF rejects the DEL, the filter remains in HW but the driver has already freed the tracking structure, losing sync. - Race conditions between DEL pending and other operations (add, reset) cannot be properly resolved if the filter struct is already gone. Add IAVF_VLAN_REMOVING state to make the DELETE path symmetric: REMOVE -> REMOVING (send DEL) -> PF confirms -> kfree -> PF rejects -> ACTIVE In iavf_del_vlans(), transition filters from REMOVE to REMOVING instead of immediately freeing them. The new DEL completion handler in iavf_virtchnl_completion() frees filters on success or reverts them to ACTIVE on error. Update iavf_add_vlan() to handle the REMOVING state: if a DEL is pending and the user re-adds the same VLAN, queue it for ADD so it gets re-programmed after the PF processes the DEL. The !VLAN_FILTERING_ALLOWED early-exit path still frees filters directly since no PF message is sent in that case. Also update iavf_del_vlan() to skip filters already in REMOVING state: DEL has been sent to PF and the completion handler will free the filter when PF confirms. Without this guard, the sequence DEL(pending) -> user-del -> second DEL could cause the PF to return an error for the second DEL (filter already gone), causing the completion handler to incorrectly revert a deleted filter back to ACTIVE. Fixes: `968996c070` ("iavf: Fix VLAN_V2 addition/rejection") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-3-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:38 +02:00
Petr Oros	f2ce65b9b9	iavf: stop removing VLAN filters from PF on interface down When a VF goes down, the driver currently sends DEL_VLAN to the PF for every VLAN filter (ACTIVE -> DISABLE -> send DEL -> INACTIVE), then re-adds them all on UP (INACTIVE -> ADD -> send ADD -> ADDING -> ACTIVE). This round-trip is unnecessary because: 1. The PF disables the VF's queues via VIRTCHNL_OP_DISABLE_QUEUES, which already prevents all RX/TX traffic regardless of VLAN filter state. 2. The VLAN filters remaining in PF HW while the VF is down is harmless - packets matching those filters have nowhere to go with queues disabled. 3. The DEL+ADD cycle during down/up creates race windows where the VLAN filter list is incomplete. With spoofcheck enabled, the PF enables TX VLAN filtering on the first non-zero VLAN add, blocking traffic for any VLANs not yet re-added. Remove the entire DISABLE/INACTIVE state machinery: - Remove IAVF_VLAN_DISABLE and IAVF_VLAN_INACTIVE enum values - Remove iavf_restore_filters() and its call from iavf_open() - Remove VLAN filter handling from iavf_clear_mac_vlan_filters(), rename it to iavf_clear_mac_filters() - Remove DEL_VLAN_FILTER scheduling from iavf_down() - Remove all DISABLE/INACTIVE handling from iavf_del_vlans() VLAN filters now stay ACTIVE across down/up cycles. Only explicit user removal (ndo_vlan_rx_kill_vid) or PF/VF reset triggers VLAN filter deletion/re-addition. Fixes: `ed1f5b58ea` ("i40evf: remove VLAN filters on close") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-2-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:38 +02:00
Petr Oros	70d62b669f	iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDING Rename the IAVF_VLAN_IS_NEW state to IAVF_VLAN_ADDING to better describe what the state represents: an ADD request has been sent to the PF and is waiting for a response. This is a pure rename with no behavioral change, preparing for a cleanup of the VLAN filter state machine. Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-1-cdcb48303fd8@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-30 11:37:38 +02:00

1 2 3 4 5 ...

1444301 Commits