Accesses to the PTP_CLK_CTRL register are done through a hardcoded
address which doesn't match with the KSZ8463's register layout.
Add a new entry for the PTP_CLK_CTRL register in the regs[] tables.
Use the regs[] table to retrieve the PTP_CLK_CTRL register address
when accessing it.
Remove the macro defining the address to prevent further use.
Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-3-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The IRQ logic of the KSZ8463 differs from that of other KSZ switches.
It doesn't have a 'mask' register but an 'enable' one instead. The
common IRQ framework can still be used though as soon as we reverse
the logic (using '1' to enable interrupts instead of '0') for KSZ8463
cases.
Move the initialization of the kirq->masked outside of
ksz_irq_common_setup() to keep this function truly common when
IRQ support for the KSZ8463 is added.
Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20260105-ksz-rework-v1-1-a68df7f57375@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Convert the Google Virtual Ethernet (GVE) driver to use the new
.get_rx_ring_count ethtool operation instead of handling
ETHTOOL_GRXRINGS in .get_rxnfc. This simplifies the code by moving the
ring count query to a dedicated callback.
The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260105-gxring_google-v2-1-e7cfe924d429@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Replace rio_probe1() printk(KERN_INFO) messages with netdev_{info,dbg}().
Keep one netdev_info() line for device identification; move the rest to
netdev_dbg() to avoid spamming the kernel log.
Log rx_timeout on a separate line since netdev_*() prefixes each
message and the multi-line formatting looks broken otherwise.
No functional change intended.
Tested-on: D-Link DGE-550T Rev-A3
Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260105130552.8721-2-yyyynoom@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
prestera_ldr_wait_buf() returns the result of readl_poll_timeout(),
which is 0 on success or -ETIMEDOUT on failure. Its current return
type is u32.
Assigning this u32 value to an int variable works today because the
bit pattern of (u32)-ETIMEDOUT (0xffffff92) is correctly interpreted
as -ETIMEDOUT when stored in an int. However, keeping the function
return type as u32 is misleading and fragile.
Change the return type to int to reflect the actual semantic of
returning negative error codes and to avoid potential issues with
unsigned casts. There is no functional change.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260103174313.1172197-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the non-RT kernel, local_bh_disable() merely disables preemption,
whereas it maps to an actual spin lock in the RT kernel. Consequently,
when attempting to refill RX buffers via netdev_alloc_skb() in
macb_mac_link_up(), a deadlock scenario arises as follows:
WARNING: possible circular locking dependency detected
6.18.0-08691-g2061f18ad76e #39 Not tainted
------------------------------------------------------
kworker/0:0/8 is trying to acquire lock:
ffff00080369bbe0 (&bp->lock){+.+.}-{3:3}, at: macb_start_xmit+0x808/0xb7c
but task is already holding lock:
ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit
+0x148/0xb7c
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&queue->tx_ptr_lock){+...}-{3:3}:
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x148/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #2 (_xmit_ETHER#2){+...}-{3:3}:
rt_spin_lock+0x50/0x1f0
sch_direct_xmit+0x11c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #1 ((softirq_ctrl.lock)){+.+.}-{3:3}:
lock_release+0x250/0x348
__local_bh_enable_ip+0x7c/0x240
__netdev_alloc_skb+0x1b4/0x1d8
gem_rx_refill+0xdc/0x240
gem_init_rings+0xb4/0x108
macb_mac_link_up+0x9c/0x2b4
phylink_resolve+0x170/0x614
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #0 (&bp->lock){+.+.}-{3:3}:
__lock_acquire+0x15a8/0x2084
lock_acquire+0x1cc/0x350
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x808/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of:
&bp->lock --> _xmit_ETHER#2 --> &queue->tx_ptr_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&queue->tx_ptr_lock);
lock(_xmit_ETHER#2);
lock(&queue->tx_ptr_lock);
lock(&bp->lock);
*** DEADLOCK ***
Call trace:
show_stack+0x18/0x24 (C)
dump_stack_lvl+0xa0/0xf0
dump_stack+0x18/0x24
print_circular_bug+0x28c/0x370
check_noncircular+0x198/0x1ac
__lock_acquire+0x15a8/0x2084
lock_acquire+0x1cc/0x350
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x808/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
Notably, invoking the mog_init_rings() callback upon link establishment
is unnecessary. Instead, we can exclusively call mog_init_rings() within
the ndo_open() callback. This adjustment resolves the deadlock issue.
Furthermore, since MACB_CAPS_MACB_IS_EMAC cases do not use mog_init_rings()
when opening the network interface via at91ether_open(), moving
mog_init_rings() to macb_open() also eliminates the MACB_CAPS_MACB_IS_EMAC
check.
Fixes: 633e98a711 ("net: macb: use resolved link config in mac_link_up()")
Cc: stable@vger.kernel.org
Suggested-by: Kevin Hao <kexin.hao@windriver.com>
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Link: https://patch.msgid.link/20251222015624.1994551-1-xiaolei.wang@windriver.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
syzbot reported a crash [1] in dql_completed() after recent usbnet
BQL adoption.
The reason for the crash is that netdev_reset_queue() is called too soon.
It should be called after cancel_work_sync(&dev->bh_work) to make
sure no more TX completion can happen.
[1]
kernel BUG at lib/dynamic_queue_limits.c:99 !
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 5197 Comm: udevd Tainted: G L syzkaller #0 PREEMPT(full)
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:dql_completed+0xbe1/0xbf0 lib/dynamic_queue_limits.c:99
Call Trace:
<IRQ>
netdev_tx_completed_queue include/linux/netdevice.h:3864 [inline]
netdev_completed_queue include/linux/netdevice.h:3894 [inline]
usbnet_bh+0x793/0x1020 drivers/net/usb/usbnet.c:1601
process_one_work kernel/workqueue.c:3257 [inline]
process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
bh_worker+0x2b1/0x600 kernel/workqueue.c:3611
tasklet_action+0xc/0x70 kernel/softirq.c:952
handle_softirqs+0x27d/0x850 kernel/softirq.c:622
__do_softirq kernel/softirq.c:656 [inline]
invoke_softirq kernel/softirq.c:496 [inline]
__irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:723
irq_exit_rcu+0x9/0x30 kernel/softirq.c:739
Fixes: 7ff14c5204 ("usbnet: Add support for Byte Queue Limits (BQL)")
Reported-by: syzbot+5b55e49f8bbd84631a9c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6945644f.a70a0220.207337.0113.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Simon Schippers <simon.schippers@tu-dortmund.de>
Link: https://patch.msgid.link/20251219144459.692715-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, interrupts are automatically enabled immediately upon
request. This allows interrupt to fire before the associated NAPI
context is fully initialized and cause failures like below:
[ 0.946369] Call Trace:
[ 0.946369] <IRQ>
[ 0.946369] __napi_poll+0x2a/0x1e0
[ 0.946369] net_rx_action+0x2f9/0x3f0
[ 0.946369] handle_softirqs+0xd6/0x2c0
[ 0.946369] ? handle_edge_irq+0xc1/0x1b0
[ 0.946369] __irq_exit_rcu+0xc3/0xe0
[ 0.946369] common_interrupt+0x81/0xa0
[ 0.946369] </IRQ>
[ 0.946369] <TASK>
[ 0.946369] asm_common_interrupt+0x22/0x40
[ 0.946369] RIP: 0010:pv_native_safe_halt+0xb/0x10
Use the `IRQF_NO_AUTOEN` flag when requesting interrupts to prevent auto
enablement and explicitly enable the interrupt in NAPI initialization
path (and disable it during NAPI teardown).
This ensures that interrupt lifecycle is strictly coupled with
readiness of NAPI context.
Cc: stable@vger.kernel.org
Fixes: 1dfc2e4611 ("gve: Refactor napi add and remove functions")
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20251219102945.2193617-1-hramamurthy@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
There is a crash issue when running zero copy XDP_TX action, the crash
log is shown below.
[ 216.122464] Unable to handle kernel paging request at virtual address fffeffff80000000
[ 216.187524] Internal error: Oops: 0000000096000144 [#1] SMP
[ 216.301694] Call trace:
[ 216.304130] dcache_clean_poc+0x20/0x38 (P)
[ 216.308308] __dma_sync_single_for_device+0x1bc/0x1e0
[ 216.313351] stmmac_xdp_xmit_xdpf+0x354/0x400
[ 216.317701] __stmmac_xdp_run_prog+0x164/0x368
[ 216.322139] stmmac_napi_poll_rxtx+0xba8/0xf00
[ 216.326576] __napi_poll+0x40/0x218
[ 216.408054] Kernel panic - not syncing: Oops: Fatal exception in interrupt
For XDP_TX action, the xdp_buff is converted to xdp_frame by
xdp_convert_buff_to_frame(). The memory type of the resulting xdp_frame
depends on the memory type of the xdp_buff. For page pool based xdp_buff
it produces xdp_frame with memory type MEM_TYPE_PAGE_POOL. For zero copy
XSK pool based xdp_buff it produces xdp_frame with memory type
MEM_TYPE_PAGE_ORDER0. However, stmmac_xdp_xmit_back() does not check the
memory type and always uses the page pool type, this leads to invalid
mappings and causes the crash. Therefore, check the xdp_buff memory type
in stmmac_xdp_xmit_back() to fix this issue.
Fixes: bba2556efa ("net: stmmac: Enable RX via AF_XDP zero-copy")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20251204071332.1907111-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Johannes Berg says:
====================
Various fixes all over, most are recent regressions but
also some long-standing issues:
- cfg80211:
- fix an issue with overly long SSIDs
- mac80211:
- long-standing beacon protection issue on some devices
- for for a multi-BSSID AP-side issue
- fix a syzbot warning on OCB (not really used in practice)
- remove WARN on connections using disabled channels,
as that can happen due to changes in the disable flag
- fix monitor mode list iteration
- iwlwifi:
- fix firmware loading on certain (really old) devices
- add settime64 to PTP clock to avoid a warning and clock
registration failure, but it's not actually supported
- rtw88:
- remove WQ_UNBOUND since it broke USB adapters
(because it can't be used with WQ_BH)
- fix SDIO issues with certain devices
- rtl8192cu: fix TID array out-of-bounds (since 6.9)
- wlcore (TI): add missing skb push headroom increase
* tag 'wireless-2025-12-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: Implement settime64 as stub for MVM/MLD PTP
wifi: iwlwifi: Fix firmware version handling
wifi: mac80211: ocb: skip rx_no_sta when interface is not joined
wifi: mac80211: do not use old MBSSID elements
wifi: mac80211: don't WARN for connections on invalid channels
wifi: wlcore: ensure skb headroom before skb_push
wifi: cfg80211: sme: store capped length in __cfg80211_connect_result()
wifi: mac80211: fix list iteration in ieee80211_add_virtual_monitor()
wifi: mac80211: Discard Beacon frames to non-broadcast address
Revert "wifi: rtw88: add WQ_UNBOUND to alloc_workqueue users"
wifi: rtlwifi: 8192cu: fix tid out of range in rtl92cu_tx_fill_desc()
wifi: rtw88: limit indirect IO under powered off for RTL8822CS
====================
Link: https://patch.msgid.link/20251217201441.59876-3-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When mana_serv_reset() encounters -ETIMEDOUT or -EPROTO from
mana_gd_resume(), it performs a PCI rescan via mana_serv_rescan().
mana_serv_rescan() calls pci_stop_and_remove_bus_device(), which can
invoke the driver's remove path and free the gdma_context associated
with the device. After returning, mana_serv_reset() currently jumps to
the out label and attempts to clear gc->in_service, dereferencing a
freed gdma_context.
The issue was observed with the following call logs:
[ 698.942636] BUG: unable to handle page fault for address: ff6c2b638088508d
[ 698.943121] #PF: supervisor write access in kernel mode
[ 698.943423] #PF: error_code(0x0002) - not-present page
[S[ 698.943793] Pat Dec 6 07:GD5 100000067 P4D 1002f7067 PUD 1002f8067 PMD 101bef067 PTE 0
0:56 2025] hv_[n e 698.944283] Oops: Oops: 0002 [#1] SMP NOPTI
tvsc f8615163-00[ 698.944611] CPU: 28 UID: 0 PID: 249 Comm: kworker/28:1
...
[Sat Dec 6 07:50:56 2025] R10: [ 699.121594] mana 7870:00:00.0 enP30832s1: Configured vPort 0 PD 18 DB 16
000000000000001b R11: 0000000000000000 R12: ff44cf3f40270000
[Sat Dec 6 07:50:56 2025] R13: 0000000000000001 R14: ff44cf3f402700c8 R15: ff44cf3f4021b405
[Sat Dec 6 07:50:56 2025] FS: 0000000000000000(0000) GS:ff44cf7e9fcf9000(0000) knlGS:0000000000000000
[Sat Dec 6 07:50:56 2025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sat Dec 6 07:50:56 2025] CR2: ff6c2b638088508d CR3: 000000011fe43001 CR4: 0000000000b73ef0
[Sat Dec 6 07:50:56 2025] Call Trace:
[Sat Dec 6 07:50:56 2025] <TASK>
[Sat Dec 6 07:50:56 2025] mana_serv_func+0x24/0x50 [mana]
[Sat Dec 6 07:50:56 2025] process_one_work+0x190/0x350
[Sat Dec 6 07:50:56 2025] worker_thread+0x2b7/0x3d0
[Sat Dec 6 07:50:56 2025] kthread+0xf3/0x200
[Sat Dec 6 07:50:56 2025] ? __pfx_worker_thread+0x10/0x10
[Sat Dec 6 07:50:56 2025] ? __pfx_kthread+0x10/0x10
[Sat Dec 6 07:50:56 2025] ret_from_fork+0x21a/0x250
[Sat Dec 6 07:50:56 2025] ? __pfx_kthread+0x10/0x10
[Sat Dec 6 07:50:56 2025] ret_from_fork_asm+0x1a/0x30
[Sat Dec 6 07:50:56 2025] </TASK>
Fix this by returning immediately after mana_serv_rescan() to avoid
accessing GC state that may no longer be valid.
Fixes: 9bf66036d6 ("net: mana: Handle hardware recovery events when probing the device")
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Link: https://patch.msgid.link/20251218131054.GA3173@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
port_fdb_dump() is supposed to only add fdb entries, but we iterate over
the full ARL table, which also includes multicast entries.
So check if the entry is a multicast entry before passing it on to the
callback().
Additionally, the port of those entries is a bitmask, not a port number,
so any included entries would have even be for the wrong port.
Fixes: 1da6df85c6 ("net: dsa: b53: Implement ARL add/del/dump operations")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251217205756.172123-1-jonas.gorski@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-12-17 (i40e, iavf, idpf, e1000)
For i40e:
Przemyslaw immediately schedules service task following changes to
filters to ensure timely setup for PTP.
Gregory Herrero adjusts VF descriptor size checks to be device specific.
For iavf:
Kohei Enju corrects a couple of condition checks which caused off-by-one
issues.
For idpf:
Larysa fixes LAN memory region call to follow expected requirements.
Brian Vazquez reduces mailbox wait time during init to avoid lengthy
delays.
For e1000:
Guangshuo Li adds validation of data length to prevent out-of-bounds
access.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000: fix OOB in e1000_tbi_should_accept()
idpf: reduce mbx_task schedule delay to 300us
idpf: fix LAN memory regions command on some NVMs
iavf: fix off-by-one issues in iavf_config_rss_reg()
i40e: validate ring_len parameter against hardware-specific values
i40e: fix scheduling in set_rx_mode
====================
Link: https://patch.msgid.link/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The LIBWX library code is what calls into phylink, so any user of
it has to select CONFIG_PHYLINK at the moment, with NGBEVF missing this:
x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_nway_reset':
wx_ethtool.c:(.text+0x613): undefined reference to `phylink_ethtool_nway_reset'
x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_get_link_ksettings':
wx_ethtool.c:(.text+0x62b): undefined reference to `phylink_ethtool_ksettings_get'
x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_set_link_ksettings':
wx_ethtool.c:(.text+0x643): undefined reference to `phylink_ethtool_ksettings_set'
x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_get_pauseparam':
wx_ethtool.c:(.text+0x65b): undefined reference to `phylink_ethtool_get_pauseparam'
x86_64-linux-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_set_pauseparam':
wx_ethtool.c:(.text+0x677): undefined reference to `phylink_ethtool_set_pauseparam'
Add the 'select PHYLINK' line in the libwx option directly so this will
always be enabled for all current and future wangxun drivers, and remove
the now duplicate lines.
Fixes: a0008a3658 ("net: wangxun: add ngbevf build")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251216213547.115026-1-arnd@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In async_set_registers(), when usb_submit_urb() fails, the allocated
async_req structure and URB are not freed, causing a memory leak.
The completion callback async_set_reg_cb() is responsible for freeing
these allocations, but it is only called after the URB is successfully
submitted and completes (successfully or with error). If submission
fails, the callback never runs and the memory is leaked.
Fix this by freeing both the URB and the request structure in the error
path when usb_submit_urb() fails.
Reported-by: syzbot+8dd915c7cb0490fc8c52@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8dd915c7cb0490fc8c52
Fixes: 4d12997a9b ("drivers: net: usb: rtl8150: concurrent URB bugfix")
Signed-off-by: Deepakkumar Karn <dkarn@redhat.com>
Link: https://patch.msgid.link/20251216151304.59865-2-dkarn@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
During the stress tests, early RX adaptation handshakes can fail, such
as missing the RX_ADAPT ACK or not receiving a coefficient update before
block lock is established. Continuing to retry RX adaptation in this
state is often ineffective if the current mode selection is not viable.
Resetting the RX adaptation retry counter when an RX_ADAPT request fails
to receive ACK or a coefficient update prior to block lock, and clearing
mode_set so the next bring-up performs a fresh mode selection rather
than looping on a likely invalid configuration.
Fixes: 4f3b20bfbb ("amd-xgbe: add support for rx-adaptation")
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://patch.msgid.link/20251215151728.311713-1-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Since airoha_probe() is not executed under rtnl lock, there is small race
where a given device is configured by user-space while the remaining ones
are not completely loaded from the dts yet. This condition will allow a
hw device misconfiguration since there are some conditions (e.g. GDM2 check
in airoha_dev_init()) that require all device are properly loaded from the
device tree. Fix the issue moving net_devices registration at the end of
the airoha_probe routine.
Fixes: 9cd451d414 ("net: airoha: Add loopback support for GDM2")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251214-airoha-fix-dev-registration-v1-1-860e027ad4c6@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When nvmem_cell_read() fails in mt798x_phy_calibration(), the function
returns without calling nvmem_cell_put(), leaking the cell reference.
Move nvmem_cell_put() right after nvmem_cell_read() to ensure the cell
reference is always released regardless of the read result.
Found via static analysis and code review.
Fixes: 98c485eaf5 ("net: phy: add driver for MediaTek SoC built-in GE PHYs")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251211081313.2368460-1-linmq006@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The Aspeed MDIO controller may return incorrect data when a read operation
follows immediately after a write. Due to a controller bug, the subsequent
read can latch stale data, causing the polling logic to terminate earlier
than expected.
To work around this hardware issue, insert a dummy read after each write
operation. This ensures that the next actual read returns the correct
data and prevents premature polling exit.
This workaround has been verified to stabilize MDIO transactions on
affected Aspeed platforms.
Fixes: f160e99462 ("net: phy: Add mdio-aspeed")
Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251211-aspeed_mdio_add_dummy_read-v3-1-382868869004@aspeedtech.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter and CAN.
Current release - regressions:
- netfilter: nf_conncount: fix leaked ct in error paths
- sched: act_mirred: fix loop detection
- sctp: fix potential deadlock in sctp_clone_sock()
- can: fix build dependency
- eth: mlx5e: do not update BQL of old txqs during channel
reconfiguration
Previous releases - regressions:
- sched: ets: always remove class from active list before deleting it
- inet: frags: flush pending skbs in fqdir_pre_exit()
- netfilter: nf_nat: remove bogus direction check
- mptcp:
- schedule rtx timer only after pushing data
- avoid deadlock on fallback while reinjecting
- can: gs_usb: fix error handling
- eth:
- mlx5e:
- avoid unregistering PSP twice
- fix double unregister of HCA_PORTS component
- bnxt_en: fix XDP_TX path
- mlxsw: fix use-after-free when updating multicast route stats
Previous releases - always broken:
- ethtool: avoid overflowing userspace buffer on stats query
- openvswitch: fix middle attribute validation in push_nsh() action
- eth:
- mlx5: fw_tracer, validate format string parameters
- mlxsw: spectrum_router: fix neighbour use-after-free
- ipvlan: ignore PACKET_LOOPBACK in handle_mode_l2()
Misc:
- Jozsef Kadlecsik retires from maintaining netfilter
- tools: ynl: fix build on systems with old kernel headers"
* tag 'net-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
net: hns3: add VLAN id validation before using
net: hns3: using the num_tqps to check whether tqp_index is out of range when vf get ring info from mbx
net: hns3: using the num_tqps in the vf driver to apply for resources
net: enetc: do not transmit redirected XDP frames when the link is down
selftests/tc-testing: Test case exercising potential mirred redirect deadlock
net/sched: act_mirred: fix loop detection
sctp: Clear inet_opt in sctp_v6_copy_ip_options().
sctp: Fetch inet6_sk() after setting ->pinet6 in sctp_clone_sock().
net/handshake: duplicate handshake cancellations leak socket
net/mlx5e: Don't include PSP in the hard MTU calculations
net/mlx5e: Do not update BQL of old txqs during channel reconfiguration
net/mlx5e: Trigger neighbor resolution for unresolved destinations
net/mlx5e: Use ip6_dst_lookup instead of ipv6_dst_lookup_flow for MAC init
net/mlx5: Serialize firmware reset with devlink
net/mlx5: fw_tracer, Handle escaped percent properly
net/mlx5: fw_tracer, Validate format string parameters
net/mlx5: Drain firmware reset in shutdown callback
net/mlx5: fw reset, clear reset requested on drain_fw_reset
net: dsa: mxl-gsw1xx: manually clear RANEG bit
net: dsa: mxl-gsw1xx: fix .shutdown driver operation
...
Marc Kleine-Budde says:
====================
pull-request: can 2025-12-18
this is a pull request of 3 patches for net/main.
Tetsuo Handa contributes 2 patches to fix race windows in the j1939
protocol to properly handle disappearing network devices.
The last patch is by me, it fixes a build dependency with the CAN
drivers, that got introduced while fixing a dependency between the CAN
protocol and CAN device code.
linux-can-fixes-for-6.19-20251218
* tag 'linux-can-fixes-for-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: fix build dependency
can: j1939: make j1939_sk_bind() fail if device is no longer registered
can: j1939: make j1939_session_activate() fail if device is no longer registered
====================
Link: https://patch.msgid.link/20251218123132.664533-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, the VLAN id may be used without validation when
receive a VLAN configuration mailbox from VF. The length of
vlan_del_fail_bmap is BITS_TO_LONGS(VLAN_N_VID). It may cause
out-of-bounds memory access once the VLAN id is bigger than
or equal to VLAN_N_VID.
Therefore, VLAN id needs to be checked to ensure it is within
the range of VLAN_N_VID.
Fixes: fe4144d47e ("net: hns3: sync VLAN filter entries when kill VLAN ID failed")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251211023737.2327018-4-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, rss_size = num_tqps / tc_num. If tc_num is 1, then num_tqps
equals rss_size. However, if the tc_num is greater than 1, then rss_size
will be less than num_tqps, causing the tqp_index check for subsequent TCs
using rss_size to always fail.
This patch uses the num_tqps to check whether tqp_index is out of range,
instead of rss_size.
Fixes: 326334aad0 ("net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251211023737.2327018-3-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, hdev->htqp is allocated using hdev->num_tqps, and kinfo->tqp
is allocated using kinfo->num_tqps. However, kinfo->num_tqps is set to
min(new_tqps, hdev->num_tqps); Therefore, kinfo->num_tqps may be smaller
than hdev->num_tqps, which causes some hdev->htqp[i] to remain
uninitialized in hclgevf_knic_setup().
Thus, this patch allocates hdev->htqp and kinfo->tqp using hdev->num_tqps,
ensuring that the lengths of hdev->htqp and kinfo->tqp are consistent
and that all elements are properly initialized.
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251211023737.2327018-2-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In the current implementation, the enetc_xdp_xmit() always transmits
redirected XDP frames even if the link is down, but the frames cannot
be transmitted from TX BD rings when the link is down, so the frames
are still kept in the TX BD rings. If the XDP program is uninstalled,
users will see the following warning logs.
fsl_enetc 0000:00:00.0 eno0: timeout for tx ring #6 clear
More worse, the TX BD ring cannot work properly anymore, because the
HW PIR and CIR are not equal after the re-initialization of the TX
BD ring. At this point, the BDs between CIR and PIR are invalid,
which will cause a hardware malfunction.
Another reason is that there is internal context in the ring prefetch
logic that will retain the state from the first incarnation of the ring
and continue prefetching from the stale location when we re-initialize
the ring. The internal context is only reset by an FLR. That is to say,
for LS1028A ENETC, software cannot set the HW CIR and PIR when
initializing the TX BD ring.
It does not make sense to transmit redirected XDP frames when the link is
down. Add a link status check to prevent transmission in this condition.
This fixes part of the issue, but more complex cases remain. For example,
the TX BD ring may still contain unsent frames when the link goes down.
Those situations require additional patches, which will build on this
one.
Fixes: 9d2b68cc10 ("net: enetc: add support for XDP_REDIRECT")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251211020919.121113-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Commit [1] added the 40 bytes required by the PSP header+trailer and the
UDP header to MLX5E_ETH_HARD_MTU, which limits the device-wide max
software MTU that could be set. This is not okay, because most packets
are not PSP packets and it doesn't make sense to always reserve space
for headers which won't get added in most cases.
As it turns out, for TCP connections, PSP overhead is already taken into
account in the TCP MSS calculations via inet_csk(sk)->icsk_ext_hdr_len.
This was added in commit [2]. This means that the extra space reserved
in the hard MTU for mlx5 ends up unused and wasted.
Remove the unnecessary 40 byte reservation from hard MTU.
[1] commit e5a1861a29 ("net/mlx5e: Implement PSP Tx data path")
[2] commit e97269257f ("net: psp: update the TCP MSS to reflect PSP
packet overhead")
Fixes: e5a1861a29 ("net/mlx5e: Implement PSP Tx data path")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-10-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
During channel reconfiguration (e.g., ethtool private flags changes),
the driver can trigger a kernel BUG_ON in dql_completed() with the error
"kernel BUG at lib/dynamic_queue_limits.c:99".
The issue occurs in the following sequence:
During mlx5e_safe_switch_params(), old channels are deactivated via
mlx5e_deactivate_txqsq(). New channels are created and activated, taking
ownership of the netdev_queues and their BQL state.
When old channels are closed via mlx5e_close_txqsq(), there may be
pending TX descriptors (sq->cc != sq->pc) that were in-flight during the
deactivation.
mlx5e_free_txqsq_descs() frees these pending descriptors and attempts to
complete them via netdev_tx_completed_queue().
However, the BQL state (dql->num_queued and dql->num_completed) have
been reset in mlx5e_activate_txqsq and belong to the new queue owner,
leading to dql->num_queued - dql->num_completed < nbytes.
This triggers BUG_ON(count > num_queued - num_completed) in
dql_completed().
Fixes: 3b88a535a8 ("net/mlx5e: Defer channels closure to reduce interface down time")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-9-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When initializing the MAC addresses for an outbound IPsec packet offload
rule in mlx5e_ipsec_init_macs, the call to dst_neigh_lookup is used to
find the next-hop neighbor (typically the gateway in tunnel mode).
This call might create a new neighbor entry if one doesn't already
exist. This newly created entry starts in the INCOMPLETE state, as the
kernel hasn't yet sent an ARP or NDISC probe to resolve the MAC
address. In this case, neigh_ha_snapshot will correctly return an
all-zero MAC address.
IPsec packet offload requires the actual next-hop MAC address to
program the rule correctly. If the neighbor state is INCOMPLETE when
the rule is created, the hardware rule is programmed with an all-zero
destination MAC address. Packets sent using this rule will be
subsequently dropped by the receiving network infrastructure or host.
This patch adds a check specifically for the outbound offload path. If
neigh_ha_snapshot returns an all-zero MAC address, it proactively
calls neigh_event_send(n, NULL). This ensures the kernel immediately
sends the initial ARP or NDISC probe if one isn't already pending,
accelerating the resolution process. This helps prevent the hardware
rule from being programmed with an invalid MAC address and avoids
packet drops due to unresolved neighbors.
Fixes: 71670f766b ("net/mlx5e: Support routed networks during IPsec MACs initialization")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-8-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Replace ipv6_stub->ipv6_dst_lookup_flow() with ip6_dst_lookup() in
mlx5e_ipsec_init_macs() since IPsec transformations are not needed
during Security Association setup - only basic routing information is
required for nexthop MAC address resolution.
This resolves an issue where XfrmOutNoStates error counter would be
incremented when xfrm policy is configured before xfrm state, as the
IPsec-aware routing function would attempt policy checks during SA
initialization.
Fixes: 71670f766b ("net/mlx5e: Support routed networks during IPsec MACs initialization")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-7-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The firmware reset mechanism can be triggered by asynchronous events,
which may race with other devlink operations like devlink reload or
devlink dev eswitch set, potentially leading to inconsistent states.
This patch addresses the race by using the devl_lock to serialize the
firmware reset against other devlink operations. When a reset is
requested, the driver attempts to acquire the lock. If successful, it
sets a flag to block devlink reload or eswitch changes, ACKs the reset
to firmware and then releases the lock. If the lock is already held by
another operation, the driver NACKs the firmware reset request,
indicating that the reset cannot proceed.
Firmware reset does not keep the devl_lock and instead uses an internal
firmware reset bit. This is because firmware resets can be triggered by
asynchronous events, and processed in different threads. It is illegal
and unsafe to acquire a lock in one thread and attempt to release it in
another, as lock ownership is intrinsically thread-specific.
This change ensures that firmware resets and other devlink operations
are mutually exclusive during the critical reset request phase,
preventing race conditions.
Fixes: 38b9f903f2 ("net/mlx5: Handle sync reset request event")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mateusz Berezecki <mberezecki@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1765284977-1363052-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>