Commit Graph

1429271 Commits

Author SHA1 Message Date
Jakub Kicinski
ff1cb3ad2a selftests: drv-net: gro: add test for packet ordering
Add a test to check if the NIC reorders packets if the hit GRO.

Link: https://patch.msgid.link/20260318033819.1469350-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
ba5d4128fc selftests: drv-net: gro: test GRO stats
Test accuracy of GRO stats. We want to cover two potentially tricky
cases:
 - single segment GRO
 - packets which were eligible but didn't get GRO'd

The first case is trivial, teach gro.c to send one packet, and check
GRO stats didn't move.

Second case requires gro.c to send a lot of flows expecting the NIC
to run out of GRO flow capacity.

To avoid system traffic noise we steer the packets to a dedicated
queue and operate on qstat.

Link: https://patch.msgid.link/20260318033819.1469350-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
f26d43acf1 selftests: drv-net: gro: use SO_TXTIME to schedule packets together
Longer packet sequence tests are quite flaky when the test is run
over a real network. Try to avoid at least the jitter on the sender
side by scheduling all the packets to be sent at once using SO_TXTIME.
Use hardcoded tx time of 5msec in the future. In my test increasing
this time past 2msec makes no difference so 5msec is plenty of margin.
Since we now expect more output buffering make sure to raise SNDBUF.

Note that this is an opportunistic reliability improvement which
will only work if the qdisc can schedule Tx time for us (fq).
Fiddling with qdisc config was deemed too complex, so it's not
part of the patch.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260318033819.1469350-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
9b29afa116 selftests: drv-net: give HW stats sync time extra 25% of margin
There are transient failures for devices which update stats
periodically, especially if it's the FW DMA'ing the stats
rather than host periodic work querying the FW. Wait 25%
longer than strictly necessary.

For devices which don't report stats-block-usecs we retain
25 msec as the default wait time (0.025sec == 20,000usec * 1.25).

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260318033819.1469350-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
8888bf4fb9 selftests: net: move gro to lib for HW vs SW reuse
The gro.c packet sender is used for SW testing but bulk of incoming
new tests will be HW-specific. So it's better to put them under
drivers/net/hw/, to avoid tip-toeing around netdevsim. Move gro.c
to lib so we can reuse it.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260318033819.1469350-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
edab1ca5ec Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-7.0-rc5).

net/netfilter/nft_set_rbtree.c
  598adea720 ("netfilter: revert nft_set_rbtree: validate open interval overlap")
  3aea466a43 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
https://lore.kernel.org/abgaQBpeGstdN4oq@sirena.org.uk

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 14:16:00 -07:00
Linus Torvalds
a1d9d8e833 Merge tag 'net-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless, Bluetooth and netfilter.

  Nothing too exciting here, mostly fixes for corner cases.

  Current release - fix to a fix:

   - bonding: prevent potential infinite loop in bond_header_parse()

  Current release - new code bugs:

   - wifi: mac80211: check tdls flag in ieee80211_tdls_oper

  Previous releases - regressions:

   - af_unix: give up GC if MSG_PEEK intervened

   - netfilter: conntrack: add missing netlink policy validations

   - NFC: nxp-nci: allow GPIOs to sleep"

* tag 'net-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits)
  MPTCP: fix lock class name family in pm_nl_create_listen_socket
  icmp: fix NULL pointer dereference in icmp_tag_validation()
  net: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error paths
  net: shaper: protect from late creation of hierarchy
  net: shaper: protect late read accesses to the hierarchy
  net: mvpp2: guard flow control update with global_tx_fc in buffer switching
  nfnetlink_osf: validate individual option lengths in fingerprints
  netfilter: nf_tables: release flowtable after rcu grace period on error
  netfilter: bpf: defer hook memory release until rcu readers are done
  net: bonding: fix NULL deref in bond_debug_rlb_hash_show
  udp_tunnel: fix NULL deref caused by udp_sock_create6 when CONFIG_IPV6=n
  net/mlx5e: Fix race condition during IPSec ESN update
  net/mlx5e: Prevent concurrent access to IPSec ASO context
  net/mlx5: qos: Restrict RTNL area to avoid a lock cycle
  ipv6: add NULL checks for idev in SRv6 paths
  NFC: nxp-nci: allow GPIOs to sleep
  net: macb: fix uninitialized rx_fs_lock
  net: macb: fix use-after-free access to PTP clock
  netdevsim: drop PSP ext ref on forward failure
  wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
  ...
2026-03-19 11:25:40 -07:00
Li Xiasong
7ab4a7c5d9 MPTCP: fix lock class name family in pm_nl_create_listen_socket
In mptcp_pm_nl_create_listen_socket(), use entry->addr.family
instead of sk->sk_family for lock class setup. The 'sk' parameter
is a netlink socket, not the MPTCP subflow socket being created.

Fixes: cee4034a3d ("mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()")
Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260319112159.3118874-1-lixiasong1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 09:37:48 -07:00
Weiming Shi
614aefe56a icmp: fix NULL pointer dereference in icmp_tag_validation()
icmp_tag_validation() unconditionally dereferences the result of
rcu_dereference(inet_protos[proto]) without checking for NULL.
The inet_protos[] array is sparse -- only about 15 of 256 protocol
numbers have registered handlers. When ip_no_pmtu_disc is set to 3
(hardened PMTU mode) and the kernel receives an ICMP Fragmentation
Needed error with a quoted inner IP header containing an unregistered
protocol number, the NULL dereference causes a kernel panic in
softirq context.

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] SMP KASAN NOPTI
 KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
 RIP: 0010:icmp_unreach (net/ipv4/icmp.c:1085 net/ipv4/icmp.c:1143)
 Call Trace:
  <IRQ>
  icmp_rcv (net/ipv4/icmp.c:1527)
  ip_protocol_deliver_rcu (net/ipv4/ip_input.c:207)
  ip_local_deliver_finish (net/ipv4/ip_input.c:242)
  ip_local_deliver (net/ipv4/ip_input.c:262)
  ip_rcv (net/ipv4/ip_input.c:573)
  __netif_receive_skb_one_core (net/core/dev.c:6164)
  process_backlog (net/core/dev.c:6628)
  handle_softirqs (kernel/softirq.c:561)
  </IRQ>

Add a NULL check before accessing icmp_strict_tag_validation. If the
protocol has no registered handler, return false since it cannot
perform strict tag validation.

Fixes: 8ed1dc44d3 ("ipv4: introduce hardened ip_no_pmtu_disc mode")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Link: https://patch.msgid.link/20260318130558.1050247-4-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 09:27:36 -07:00
Anas Iqbal
b487318496 net: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error paths
Smatch reports:
drivers/net/dsa/bcm_sf2.c:997 bcm_sf2_sw_resume() warn:
'priv->clk' from clk_prepare_enable() not released on lines: 983,990.

The clock enabled by clk_prepare_enable() in bcm_sf2_sw_resume()
is not released if bcm_sf2_sw_rst() or bcm_sf2_cfp_resume() fails.

Add the missing clk_disable_unprepare() calls in the error paths
to properly release the clock resource.

Fixes: e9ec5c3bd2 ("net: dsa: bcm_sf2: request and handle clocks")
Reviewed-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Anas Iqbal <mohd.abd.6602@gmail.com>
Link: https://patch.msgid.link/20260318084212.1287-1-mohd.abd.6602@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 09:26:40 -07:00
Linus Torvalds
e9825d1c79 Merge tag 'pm-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix an idle loop issue exposed by recent changes and a race
  condition related to device removal in the runtime PM core code:

   - Consolidate the handling of two special cases in the idle loop that
     occur when only one CPU idle state is present (Rafael Wysocki)

   - Fix a race condition related to device removal in the runtime PM
     core code that may cause a stale device object pointer to be
     dereferenced (Bart Van Assche)"

* tag 'pm-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: runtime: Fix a race condition related to device removal
  sched: idle: Consolidate the handling of two special cases
2026-03-19 08:45:34 -07:00
Linus Torvalds
d107dc8c9c Merge tag 'acpi-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fixes from Rafael Wysocki:
 "These fix an MFD child automatic modprobe issue introduced recently,
  an ACPI processor driver issue introduced by a previous fix and an
  ACPICA issue causing confusing messages regarding _DSM arguments to be
  printed:

   - Update the format of the last argument of _DSM to avoid printing
     confusing error messages in some cases (Saket Dumbre)

   - Fix MFD child automatic modprobe issue by removing a stale check
     from acpi_companion_match() (Pratap Nirujogi)

   - Prevent possible use-after-free in acpi_processor_errata_piix4()
     from occurring by rearranging the code to print debug messages
     while holding references to relevant device objects (Rafael
     Wysocki)"

* tag 'acpi-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: bus: Fix MFD child automatic modprobe issue
  ACPI: processor: Fix previous acpi_processor_errata_piix4() fix
  ACPICA: Update the format of Arg3 of _DSM
2026-03-19 08:42:59 -07:00
Paolo Abeni
e7577a06ae Merge tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:

====================
netfilter: updates for net

The following patchset contains Netfilter fixes for *net*:

1) Fix UaF when netfilter bpf link goes away while nfnetlink dumps
   current hook list, we have to wait until rcu readers are gone.

2) Fix UaF when flowtable fails to register all devices, similar
   bug as 1). From Pablo Neira Ayuso.

3) nfnetlink_osf fails to properly validate option length fields.
   From Weiming Shi.

netfilter pull request nf-26-03-19

* tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  nfnetlink_osf: validate individual option lengths in fingerprints
  netfilter: nf_tables: release flowtable after rcu grace period on error
  netfilter: bpf: defer hook memory release until rcu readers are done
====================

Link: https://patch.msgid.link/20260319093834.19933-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 15:39:33 +01:00
Paolo Abeni
9ac76f3d0b Merge tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:

====================
Aside from various small improvements/cleanups, not much:
 - cfg80211/mac80211: S1G and UHR improvements
 - hwsim: incumbent signal report test support

* tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (31 commits)
  qtnfmac: use alloc_netdev macro for single queue devices
  wifi: libertas: don't kill URBs in interrupt context
  wifi: libertas: use USB anchors for tracking in-flight URBs
  wifi: nl80211: use int for band coming from netlink
  wifi: rsi_91x_usb: do not pause rfkill polling when stopping mac80211
  wifi: mac80211: fix STA link removal during link removal
  wifi: nl80211: reject S1G/60G with HT chantype
  wifi: ieee80211: fix definition of EHT-MCS 15 in MRU
  wifi: cfg80211: check non-S1G width with S1G chandef
  wifi: cfg80211: restrict cfg80211_chandef_create() to only HT-based bands
  wifi: mac80211: don't use cfg80211_chandef_create() for default chandef
  wifi: mac80211: Remove deleted sta links in ieee80211_ml_reconf_work()
  wifi: b43: use register definitions in nphy_op_software_rfkill
  wifi: cfg80211: split control freq check from chandef check
  wifi: mac80211: always use full chanctx compatible check
  wifi: mac80211: refactor chandef tracing macros
  wifi: mac80211: validate HE 6 GHz operation when EHT is used
  wifi: nl80211: split out UHR operation information
  wifi: mwifiex: drop redundant device reference
  wifi: rt2x00: drop redundant device reference
  ...
====================

Link: https://patch.msgid.link/20260319082439.79875-3-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 15:30:20 +01:00
Rafael J. Wysocki
5cbcd6c074 Merge branches 'acpica' and 'acpi-bus'
Merge an ACPICA fix and a core ACPI support code fix for 7.0-rc5:

 - Update the format of the last argument of _DSM to avoid printing
   confusing error messages in some cases (Saket Dumbre)

 - Fix MFD child automatic modprobe issue by removing a stale check
   from acpi_companion_match() (Pratap Nirujogi)

* acpica:
  ACPICA: Update the format of Arg3 of _DSM

* acpi-bus:
  ACPI: bus: Fix MFD child automatic modprobe issue
2026-03-19 14:57:06 +01:00
Rafael J. Wysocki
9633370653 Merge branch 'pm-runtime'
Merge a fix for a race condition related to device removal (Bart Van
Assche) for 7.0-rc5.

* pm-runtime:
  PM: runtime: Fix a race condition related to device removal
2026-03-19 14:49:44 +01:00
Jakub Kicinski
d75ec7e8ba net: shaper: protect from late creation of hierarchy
We look up a netdev during prep of Netlink ops (pre- callbacks)
and take a ref to it. Then later in the body of the callback
we take its lock or RCU which are the actual protections.

The netdev may get unregistered in between the time we take
the ref and the time we lock it. We may allocate the hierarchy
after flush has already run, which would lead to a leak.

Take the instance lock in pre- already, this saves us from the race
and removes the need for dedicated lock/unlock callbacks completely.
After all, if there's any chance of write happening concurrently
with the flush - we're back to leaking the hierarchy.

We may take the lock for devices which don't support shapers but
we're only dealing with SET operations here, not taking the lock
would be optimizing for an error case.

Fixes: 93954b40f6 ("net-shapers: implement NL set and delete operations")
Link: https://lore.kernel.org/20260309173450.538026-1-p@1g4.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20260317161014.779569-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 13:47:15 +01:00
Jakub Kicinski
0f9ea7141f net: shaper: protect late read accesses to the hierarchy
We look up a netdev during prep of Netlink ops (pre- callbacks)
and take a ref to it. Then later in the body of the callback
we take its lock or RCU which are the actual protections.

This is not proper, a conversion from a ref to a locked netdev
must include a liveness check (a check if the netdev hasn't been
unregistered already). Fix the read cases (those under RCU).
Writes needs a separate change to protect from creating the
hierarchy after flush has already run.

Fixes: 4b623f9f0f ("net-shapers: implement NL get operation")
Reported-by: Paul Moses <p@1g4.org>
Link: https://lore.kernel.org/20260309173450.538026-1-p@1g4.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20260317161014.779569-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 13:47:15 +01:00
Eric Woudstra
96450df197 bridge: No DEV_PATH_BR_VLAN_UNTAG_HW for dsa foreign
In network setup as below:

             fastpath bypass
 .----------------------------------------.
/                                          \
|                        IP - forwarding    |
|                       /                \  v
|                      /                  wan ...
|                     /
|                     |
|                     |
|                   brlan.1
|                     |
|    +-------------------------------+
|    |           vlan 1              |
|    |                               |
|    |     brlan (vlan-filtering)    |
|    |               +---------------+
|    |               |  DSA-SWITCH   |
|    |    vlan 1     |               |
|    |      to       |               |
|    |   untagged    1     vlan 1    |
|    +---------------+---------------+
.         /                   \
 ----->wlan1                 lan0
       .                       .
       .                       ^
       ^                     vlan 1 tagged packets
     untagged packets

br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when
filling in from brlan.1 towards wlan1. But it should be set to
DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV
is not correct. The dsa switchdev adds it as a foreign port.

The same problem for all foreignly added dsa vlans on the bridge.

First add the vlan, trying only native devices.
If this fails, we know this may be a vlan from a foreign device.

Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG_HW
is set only when there if no foreign device involved.

Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
Link: https://patch.msgid.link/20260317110347.363875-1-ericwouds@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 13:14:00 +01:00
Paolo Abeni
0c45064487 Merge tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next
Antonio Quartulli says:

====================
Included features:
* use bitops.h API when possible
* send netlink notification in case of client float event
* implement support for asymmetric peer IDs
* consolidate memory allocations during crypto operations
* add netlink notification check in selftests
* add FW mark check in selftest

* tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next:
  ovpn: consolidate crypto allocations in one chunk
  selftests: ovpn: add test for the FW mark feature
  selftests: ovpn: check asymmetric peer-id
  ovpn: add support for asymmetric peer IDs
  selftests: ovpn: add notification parsing and matching
  ovpn: notify userspace on client float event
  ovpn: pktid: use bitops.h API
  ovpn: use correct array size to parse nested attributes in ovpn_nl_key_swap_doit
  selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3
====================

Link: https://patch.msgid.link/20260317104023.192548-1-antonio@openvpn.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 12:50:42 +01:00
Qingfang Deng
9f4960b94f l2tp: ppp: use max L2TP header size for PPP channel hdrlen
chan.hdrlen is read once at channel registration by
ppp_register_net_channel(), and used to set the PPP net device's
hard_header_len. It was set to PPPOL2TP_L2TP_HDR_SIZE_NOSEQ (6), which
is 4 bytes too small if sequence numbers are later enabled via
setsockopt(PPPOL2TP_SO_SENDSEQ), causing unnecessary skb reallocations
on the TX path.

The setsockopt handler attempted to change netdev's hard_header_len by
updating chan.hdrlen, but the PPP layer never re-reads it after the
registration, so the update had no effect.

To avoid the unnecessary reallocations, set chan.hdrlen to
PPPOL2TP_L2TP_HDR_SIZE_SEQ (10) unconditionally at registration and
remove the ineffective update in the setsockopt callback.

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20260317054141.524879-1-dqfext@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 11:00:04 +01:00
Muhammad Hammad Ijaz
8a63baadf0 net: mvpp2: guard flow control update with global_tx_fc in buffer switching
mvpp2_bm_switch_buffers() unconditionally calls
mvpp2_bm_pool_update_priv_fc() when switching between per-cpu and
shared buffer pool modes. This function programs CM3 flow control
registers via mvpp2_cm3_read()/mvpp2_cm3_write(), which dereference
priv->cm3_base without any NULL check.

When the CM3 SRAM resource is not present in the device tree (the
third reg entry added by commit 60523583b0 ("dts: marvell: add CM3
SRAM memory to cp11x ethernet device tree")), priv->cm3_base remains
NULL and priv->global_tx_fc is false. Any operation that triggers
mvpp2_bm_switch_buffers(), for example an MTU change that crosses
the jumbo frame threshold, will crash:

  Unable to handle kernel NULL pointer dereference at
  virtual address 0000000000000000
  Mem abort info:
    ESR = 0x0000000096000006
    EC = 0x25: DABT (current EL), IL = 32 bits
  pc : readl+0x0/0x18
  lr : mvpp2_cm3_read.isra.0+0x14/0x20
  Call trace:
   readl+0x0/0x18
   mvpp2_bm_pool_update_fc+0x40/0x12c
   mvpp2_bm_pool_update_priv_fc+0x94/0xd8
   mvpp2_bm_switch_buffers.isra.0+0x80/0x1c0
   mvpp2_change_mtu+0x140/0x380
   __dev_set_mtu+0x1c/0x38
   dev_set_mtu_ext+0x78/0x118
   dev_set_mtu+0x48/0xa8
   dev_ifsioc+0x21c/0x43c
   dev_ioctl+0x2d8/0x42c
   sock_ioctl+0x314/0x378

Every other flow control call site in the driver already guards
hardware access with either priv->global_tx_fc or port->tx_fc.
mvpp2_bm_switch_buffers() is the only place that omits this check.

Add the missing priv->global_tx_fc guard to both the disable and
re-enable calls in mvpp2_bm_switch_buffers(), consistent with the
rest of the driver.

Fixes: 3a616b92a9 ("net: mvpp2: Add TX flow control support for jumbo frames")
Signed-off-by: Muhammad Hammad Ijaz <mhijaz@amazon.com>
Reviewed-by: Gunnar Kudrjavets <gunnarku@amazon.com>
Link: https://patch.msgid.link/20260316193157.65748-1-mhijaz@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 10:31:19 +01:00
Weiming Shi
dbdfaae960 nfnetlink_osf: validate individual option lengths in fingerprints
nfnl_osf_add_callback() validates opt_num bounds and string
NUL-termination but does not check individual option length fields.
A zero-length option causes nf_osf_match_one() to enter the option
matching loop even when foptsize sums to zero, which matches packets
with no TCP options where ctx->optp is NULL:

 Oops: general protection fault
 KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
 RIP: 0010:nf_osf_match_one (net/netfilter/nfnetlink_osf.c:98)
 Call Trace:
  nf_osf_match (net/netfilter/nfnetlink_osf.c:227)
  xt_osf_match_packet (net/netfilter/xt_osf.c:32)
  ipt_do_table (net/ipv4/netfilter/ip_tables.c:293)
  nf_hook_slow (net/netfilter/core.c:623)
  ip_local_deliver (net/ipv4/ip_input.c:262)
  ip_rcv (net/ipv4/ip_input.c:573)

Additionally, an MSS option (kind=2) with length < 4 causes
out-of-bounds reads when nf_osf_match_one() unconditionally accesses
optp[2] and optp[3] for MSS value extraction.  While RFC 9293
section 3.2 specifies that the MSS option is always exactly 4
bytes (Kind=2, Length=4), the check uses "< 4" rather than
"!= 4" because lengths greater than 4 do not cause memory
safety issues -- the buffer is guaranteed to be at least
foptsize bytes by the ctx->optsize == foptsize check.

Reject fingerprints where any option has zero length, or where an MSS
option has length less than 4, at add time rather than trusting these
values in the packet matching hot path.

Fixes: 11eeef41d5 ("netfilter: passive OS fingerprint xtables match")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-03-19 10:27:07 +01:00
Pablo Neira Ayuso
d73f4b53aa netfilter: nf_tables: release flowtable after rcu grace period on error
Call synchronize_rcu() after unregistering the hooks from error path,
since a hook that already refers to this flowtable can be already
registered, exposing this flowtable to packet path and nfnetlink_hook
control plane.

This error path is rare, it should only happen by reaching the maximum
number hooks or by failing to set up to hardware offload, just call
synchronize_rcu().

There is a check for already used device hooks by different flowtable
that could result in EEXIST at this late stage. The hook parser can be
updated to perform this check earlier to this error path really becomes
rarely exercised.

Uncovered by KASAN reported as use-after-free from nfnetlink_hook path
when dumping hooks.

Fixes: 3b49e2e94e ("netfilter: nf_tables: add flow table netlink frontend")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-03-19 10:26:31 +01:00
Florian Westphal
24f90fa399 netfilter: bpf: defer hook memory release until rcu readers are done
Yiming Qian reports UaF when concurrent process is dumping hooks via
nfnetlink_hooks:

BUG: KASAN: slab-use-after-free in nfnl_hook_dump_one.isra.0+0xe71/0x10f0
Read of size 8 at addr ffff888003edbf88 by task poc/79
Call Trace:
 <TASK>
 nfnl_hook_dump_one.isra.0+0xe71/0x10f0
 netlink_dump+0x554/0x12b0
 nfnl_hook_get+0x176/0x230
 [..]

Defer release until after concurrent readers have completed.

Reported-by: Yiming Qian <yimingqian591@gmail.com>
Fixes: 84601d6ee6 ("bpf: add bpf_link support for BPF_NETFILTER programs")
Signed-off-by: Florian Westphal <fw@strlen.de>
2026-03-19 10:26:31 +01:00
Simon Baatz
96a584db75 selftests/net: packetdrill: improve tcp_rcv_neg_window.pkt
The test depends on accepting a packet that is larger than the
advertised window and that does not trigger an immediate ACK.

Previously, the test might still pass even if kernel behavior changed
unexpectedly. Add assertions verifying that the large packet was
accepted and no ACK was sent.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Link: https://patch.msgid.link/20260316-improve_tcp_neg_usable_wnd_test-v1-1-f16d5e365107@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 10:26:25 +01:00
Roi L
dee55bc7cb qtnfmac: use alloc_netdev macro for single queue devices
alloc_netdev is a macro for single queue devices, so there's no need to
call alloc_netdev_mqs with a single tx/rx queue.

Signed-off-by: Roi L <roeilev321_@outlook.com>
Link: https://patch.msgid.link/SN6PR05MB58064E57FE979CE7B2BF7EF3DD42A@SN6PR05MB5806.namprd05.prod.outlook.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-19 09:08:09 +01:00
Heitor Alves de Siqueira
7c5c2b661b wifi: libertas: don't kill URBs in interrupt context
Serialization for the TX path was enforced by calling
usb_kill_urb()/usb_kill_anchored_urbs(), to prevent transmission before
a previous URB was completed. usb_tx_block() can be called from
interrupt context (e.g. in the HCD giveback path), so we can't always
use it to kill in-flight URBs.

Prevent sleeping during interrupt context by checking the tx_submitted
anchor for existing URBs. We now return -EBUSY, to indicate there's
a pending request.

Reported-by: syzbot+74afbb6355826ffc2239@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=74afbb6355826ffc2239
Fixes: d66676e6ca ("wifi: libertas: fix WARNING in usb_tx_block")
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Link: https://patch.msgid.link/20260313-libertas-usb-anchors-v1-2-915afbe988d7@igalia.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-19 09:07:55 +01:00
Heitor Alves de Siqueira
a57f35fc19 wifi: libertas: use USB anchors for tracking in-flight URBs
The libertas driver currently handles URB lifecycles manually, which
makes it non-trivial to check if specific URBs are pending or not. Add
anchors for TX/RX URBs, and use those to track in-flight requests.

Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Link: https://patch.msgid.link/20260313-libertas-usb-anchors-v1-1-915afbe988d7@igalia.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-19 09:07:55 +01:00
Johannes Berg
f10ebd136d wifi: nl80211: use int for band coming from netlink
This was pointed out before, but there are issues with just
removing the <0 check since enum representation isn't fixed,
nla_type() returns int but really can only return small
non-negative values, etc. Now newer versions of sparse are
also starting to warn on it. Just use int for the band var.

Link: https://patch.msgid.link/20260316123050.8c2d9f3426a0.I86acfa785982993fbffd148cc59049991bd6158f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-19 09:07:40 +01:00
Ville Nummela
777d8ba5aa wifi: rsi_91x_usb: do not pause rfkill polling when stopping mac80211
Removing rsi_91x USB adapter could cause rtnetlink to lock up.
When rsi_mac80211_stop is called, wiphy_lock is locked. Call to
wiphy_rfkill_stop_polling would wait until the work queue has
finished, but because the work queue waits for wiphy_lock, that
would never happen.

Moving the call to rsi_disconnect avoids the lock up.

Signed-off-by: Ville Nummela <ville.nummela@kempower.com>
Link: https://patch.msgid.link/20260318081912.87744-1-ville.nummela@kempower.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-19 09:07:16 +01:00
Johannes Berg
eb092b188f wifi: mac80211: fix STA link removal during link removal
ieee80211_sta_free_link() only frees the link and doesn't
unhash it, so it can't be used here. Instead this needs
to use ieee80211_sta_remove_link(), which unhashes it. An
argument against it was that it also calls the driver and
that already happened, but calls to the driver removing a
link that's already removed are suppressed, so that's not
actually an issue. Use it to fix the hashtable.

Reported-and-tested-by: Jouni Malinen <j@w1.fi>
Fixes: 84674b03d8 ("wifi: mac80211: Remove deleted sta links in ieee80211_ml_reconf_work()")
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260318180622.9240067117e9.I45fb2b7f04d75e48d2f3e9c6650ef9f54a314f5b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-03-19 09:06:49 +01:00
Jakub Kicinski
a7fb05cbb8 Merge branch 'add-ethtool-coalesce_rx_cqe_frames-nsecs-and-use-it-in-mana-driver'
Haiyang Zhang says:

====================
add ethtool COALESCE_RX_CQE_FRAMES/NSECS and use it in MANA driver

Add two parameters for drivers supporting Rx CQE Coalescing.

ETHTOOL_A_COALESCE_RX_CQE_FRAMES:
Maximum number of frames that can be coalesced into a CQE or
writeback.

ETHTOOL_A_COALESCE_RX_CQE_NSECS:
Max time in nanoseconds after the first packet arrival in a
coalesced CQE or writeback to be sent.

Also implement it in MANA driver with the new parameter and
counters.
====================

Link: https://patch.msgid.link/20260317191826.1346111-1-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 20:01:14 -07:00
Haiyang Zhang
d01440e10a net: mana: Add ethtool counters for RX CQEs in coalesced type
For RX CQEs with type CQE_RX_COALESCED_4, to measure the coalescing
efficiency, add counters to count how many contains 2, 3, 4 packets
respectively.
Also, add a counter for the error case of first packet with length == 0.

Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-4-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 20:01:10 -07:00
Haiyang Zhang
c2fe3ff3d6 net: mana: Add support for RX CQE Coalescing
Our NIC can have up to 4 RX packets on 1 CQE. To support this feature,
check and process the type CQE_RX_COALESCED_4. The default setting is
disabled, to avoid possible regression on latency.

And, add ethtool handler to switch this feature. To turn it on, run:
  ethtool -C <nic> rx-cqe-frames 4
To turn it off:
  ethtool -C <nic> rx-cqe-frames 1

The rx-cqe-nsec is the time out value in nanoseconds after the first
packet arrival in a coalesced CQE to be sent. It's read-only for this
NIC.

Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-3-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 20:01:10 -07:00
Haiyang Zhang
dc3d720e12 net: ethtool: add ethtool COALESCE_RX_CQE_FRAMES/NSECS
Add two parameters for drivers supporting Rx CQE coalescing /
descriptor writeback.

ETHTOOL_A_COALESCE_RX_CQE_FRAMES:
Maximum number of frames that can be coalesced into a CQE or
writeback.

ETHTOOL_A_COALESCE_RX_CQE_NSECS:
Max time in nanoseconds after the first packet arrival in a
coalesced CQE or writeback to be sent.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-2-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 20:01:10 -07:00
Saeed Mahameed
04cd075557 net/mlx5e: Remove unused field in mlx5e_flow_steering struct
Not used in mlx5e, clean it up.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260317104548.15697-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:29:01 -07:00
Bobby Eshleman
3883c2b509 selftests/vsock: auto-detect kernel for guest VMs
When running vmtest.sh inside a nested VM the running kernel may not be
installed on the filesystem at the standard /boot/ or /usr/lib/modules/
paths.

Previously, this would cause vng to fail with "does not exist" since it
could not find the kernel image. Instead, this patch uses --dry-run to
detect if the kernel is available. If not, then we fall back to the
kernel in the kernel source tree. If that fails, then we die.

This way runners, like NIPA, can use vng --run arch/x86/boot/bzImage to
setup an outer VM, and vmtest.sh will still do the right thing setting
up the inner VM.

Due to job control issues in vng, a workaround is used to prevent 'make
kselftest TARGETS=vsock' from hanging until test timeout. A PR has been
placed upstream to solve the issue in vng:

https://github.com/arighi/virtme-ng/pull/453

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260316-vsock-vmtest-autodetect-kernel-v2-1-5eec7b4831f8@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:28:23 -07:00
Jakub Kicinski
7c46bd845d Merge tag 'wireless-2026-03-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Just a few updates:
 - cfg80211:
   - guarantee pmsr work is cancelled
 - mac80211:
   - reject TDLS operations on non-TDLS stations
   - fix crash in AP_VLAN bandwidth change
   - fix leak or double-free on some TX preparation
     failures
   - remove keys needed for beacons _after_ stopping
     those
   - fix debugfs static branch race
   - avoid underflow in inactive time
   - fix another NULL dereference in mesh on invalid
     frames
 - ti/wlcore: avoid infinite realloc loop

* tag 'wireless-2026-03-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
  wifi: wlcore: Return -ENOMEM instead of -EAGAIN if there is not enough headroom
  wifi: mac80211: fix NULL deref in mesh_matches_local()
  wifi: mac80211: check tdls flag in ieee80211_tdls_oper
  wifi: cfg80211: cancel pmsr_free_wk in cfg80211_pmsr_wdev_down
  wifi: mac80211: Fix static_branch_dec() underflow for aql_disable.
  mac80211: fix crash in ieee80211_chan_bw_change for AP_VLAN stations
  wifi: mac80211: use jiffies_delta_to_msecs() for sta_info inactive times
  wifi: mac80211: remove keys after disabling beaconing
  wifi: mac80211_hwsim: fully initialise PMSR capabilities
====================

Link: https://patch.msgid.link/20260318172515.381148-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:25:41 -07:00
Jakub Kicinski
76eea68d5f Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Tariq Toukan says:

====================
mlx5-next updates 2026-03-17

The following pull-request contains common mlx5 updates

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Expose MLX5_UMR_ALIGN definition
  {net/RDMA}/mlx5: Add LAG demux table API and vport demux rules
  net/mlx5: Add VHCA RX flow destination support for FW steering
  net/mlx5: LAG, replace mlx5_get_dev_index with LAG sequence number
  net/mlx5: E-switch, modify peer miss rule index to vhca_id
  net/mlx5: LAG, use xa_alloc to manage LAG device indices
  net/mlx5: LAG, replace pf array with xarray
  net/mlx5: Add silent mode set/query and VHCA RX IFC bits
  net/mlx5: Add IFC bits for shared headroom pool PBMC support
  net/mlx5: Expose TLP emulation capabilities
  net/mlx5: Add TLP emulation device capabilities
====================

Link: https://patch.msgid.link/20260317075844.12066-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:08:50 -07:00
Eric Biggers
d5516452a3 qed: Reimplement qed_mcast_bin_from_mac() using library functions
The calculation done by qed_calc_crc32c() is the standard
least-significant-bit-first CRC-32C except it uses
most-significant-bit-first order for the actual CRC variable.  That is
equivalent to bit-reflecting the input and output CRC.  Replace it with
equivalent calls to the corresponding library functions.

Tested with a simple userspace program which tested that the old and new
implementations of qed_mcast_bin_from_mac() produce the same outputs.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260316221453.66078-1-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:07:37 -07:00
Jakub Kicinski
4455a30b04 Merge branch 'net-mlx5-support-ptm-on-arm-architecture'
Tariq Toukan says:

====================
net/mlx5: Support PTM on ARM architecture

This series by Carolina refactors mlx5 crosststamp initialization and
enables cross-timestamp support on ARM.
====================

Link: https://patch.msgid.link/20260316133607.8738-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:05:17 -07:00
Carolina Jubran
96aca5efec net/mlx5: Support cross-timestamping on ARM architectures
Extend cross-timestamp support for ARM systems that implement the ARM
architected timer.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260316133607.8738-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:05:14 -07:00
Carolina Jubran
f87ca3b905 net/mlx5: Move crosststamp setup into helper function
Move the crosststamp registration logic into a dedicated helper,
mlx5_init_crosststamp().

This prepares the code for a follow-up patch around PTM handling.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260316133607.8738-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:05:14 -07:00
Jakub Kicinski
2839841f8e Merge branch 'net-mdio-gpio-remove-unneeded-headers'
Bartosz Golaszewski says:

====================
net: mdio-gpio: remove unneeded headers

This removes linux/mdio-gpio.h and linux/platform_data/mdio-gpio.h as
they are not needed due to the symbols either being used by the
mdio-gpio module alone or not used at all.
====================

Link: https://patch.msgid.link/20260316-gpio-mdio-hdr-cleanup-v1-0-2df696f74728@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 18:30:00 -07:00
Bartosz Golaszewski
356d4fbcf3 net: mdio-gpio: remove linux/platform_data/mdio-gpio.h
Nobody defines struct mdio_gpio_platform_data. Remove platform data
support from mdio-gpio and drop the header.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260316-gpio-mdio-hdr-cleanup-v1-2-2df696f74728@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 18:29:57 -07:00
Bartosz Golaszewski
9c6b4009da net: mdio-gpio: remove linux/mdio-gpio.h
The three defines from the linux/mdio-gpio.h header are only used in the
mdio-gpio module. There's no reason to have them in a public header.
Move them into the driver and remove mdio-gpio.h.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260316-gpio-mdio-hdr-cleanup-v1-1-2df696f74728@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 18:29:57 -07:00
Jakub Kicinski
ec03006a11 Merge branch 'remove-kconfig-sysmbol-mdio_bus'
Heiner Kallweit says:

====================
remove Kconfig sysmbol MDIO_BUS

MDIO-based regmap is the last user of config symbol MDIO_BUS.
MDIO access needs a MII bus, which requires PHYLIB for the provider part.
Therefore make REGMAP_MDIO depend on PHYLIB, what allows to remove
config symbol MDIO_BUS.
====================

Link: https://patch.msgid.link/bc63cf87-3dba-4ab6-9c84-caa7357c3273@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 18:27:14 -07:00
Heiner Kallweit
91283bd5b0 net: phy: remove Kconfig symbol MDIO_BUS
After usage of config symbol MDIO_BUS has been removed from REGMAP_MIO
as last user, the symbol can be removed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/9cdf83e9-470d-45da-8efe-ace0decf0204@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 18:27:12 -07:00
Heiner Kallweit
e611a97032 regmap: mdio: make it depend on PHYLIB
MDIO-based regmap is the last user of config symbol MDIO_BUS.
MDIO access needs a MII bus, which requires PHYLIB for the provider part.
Therefore make REGMAP_MDIO depend on PHYLIB, what allows to remove
config symbol MDIO_BUS in a follow-up patch.

Note: After c5a219395b ("regmap: Move selecting for REGMAP_MDIO and
      REGMAP_IRQ") switching to "depends on" should be fine, w/o risk
      of a circular dependency.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/a21a3b3e-272e-4c61-986e-48a2cb3421d9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 18:27:12 -07:00