Commit Graph

82745 Commits

Author SHA1 Message Date
Avraham Stern
853800c746 wifi: nl80211/cfg80211: support operating as RSTA in PMSR FTM request
Add an option to operate as the RSTA in an FTM measurement request.
When requested, the device will dwell on the requested channel until
the peer starts the FTM negotiation. This option is only valid for
trigger-based/non trigger-based measurement with LMR feedback which
will allow the RSTA to receive the results of the measurement.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260111190221.1f95fc0afab4.Iae2d32783b8e7c4a29089fec0f4c6bce94d303cc@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27 13:40:38 +01:00
Avraham Stern
cfd46d1c6f wifi: nl80211/cfg80211: add negotiated burst period to FTM result
The FTM result includes some of the periodic measurement negotiated
parameters (like the burst duration and number of bursts), but it
doesn't include the burst period. Add it to the FTM result
notification.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260111190221.e0778f86edef.I3c98c1933eb639963bc3ffdef81a8788b59f2188@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27 13:40:36 +01:00
Avraham Stern
853ce6943c wifi: nl80211/cfg80211: clarify periodic FTM parameters for non-EDCA based ranging
Periodic FTM request attributes are defined based on the periodic
parameters used in EDCA-based ranging negotiation. However, non-EDCA
based ranging (trigger-based/non-trigger-based) does not include
periodic parameters in the negotiation protocol, even though upper
layers may still request periodic measurements.

Clarify the semantics of periodic ranging attributes when used with
non-EDCA based ranging.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260111190221.b89cb3f68e1a.I7a9d8c6d1c66c77f1b43120a841101c96c3f19ad@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27 13:40:30 +01:00
Avraham Stern
86c6b6e4d1 wifi: nl80211/cfg80211: add new FTM capabilities
Add new capabilities to the PMSR FTM capabilities list. The new
capabilities include 6 GHz support, supported number of spatial streams
and supported number of LTF repetitions.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Tested-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260111190221.bf43785c18f6.Ic98cf9790ddee84bf88e5720b93c46c23af3c96c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27 13:40:25 +01:00
Johannes Berg
1e1dd9eeaa wifi: mac80211: mark iface work SKBs as consumed
Using kfree_skb() here is misleading when looking at
traces, since these frames have been handled. Use
consume_skb() instead.

Link: https://patch.msgid.link/20260116092115.1db534bdc12c.Ic0adae06684a6871144398d15cf7700c57620baa@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:22:35 +01:00
Johannes Berg
58dc87d839 wifi: mac80211: remove RX_DROP
Since it's hard to figure out what RX_DROP means when looking
at traces that drop packets in mac80211, add more specific drop
reasons and remove RX_DROP entirely.

Link: https://patch.msgid.link/20260116092025.79d995e87026.I7cde413988f7a382c551cd1c1e2b05a52ec71755@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:21:47 +01:00
Miri Korenblit
8a42938a28 wifi: nl80211: ignore cluster id after NAN started
After NAN was started, cluster id updates from the user space should not
happen, since the device already started a cluster with the
previousely provided id.

Since NL80211_CMD_CHANGE_NAN_CONFIG requires to set the full NAN
configuration, we can't require that NL80211_NAN_CONF_CLUSTER_ID won't
be included in this command, and keeping the last confgiured value just
to be able to compare it against the new one seems a bit overkill.

Therefore, just ignore cluster id in this command and clarify the
documentation.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107142229.fb55e5853269.I10d18c8f69d98b28916596d6da4207c15ea4abb5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:18:49 +01:00
Miri Korenblit
36e83df3a6 wifi: cfg80211: cleanup cluster_id when stopping NAN
When NAN is stopped, cluster_id should be set to 0 to indicate that we
are not part of any cluster.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107142229.9ccb700797ec.I890ac852be6ca0093995655d987ca5c28a26ce3d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:18:49 +01:00
Miri Korenblit
f816141cba wifi: cfg80211: limit NAN func management APIs to offloaded DE
A driver that declared that it has userspace DE should not call NAN func
related APIs such as cfg80211_nan_match and cfg80211_nan_func_terminated
Check and warn in such a case, as this indicates a driver bug.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107141549.86fa96c75211.I8fbb0506377170dd7b41234f20bcba057951dd1e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:18:25 +01:00
Miri Korenblit
e1696c8bd0 wifi: cfg80211: stop NAN and P2P in cfg80211_leave
Seems that there is an assumption that this function should be called
only for netdev interfaces, but it can also be called in suspend, or
from nl80211_netlink_notify (indirectly).
Note that the documentation of NL80211_ATTR_SOCKET_OWNER explicitly
says that NAN interfaces would be destroyed as well in the
nl80211_netlink_notify case.

Fix this by also stopping P2P and NAN.

Fixes: cb3b7d8765 ("cfg80211: add start / stop NAN commands")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107140430.dab142cbef0b.I290cc47836d56dd7e35012ce06bec36c6da688cd@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:18:14 +01:00
Miri Korenblit
e69fda4d07 wifi: cfg80211: allow only one NAN interface, also in multi radio
According to Wi-Fi Aware (TM) 4.0 specification 2.8, A NAN device can
have one NAN management interface. This applies also to multi radio
devices.
The current code allows a driver to support more than one NAN interface,
if those are not in the same radio.

Fix it.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107135129.fdaecec0fe8a.I246b5ba6e9da3ec1481ff197e47f6ce0793d7118@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-19 10:18:05 +01:00
Kavita Kavita
3d2515fdd3 wifi: mac80211: add support for encryption/decryption of (Re)Association frames
Currently, mac80211 does not encrypt or decrypt (Re)Association frames
(Request and Response) because temporal keys are not yet available at
that stage.

With extensions from IEEE P802.11bi, e.g. EPPKE, temporal keys can be
established before association. This enables the encryption and
decryption of (Re)Association Request/Response frames.

Add support to unset the IEEE80211_TX_INTFL_DONT_ENCRYPT flag when
the peer is marked as an Enhanced Privacy Protection (EPP) peer and
encryption keys are available for the connection in non-AP STA mode,
allowing secure transmission of (Re)Association Request frames.

Drop unprotected (Re)Association Request/Response frames received from
an EPP peer.

Co-developed-by: Sai Pratyusha Magam <quic_smagam@quicinc.com>
Signed-off-by: Sai Pratyusha Magam <quic_smagam@quicinc.com>
Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-9-kavita.kavita@oss.qualcomm.com
[remove useless parentheses]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:55:38 +01:00
Kavita Kavita
0e7e509963 wifi: mac80211: add support for EPPKE authentication protocol in non-AP STA mode
Add support for the Enhanced Privacy Protection Key Exchange (EPPKE)
authentication protocol in non-AP STA mode, as specified in
"IEEE P802.11bi/D3.0, 12.16.9".

EPPKE is an RSNA authentication protocol that operates using
Pre-Association Security Negotiation (PASN) procedures. It consists
of three Authentication frames with transaction sequence numbers 1, 2,
and 3. The first and third from the non-AP STA and the second from the
AP STA.

Extend mac80211 to process EPPKE Authentication frames during the
authentication phase. Currently, mac80211 processes only frames with
the expected transaction number. In the case of EPPKE, process the
Authentication frame from the AP only if the transaction number matches
the expected value, which is 2.

After receiving the final Authentication frame with transaction number 3
from the non-AP STA, it indicates that both the non-AP STA and the AP
confirm there are no issues with authentication. Since this is the final
confirmation frame to send out, mark the state as authenticated in
mac80211.

For EPPKE authentication, the Multi-Link element (MLE) must be included
in the Authentication frame body by userspace in case of MLO connection.
If the MLE is not present, reject the Authentication frame.

Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-8-kavita.kavita@oss.qualcomm.com
[remove a single stray space]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:52:50 +01:00
Kavita Kavita
5329ed8fce wifi: mac80211: Check for MLE before appending in Authentication frame
Currently, in MLO connections, userspace constructs most of the
Authentication frame body, excluding the Multi-Link element (MLE),
which mac80211 appends later in ieee80211_send_auth(). At present,
mac80211 always adds the MLE itself, since userspace
(e.g. wpa_supplicant) does not yet include it.

However, for new authentication protocols such as Enhanced Privacy
Protection Key Exchange (EPPKE), as specified in
"IEEE P802.11bi/D3.0 section 12.16.9", the MLE must be included in
userspace so that the Message Integrity Code (MIC) can be computed
correctly over the complete frame body. Table 9-71 specifies that
the MIC is mandatory. If mac80211 appends the MLE again, the
Authentication frame becomes invalid.

Add a check in ieee80211_send_auth() to detect whether the MLE is
already present in the Authentication frame body before appending.
Skip the append if the MLE exists, otherwise add it as before.

Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-7-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:16 +01:00
Kavita Kavita
63e7e3b643 wifi: mac80211: allow key installation before association
Currently, mac80211 allows key installation only after association
completes. However, Enhanced Privacy Protection Key Exchange (EPPKE)
requires key installation before association to enable encryption and
decryption of (Re)Association Request and Response frames.

Add support to install keys prior to association when the peer is an
Enhanced Privacy Protection (EPP) peer that requires encryption and
decryption of (Re)Association Request and Response frames.

Introduce a new boolean parameter "epp_peer" in the "ieee80211_sta"
profile to indicate that the peer supports the Enhanced Privacy
Protection Key Exchange (EPPKE) protocol. For non-AP STA mode, it
is set when the authentication algorithm is WLAN_AUTH_EPPKE during
station profile initialization. For AP mode, it is set during
NL80211_CMD_NEW_STA and NL80211_CMD_ADD_LINK_STA.

When "epp_peer" parameter is set, mac80211 now accepts keys before
association and enables encryption of the (Re)Association
Request/Response frames.

Co-developed-by: Sai Pratyusha Magam <sai.magam@oss.qualcomm.com>
Signed-off-by: Sai Pratyusha Magam <sai.magam@oss.qualcomm.com>
Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-6-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:16 +01:00
Sai Pratyusha Magam
6ee3a22c61 wifi: nl80211: Add support for EPP peer indication
Introduce a new netlink attribute NL80211_ATTR_EPP_PEER
to be used with NL80211_CMD_NEW_STA and
NL80211_CMD_ADD_LINK_STA for the userspace to indicate
that a non-AP STA is an Enhanced Privacy Protection (EPP)
peer.

Co-developed-by: Rohan Dutta <quic_drohan@quicinc.com>
Signed-off-by: Rohan Dutta <quic_drohan@quicinc.com>
Signed-off-by: Sai Pratyusha Magam <sai.magam@oss.qualcomm.com>
Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-5-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:16 +01:00
Kavita Kavita
dc54de8db6 wifi: cfg80211: add support for key configuration before association
Currently, cfg80211 does not allow key installation, removal, or
modification prior to association in non-AP STA mode. However,
Enhanced Privacy Protection Key Exchange (EPPKE) requires encryption
keys to be managed before association.

Add support to manage keys before association in non-AP STA mode when
the NL80211_EXT_FEATURE_ASSOC_FRAME_ENCRYPTION feature flag is set.
If the flag is not set, reject the encryption keys.

Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-4-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:16 +01:00
Ainy Kumari
f29c852149 wifi: cfg80211: add support for EPPKE Authentication Protocol
Add an extended feature flag NL80211_EXT_FEATURE_EPPKE to allow a
driver to indicate support for the Enhanced Privacy Protection Key
Exchange (EPPKE) authentication protocol in non-AP STA mode, as
defined in "IEEE P802.11bi/D3.0, 12.16.9".

In case of SME in userspace, the Authentication frame body is prepared
in userspace while the driver finalizes the Authentication frame once
it receives the required fields and elements. The driver indicates
support for EPPKE using the extended feature flag so that userspace
can initiate EPPKE authentication.

When the feature flag is set, process EPPKE Authentication frames from
userspace in non-AP STA mode. If the flag is not set, reject EPPKE
Authentication frames.

Define a new authentication type NL80211_AUTHTYPE_EPPKE for EPPKE.

Signed-off-by: Ainy Kumari <ainy.kumari@oss.qualcomm.com>
Co-developed-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Signed-off-by: Kavita Kavita <kavita.kavita@oss.qualcomm.com>
Link: https://patch.msgid.link/20260114111900.2196941-2-kavita.kavita@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-14 14:34:15 +01:00
Lachlan Hodges
24a5798567 wifi: cfg80211: don't apply HT flags to S1G channels
HT flags don't really make sense when applied to S1G channels
especially given the bandwidths both used for calculations and
conveyed (i.e 20MHz). Similarly with the 80/160/..MHz channels,
each bonded subchannel is validated individually within
cfg80211_s1g_usable(), so the regulatory validation is similarly
redundant. Additionally, usermode application output (such as iwinfo
below) doesn't particularly make sense when enumerating S1G channels:

before:

925.500 MHz (Band: 900 MHz, Channel 47) [NO_HT40+, NO_HT40-, NO_16MHZ]
926.500 MHz (Band: 900 MHz, Channel 49) [NO_HT40+, NO_HT40-, NO_16MHZ]
927.500 MHz (Band: 900 MHz, Channel 51) [NO_HT40+, NO_HT40-, NO_16MHZ, NO_PRIMARY]

after:

925.500 MHz (Band: 900 MHz, Channel 47) [NO_16MHZ]
926.500 MHz (Band: 900 MHz, Channel 49) [NO_16MHZ]
927.500 MHz (Band: 900 MHz, Channel 51) [NO_16MHZ, NO_PRIMARY]

Don't process the S1G band when applying HT flags as both the regulatory
component is redundant and the flags don't make sense for S1G channels.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20260113030934.18726-1-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-13 10:44:26 +01:00
Jakub Kicinski
669aa3e3fa Merge tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:

====================
First set of changes for the current -next cycle, of note:

 - ath12k gets an overhaul to support multi-wiphy device
   wiphy and pave the way for future device support in
   the same driver (rather than splitting to ath13k)

 - mac80211 gets some better iteration macros

* tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (120 commits)
  wifi: mac80211: remove width argument from ieee80211_parse_bitrates
  wifi: mac80211_hwsim: remove NAN by default
  wifi: mac80211: improve station iteration ergonomics
  wifi: mac80211: improve interface iteration ergonomics
  wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel
  wifi: mac80211: unexport ieee80211_get_bssid()
  wl1251: Replace strncpy with strscpy in wl1251_acx_fw_version
  wifi: iwlegacy: 3945-rs: remove redundant pointer check in il3945_rs_tx_status() and il3945_rs_get_rate()
  wifi: mac80211: don't send an unused argument to ieee80211_check_combinations
  wifi: libertas: fix WARNING in usb_tx_block
  wifi: mwifiex: Allocate dev name earlier for interface workqueue name
  wifi: wlcore: sdio: Use pm_ptr instead of #ifdef CONFIG_PM
  wifi: cfg80211: Fix use_for flag update on BSS refresh
  wifi: brcmfmac: rename function that frees vif
  wifi: brcmfmac: fix/add kernel-doc comments
  wifi: mac80211: Update csa_finalize to use link_id
  wifi: cfg80211: add cfg80211_stop_link() for per-link teardown
  wifi: ath12k: Skip DP peer creation for scan vdev
  wifi: ath12k: move firmware stats request outside of atomic context
  wifi: ath12k: add the missing RCU lock in ath12k_dp_tx_free_txbuf()
  ...
====================

Link: https://patch.msgid.link/20260112185836.378736-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-12 17:02:02 -08:00
Miri Korenblit
46e7ced3ef wifi: mac80211: remove width argument from ieee80211_parse_bitrates
The width parameter in ieee80211_parse_bitrates() is unused. Remove it.
While at it, use the already fetched sband pointer as an argument
instead of dereferencing it once again.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260108143257.d13dbbda93f0.Ie70b24af583e3812883b4004ce227e7af1646855@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-12 19:48:41 +01:00
Johannes Berg
f813117f20 wifi: mac80211: improve station iteration ergonomics
Right now, the only way to iterate stations is to declare an
iterator function, possibly data structure to use, and pass all
that to the iteration helper function. This is annoying, and
there's really no inherent need for it.

Add a new for_each_station() macro that does the iteration in
a more ergonomic way. To avoid even more exported functions, do
the old ieee80211_iterate_stations_mtx() as an inline using the
new way, which may also let the compiler optimise it a bit more,
e.g. via inlining the iterator function.

Link: https://patch.msgid.link/20260108143431.d2b641f6f6af.I4470024f7404446052564b15bcf8b3f1ada33655@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-12 19:48:17 +01:00
Johannes Berg
6b3bafa2bd wifi: mac80211: improve interface iteration ergonomics
Right now, the only way to iterate interfaces is to declare an
iterator function, possibly data structure to use, and pass all
that to the iteration helper function. This is annoying, and
there's really no inherent need for it, except it was easier to
implement with the iflist mutex, but that's not used much now.

Add a new for_each_interface() macro that does the iteration in
a more ergonomic way. To avoid even more exported functions, do
the old ieee80211_iterate_active_interfaces_mtx() as an inline
using the new way, which may also let the compiler optimise it
a bit more, e.g. via inlining the iterator function.

Also provide for_each_active_interface() for the common case of
just iterating active interfaces.

Link: https://patch.msgid.link/20260108143431.f2581e0c381a.Ie387227504c975c109c125b3c57f0bb3fdab2835@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-12 19:48:17 +01:00
Lachlan Hodges
e1cbdf78f6 wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel
When sending a channel ensure we include the IEEE80211_CHAN_S1G_NO_PRIMARY
flag.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20260109081439.3168-1-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-12 19:47:35 +01:00
Johannes Berg
391234eb48 wifi: mac80211: unexport ieee80211_get_bssid()
This is only used within mac80211, and not even declared in
a public header file. Don't export it.

Link: https://patch.msgid.link/20260109095029.2b4d2fe53fc9.I9f5fa5c84cd42f749be0b87cc61dac8631c4c6d0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-12 19:47:25 +01:00
Miri Korenblit
33821a2b20 wifi: mac80211: don't send an unused argument to ieee80211_check_combinations
When ieee80211_check_combinations is called with NULL as the chandef,
the chanmode argument is not relevant. Send a don't care (0) instead.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260111192411.9aa743647b43.I407b3d878d94464ce01e25f16c6e2b687bcd8b5a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-12 19:42:39 +01:00
Jakub Kicinski
59ba823e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.19-rc5).

No conflicts, or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 11:38:33 -08:00
Eric Dumazet
c92510f5e3 arp: do not assume dev_hard_header() does not change skb->head
arp_create() is the only dev_hard_header() caller
making assumption about skb->head being unchanged.

A recent commit broke this assumption.

Initialize @arp pointer after dev_hard_header() call.

Fixes: db5b4e39c4 ("ip6_gre: make ip6gre_header() robust")
Reported-by: syzbot+58b44a770a1585795351@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260107212250.384552-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 09:04:24 -08:00
Jakub Kicinski
804809ae40 Merge tag 'wireless-2026-01-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Couple of fixes:
 - mac80211:
   - long-standing injection bug due to chanctx rework
   - more recent interface iteration issue
   - collect statistics before removing stations
 - hwsim:
   - fix NAN frequency typo (potential NULL ptr deref)
   - fix locking of radio lock (needs softirqs disabled)
 - wext:
   - ancient issue with compat and events copying some
     uninitialized stack data to userspace

* tag 'wireless-2026-01-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac80211: collect station statistics earlier when disconnect
  wifi: mac80211: restore non-chanctx injection behaviour
  wifi: mac80211_hwsim: disable BHs for hwsim_radio_lock
  wifi: mac80211: don't iterate not running interfaces
  wifi: mac80211_hwsim: fix typo in frequency notification
  wifi: avoid kernel-infoleak from struct iw_point
====================

Link: https://patch.msgid.link/20260108140141.139687-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 08:49:24 -08:00
Willem de Bruijn
7d11e047ed net: do not write to msg_get_inq in callee
NULL pointer dereference fix.

msg_get_inq is an input field from caller to callee. Don't set it in
the callee, as the caller may not clear it on struct reuse.

This is a kernel-internal variant of msghdr only, and the only user
does reinitialize the field. So this is not critical for that reason.
But it is more robust to avoid the write, and slightly simpler code.
And it fixes a bug, see below.

Callers set msg_get_inq to request the input queue length to be
returned in msg_inq. This is equivalent to but independent from the
SO_INQ request to return that same info as a cmsg (tp->recvmsg_inq).
To reduce branching in the hot path the second also sets the msg_inq.
That is WAI.

This is a fix to commit 4d1442979e ("af_unix: don't post cmsg for
SO_INQ unless explicitly asked for"), which fixed the inverse.

Also avoid NULL pointer dereference in unix_stream_read_generic if
state->msg is NULL and msg->msg_get_inq is written. A NULL state->msg
can happen when splicing as of commit 2b514574f7 ("net: af_unix:
implement splice for stream af_unix sockets").

Also collapse two branches using a bitwise or.

Cc: stable@vger.kernel.org
Fixes: 4d1442979e ("af_unix: don't post cmsg for SO_INQ unless explicitly asked for")
Link: https://lore.kernel.org/netdev/willemdebruijn.kernel.24d8030f7a3de@gmail.com/
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260106150626.3944363-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 08:45:13 -08:00
Xiang Mei
c1d73b1480 net/sched: sch_qfq: Fix NULL deref when deactivating inactive aggregate in qfq_reset
`qfq_class->leaf_qdisc->q.qlen > 0` does not imply that the class
itself is active.

Two qfq_class objects may point to the same leaf_qdisc. This happens
when:

1. one QFQ qdisc is attached to the dev as the root qdisc, and

2. another QFQ qdisc is temporarily referenced (e.g., via qdisc_get()
/ qdisc_put()) and is pending to be destroyed, as in function
tc_new_tfilter.

When packets are enqueued through the root QFQ qdisc, the shared
leaf_qdisc->q.qlen increases. At the same time, the second QFQ
qdisc triggers qdisc_put and qdisc_destroy: the qdisc enters
qfq_reset() with its own q->q.qlen == 0, but its class's leaf
qdisc->q.qlen > 0. Therefore, the qfq_reset would wrongly deactivate
an inactive aggregate and trigger a null-deref in qfq_deactivate_agg:

[    0.903172] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    0.903571] #PF: supervisor write access in kernel mode
[    0.903860] #PF: error_code(0x0002) - not-present page
[    0.904177] PGD 10299b067 P4D 10299b067 PUD 10299c067 PMD 0
[    0.904502] Oops: Oops: 0002 [#1] SMP NOPTI
[    0.904737] CPU: 0 UID: 0 PID: 135 Comm: exploit Not tainted 6.19.0-rc3+ #2 NONE
[    0.905157] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
[    0.905754] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2))
[    0.906046] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0

Code starting with the faulting instruction
===========================================
   0:	0f 84 4d 01 00 00    	je     0x153
   6:	48 89 70 18          	mov    %rsi,0x18(%rax)
   a:	8b 4b 10             	mov    0x10(%rbx),%ecx
   d:	48 c7 c2 ff ff ff ff 	mov    $0xffffffffffffffff,%rdx
  14:	48 8b 78 08          	mov    0x8(%rax),%rdi
  18:	48 d3 e2             	shl    %cl,%rdx
  1b:	48 21 f2             	and    %rsi,%rdx
  1e:	48 2b 13             	sub    (%rbx),%rdx
  21:	48 8b 30             	mov    (%rax),%rsi
  24:	48 d3 ea             	shr    %cl,%rdx
  27:	8b 4b 18             	mov    0x18(%rbx),%ecx
	...
[    0.907095] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246
[    0.907368] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000
[    0.907723] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    0.908100] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000
[    0.908451] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000
[    0.908804] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880
[    0.909179] FS:  000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000
[    0.909572] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.909857] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0
[    0.910247] PKRU: 55555554
[    0.910391] Call Trace:
[    0.910527]  <TASK>
[    0.910638]  qfq_reset_qdisc (net/sched/sch_qfq.c:357 net/sched/sch_qfq.c:1485)
[    0.910826]  qdisc_reset (include/linux/skbuff.h:2195 include/linux/skbuff.h:2501 include/linux/skbuff.h:3424 include/linux/skbuff.h:3430 net/sched/sch_generic.c:1036)
[    0.911040]  __qdisc_destroy (net/sched/sch_generic.c:1076)
[    0.911236]  tc_new_tfilter (net/sched/cls_api.c:2447)
[    0.911447]  rtnetlink_rcv_msg (net/core/rtnetlink.c:6958)
[    0.911663]  ? __pfx_rtnetlink_rcv_msg (net/core/rtnetlink.c:6861)
[    0.911894]  netlink_rcv_skb (net/netlink/af_netlink.c:2550)
[    0.912100]  netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
[    0.912296]  ? __alloc_skb (net/core/skbuff.c:706)
[    0.912484]  netlink_sendmsg (net/netlink/af_netlink.c:1894)
[    0.912682]  sock_write_iter (net/socket.c:727 (discriminator 1) net/socket.c:742 (discriminator 1) net/socket.c:1195 (discriminator 1))
[    0.912880]  vfs_write (fs/read_write.c:593 fs/read_write.c:686)
[    0.913077]  ksys_write (fs/read_write.c:738)
[    0.913252]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[    0.913438]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)
[    0.913687] RIP: 0033:0x424c34
[    0.913844] Code: 89 02 48 c7 c0 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 2d 44 09 00 00 74 13 b8 01 00 00 00 0f 05 9

Code starting with the faulting instruction
===========================================
   0:	89 02                	mov    %eax,(%rdx)
   2:	48 c7 c0 ff ff ff ff 	mov    $0xffffffffffffffff,%rax
   9:	eb bd                	jmp    0xffffffffffffffc8
   b:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  12:	00 00 00
  15:	90                   	nop
  16:	f3 0f 1e fa          	endbr64
  1a:	80 3d 2d 44 09 00 00 	cmpb   $0x0,0x9442d(%rip)        # 0x9444e
  21:	74 13                	je     0x36
  23:	b8 01 00 00 00       	mov    $0x1,%eax
  28:	0f 05                	syscall
  2a:	09                   	.byte 0x9
[    0.914807] RSP: 002b:00007ffea1938b78 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[    0.915197] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000424c34
[    0.915556] RDX: 000000000000003c RSI: 000000002af378c0 RDI: 0000000000000003
[    0.915912] RBP: 00007ffea1938bc0 R08: 00000000004b8820 R09: 0000000000000000
[    0.916297] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffea1938d28
[    0.916652] R13: 00007ffea1938d38 R14: 00000000004b3828 R15: 0000000000000001
[    0.917039]  </TASK>
[    0.917158] Modules linked in:
[    0.917316] CR2: 0000000000000000
[    0.917484] ---[ end trace 0000000000000000 ]---
[    0.917717] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2))
[    0.917978] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0

Code starting with the faulting instruction
===========================================
   0:	0f 84 4d 01 00 00    	je     0x153
   6:	48 89 70 18          	mov    %rsi,0x18(%rax)
   a:	8b 4b 10             	mov    0x10(%rbx),%ecx
   d:	48 c7 c2 ff ff ff ff 	mov    $0xffffffffffffffff,%rdx
  14:	48 8b 78 08          	mov    0x8(%rax),%rdi
  18:	48 d3 e2             	shl    %cl,%rdx
  1b:	48 21 f2             	and    %rsi,%rdx
  1e:	48 2b 13             	sub    (%rbx),%rdx
  21:	48 8b 30             	mov    (%rax),%rsi
  24:	48 d3 ea             	shr    %cl,%rdx
  27:	8b 4b 18             	mov    0x18(%rbx),%ecx
	...
[    0.918902] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246
[    0.919198] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000
[    0.919559] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    0.919908] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000
[    0.920289] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000
[    0.920648] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880
[    0.921014] FS:  000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000
[    0.921424] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.921710] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0
[    0.922097] PKRU: 55555554
[    0.922240] Kernel panic - not syncing: Fatal exception
[    0.922590] Kernel Offset: disabled

Fixes: 0545a30377 ("pkt_sched: QFQ - quick fair queue scheduler")
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/20260106034100.1780779-1-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 08:22:28 -08:00
Baochen Qiang
a203dbeeca wifi: mac80211: collect station statistics earlier when disconnect
In __sta_info_destroy_part2(), station statistics are requested after the
IEEE80211_STA_NONE -> IEEE80211_STA_NOTEXIST transition. This is
problematic because the driver may be unable to handle the request due to
the STA being in the NOTEXIST state (i.e. if the driver destroys the
underlying data when transitioning to NOTEXIST).

Move the statistics collection to before the state transition to avoid
this issue.

Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20251222-mac80211-move-station-stats-collection-earlier-v1-1-12cd4e42c633@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:11 +01:00
Johannes Berg
d594cc6f2c wifi: mac80211: restore non-chanctx injection behaviour
During the transition to use channel contexts throughout, the
ability to do injection while in monitor mode concurrent with
another interface was lost, since the (virtual) monitor won't
have a chanctx assigned in this scenario.

It's harder to fix drivers that actually transitioned to using
channel contexts themselves, such as mt76, but it's easy to do
those that are (still) just using the emulation. Do that.

Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218763
Reported-and-tested-by: Oscar Alfonso Diaz <oscar.alfonso.diaz@gmail.com>
Fixes: 0a44dfc070 ("wifi: mac80211: simplify non-chanctx drivers")
Link: https://patch.msgid.link/20251216105242.18366-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:10 +01:00
Miri Korenblit
c0d82ba961 wifi: mac80211: don't iterate not running interfaces
for_each_chanctx_user_* was introdcued as a replacement for
for_each_sdata_link, which visits also other chanctx users that are not
link.
for_each_sdata_link skips not running interfaces, do the same for
for_each_chanctx_user_*

Fixes: 1ce954c98b ("wifi: mac80211: add and use chanctx usage iteration")
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107143736.55c084e2a976.I38b7b904a135dadca339321923b501b2c2c5c8c0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:10 +01:00
Eric Dumazet
21cbf883d0 wifi: avoid kernel-infoleak from struct iw_point
struct iw_point has a 32bit hole on 64bit arches.

struct iw_point {
  void __user   *pointer;       /* Pointer to the data  (in user space) */
  __u16         length;         /* number of fields or size in bytes */
  __u16         flags;          /* Optional params */
};

Make sure to zero the structure to avoid disclosing 32bits of kernel data
to user space.

Fixes: 87de87d5e4 ("wext: Dispatch and handle compat ioctls entirely in net/wireless/wext.c")
Reported-by: syzbot+bfc7323743ca6dbcc3d3@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695f83f3.050a0220.1c677c.0392.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260108101927.857582-1-edumazet@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:05 +01:00
Huang Chenming
4073ea5161 wifi: cfg80211: Fix use_for flag update on BSS refresh
Userspace may fail to connect to certain BSS that were initially
marked as unusable due to regulatory restrictions (use_for = 0,
e.g., 6 GHz power type mismatch). Even after these restrictions
are removed and the BSS becomes usable, connection attempts still
fail.

The issue occurs in cfg80211_update_known_bss() where the use_for
flag is updated using bitwise AND (&=) instead of direct assignment.
Once a BSS is marked with use_for = 0, the AND operation masks out
any subsequent non-zero values, permanently keeping the flag at 0.
This causes __cfg80211_get_bss(), invoked by nl80211_assoc_bss(), to
fail the check "(bss->pub.use_for & use_for) != use_for", thereby
blocking association.

Replace the bitwise AND operation with direct assignment so the use_for
flag accurately reflects the current BSS state.

Fixes: d02a12b8e4 ("wifi: cfg80211: add BSS usage reporting")
Signed-off-by: Huang Chenming <chenming.huang@oss.qualcomm.com>
Link: https://patch.msgid.link/20251209025733.2098456-1-chenming.huang@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:23:08 +01:00
Aditya Kumar Singh
bfcc957b6a wifi: mac80211: Update csa_finalize to use link_id
With cfg80211_stop_link() adding support to stop a link in AP/P2P_GO
mode, in failure cases only the corresponding link can be stopped,
instead of stopping the whole interface.

Hence, invoke cfg80211_stop_link() directly with the link_id set for
AP/P2P_GO mode when CSA finalization fails.

Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com>
Signed-off-by: Manish Dharanenthiran <manish.dharanenthiran@oss.qualcomm.com>
Link: https://patch.msgid.link/20251127-stop_link-v2-2-43745846c5fd@qti.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:11:01 +01:00
Manish Dharanenthiran
dc4b176cce wifi: cfg80211: add cfg80211_stop_link() for per-link teardown
Currently, whenever cfg80211_stop_iface() is called, the entire iface
is stopped. However, there could be a need in AP/P2P_GO mode, where
one would like to stop a single link in MLO operation instead of the
whole MLD interface.

Hence, introduce cfg80211_stop_link() to allow drivers to tear down
only a specified AP/P2P_GO link during MLO operation. Passing -1
preserves the existing behavior of stopping the whole interface. Make
cfg80211_stop_iface() call this function by passing -1 to keep the
default behavior the same, that is, to stop all links and use
cfg80211_stop_link() with the desired link_id for AP/P2P_GO mode, to
stop only that link.

This brings no behavioral change for single-link/non-MLO interfaces,
and enables drivers to stop an AP/P2P_GO link without disrupting other
links on the same interface.

Signed-off-by: Manish Dharanenthiran <manish.dharanenthiran@oss.qualcomm.com>
Link: https://patch.msgid.link/20251127-stop_link-v2-1-43745846c5fd@qti.qualcomm.com
[make cfg80211_stop_iface() inline]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:11:01 +01:00
Shivani Gupta
adb25a46dc net/sched: act_api: avoid dereferencing ERR_PTR in tcf_idrinfo_destroy
syzbot reported a crash in tc_act_in_hw() during netns teardown where
tcf_idrinfo_destroy() passed an ERR_PTR(-EBUSY) value as a tc_action
pointer, leading to an invalid dereference.

Guard against ERR_PTR entries when iterating the action IDR so teardown
does not call tc_act_in_hw() on an error pointer.

Fixes: 84a7d6797e ("net/sched: acp_api: no longer acquire RTNL in tc_action_net_exit()")
Link: https://syzkaller.appspot.com/bug?extid=8f1c492ffa4644ff3826
Reported-by: syzbot+8f1c492ffa4644ff3826@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8f1c492ffa4644ff3826
Signed-off-by: Shivani Gupta <shivani07g@gmail.com>
Link: https://patch.msgid.link/20260105005905.243423-1-shivani07g@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:27:18 -08:00
Eric Dumazet
27a01c1969 net: fully inline backlog_unlock_irq_restore()
Some arches (like x86) do not inline spin_unlock_irqrestore().

backlog_unlock_irq_restore() is in RPS/RFS critical path,
we prefer using spin_unlock() + local_irq_restore() for
optimal performance.

Also change backlog_unlock_irq_restore() second argument
to avoid a pointless dereference.

No difference in net/core/dev.o code size.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260105163054.13698-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:14:35 -08:00
Eric Dumazet
e9cd04b281 udp: udplite is unlikely
Add some unlikely() annotations to speed up the fast path,
at least with clang compiler.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260105101719.2378881-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:06:03 -08:00
Eric Dumazet
e5c8eda39a udp: call skb_orphan() before skb_attempt_defer_free()
Standard UDP receive path does not use skb->destructor.

But skmsg layer does use it, since it calls skb_set_owner_sk_safe()
from udp_read_skb().

This then triggers this warning in skb_attempt_defer_free():

    DEBUG_NET_WARN_ON_ONCE(skb->destructor);

We must call skb_orphan() to fix this issue.

Fixes: 6471658dc6 ("udp: use skb_attempt_defer_free()")
Reported-by: syzbot+3e68572cf2286ce5ebe9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695b83bd.050a0220.1c9965.002b.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260105093630.1976085-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:05:17 -08:00
Gustavo A. R. Silva
c86af46b9c ipv4/inet_sock.h: Avoid thousands of -Wflex-array-member-not-at-end warnings
Use DEFINE_RAW_FLEX() to avoid thousands of -Wflex-array-member-not-at-end
warnings.

Remove struct ip_options_data, and adjust the rest of the code so that
flexible-array member struct ip_options_rcu::opt.__data[] ends last
in struct icmp_bxm.

Compensate for this by using the DEFINE_RAW_FLEX() helper to define each
on-stack struct instance that contained struct ip_options_data as a member,
and to define struct ip_options_rcu with a fixed on-stack size for its
nested flexible-array member opt.__data[].

Also, add a couple of code comments to prevent people from adding members
to a struct after another member that contains a flexible array.

With these changes, fix 2600 warnings of the following type:

include/net/inet_sock.h:65:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/aVteBadWA6AbTp7X@kspp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:02:52 -08:00
Yumei Huang
cb3de96eea ipv6: preserve insertion order for same-scope addresses
IPv6 addresses with the same scope are returned in reverse insertion
order, unlike IPv4. For example, when adding a -> b -> c, the list is
reported as c -> b -> a, while IPv4 preserves the original order.

This behavior causes:

a. When using `ip -6 a save` and `ip -6 a restore`, addresses are restored
   in the opposite order from which they were saved. See example below
   showing addresses added as 1::1, 1::2, 1::3 but displayed and saved
   in reverse order.

   # ip -6 a a 1::1 dev x
   # ip -6 a a 1::2 dev x
   # ip -6 a a 1::3 dev x
   # ip -6 a s dev x
   2: x: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
       inet6 1::3/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::2/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::1/128 scope global tentative
       valid_lft forever preferred_lft forever
   # ip -6 a save > dump
   # ip -6 a d 1::1 dev x
   # ip -6 a d 1::2 dev x
   # ip -6 a d 1::3 dev x
   # ip a d ::1 dev lo
   # ip a restore < dump
   # ip -6 a s dev x
   2: x: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
       inet6 1::1/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::2/128 scope global tentative
       valid_lft forever preferred_lft forever
       inet6 1::3/128 scope global tentative
       valid_lft forever preferred_lft forever
   # ip a showdump < dump
    if1:
        inet6 ::1/128 scope host proto kernel_lo
        valid_lft forever preferred_lft forever
    if2:
        inet6 1::3/128 scope global tentative
        valid_lft forever preferred_lft forever
    if2:
        inet6 1::2/128 scope global tentative
        valid_lft forever preferred_lft forever
    if2:
        inet6 1::1/128 scope global tentative
        valid_lft forever preferred_lft forever

b. Addresses in pasta to appear in reversed order compared to host
   addresses.

The ipv6 addresses were added in reverse order by commit e55ffac601
("[IPV6]: order addresses by scope"), then it was changed by commit
502a2ffd73 ("ipv6: convert idev_list to list macros"), and restored by
commit b54c9b98bb ("ipv6: Preserve pervious behavior in
ipv6_link_dev_addr()."). However, this reverse ordering within the same
scope causes inconsistency with IPv4 and the issues described above.

This patch aligns IPv6 address ordering with IPv4 for consistency
by changing the comparison from >= to > when inserting addresses
into the address list. Also updates the ioam6 selftest to reflect
the new address ordering behavior. Combine these two changes into
one patch for bisectability.

Link: https://bugs.passt.top/show_bug.cgi?id=175
Suggested-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Yumei Huang <yuhuang@redhat.com>
Acked-by: Justin Iurman <justin.iurman@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260104032357.38555-1-yuhuang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 16:41:35 -08:00
Mohammad Heib
238e03d046 net: fix memory leak in skb_segment_list for GRO packets
When skb_segment_list() is called during packet forwarding, it handles
packets that were aggregated by the GRO engine.

Historically, the segmentation logic in skb_segment_list assumes that
individual segments are split from a parent SKB and may need to carry
their own socket memory accounting. Accordingly, the code transfers
truesize from the parent to the newly created segments.

Prior to commit ed4cccef64 ("gro: fix ownership transfer"), this
truesize subtraction in skb_segment_list() was valid because fragments
still carry a reference to the original socket.

However, commit ed4cccef64 ("gro: fix ownership transfer") changed
this behavior by ensuring that fraglist entries are explicitly
orphaned (skb->sk = NULL) to prevent illegal orphaning later in the
stack. This change meant that the entire socket memory charge remained
with the head SKB, but the corresponding accounting logic in
skb_segment_list() was never updated.

As a result, the current code unconditionally adds each fragment's
truesize to delta_truesize and subtracts it from the parent SKB. Since
the fragments are no longer charged to the socket, this subtraction
results in an effective under-count of memory when the head is freed.
This causes sk_wmem_alloc to remain non-zero, preventing socket
destruction and leading to a persistent memory leak.

The leak can be observed via KMEMLEAK when tearing down the networking
environment:

unreferenced object 0xffff8881e6eb9100 (size 2048):
  comm "ping", pid 6720, jiffies 4295492526
  backtrace:
    kmem_cache_alloc_noprof+0x5c6/0x800
    sk_prot_alloc+0x5b/0x220
    sk_alloc+0x35/0xa00
    inet6_create.part.0+0x303/0x10d0
    __sock_create+0x248/0x640
    __sys_socket+0x11b/0x1d0

Since skb_segment_list() is exclusively used for SKB_GSO_FRAGLIST
packets constructed by GRO, the truesize adjustment is removed.

The call to skb_release_head_state() must be preserved. As documented in
commit cf673ed0e0 ("net: fix fraglist segmentation reference count
leak"), it is still required to correctly drop references to SKB
extensions that may be overwritten during __copy_skb_header().

Fixes: ed4cccef64 ("gro: fix ownership transfer")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260104213101.352887-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 17:01:28 -08:00
Jamal Hadi Salim
9892353726 net/sched: act_mirred: Fix leak when redirecting to self on egress
Whenever a mirred redirect to self on egress happens, mirred allocates a
new skb (skb_to_send). The loop to self check was done after that
allocation, but was not freeing the newly allocated skb, causing a leak.

Fix this by moving the if-statement to before the allocation of the new
skb.

The issue was found by running the accompanying tdc test in 2/2
with config kmemleak enabled.
After a few minutes the kmemleak thread ran and reported the leak coming from
mirred.

Fixes: 1d856251a0 ("net/sched: act_mirred: fix loop detection")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260101135608.253079-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:23:42 -08:00
Clara Engler
48a4aa9d9c ipv4: Improve martian logs
At the current moment, the logs for martian packets are as follows:
```
martian source {DST} from {SRC}, on dev {DEV}
martian destination {DST} from {SRC}, dev {DEV}
```

These messages feel rather hard to understand in production, especially
the "martian source" one, mostly because it is grammatically ambitious
to parse which part is now the source address and which part is the
destination address.  For example, "{DST}" may there be interpreted as
the actual source address due to following the word "source", thereby
implying the actual source address to be the destination one.

Personally, I discovered this bug while toying around with TUN
interfaces and using them as a tunnel (receiving packets via a TUN
interface and sending them over a TCP stream; receiving packets from a
TCP stream and writing them to a TUN).[^1]

When these IP addresses contained local IPs (i.e. 10.0.0.0/8 in source
and destination), everything worked fine.  However, sending them to a
real routable IP address on the internet led to them being treated as a
martian packet, obviously.  Using a few sysctl(8) and iptables(8)
settings[^2] fixed it, but while debugging I found the log message
starting with "martian source" rather confusing, as I was unsure on
whether the packet that gets dropped was the packet originating from me
or the response from the endpoint, as "martian source <ROUTABLE IP>"
could also be falsely interpreted as the response packet being martian,
due to the word "source" followed by the routable IP address, implying
the source address of that packet is set to this IP, as explained above.
In the end, I had to look into the source code of the kernel on where
this error message gets generated, which is usually an indicator of
there being room for improvement with regard to this error message.

In terms of improvement, this commit changes the error messages for
martian source and martian destination packets as follows:
```
martian source (src={SRC}, dst={DST}, dev={DEV})
martian destination (src={SRC}, dst={DST}, dev={DEV})
```

These new wordings leave pretty much no room for ambiguity as all
parameters are prefixed with a respective key explaining their semantic
meaning.

See also the following thread on LKML.[^3]

[^1]: <https://backreference.org/2010/03/26/tuntap-interface-tutorial>
[^2]: sysctl net.ipv4.ip_forward=1 && \
      iptables -A INPUT -i tun0 -j ACCEPT && \
      iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
[^3]: <https://lore.kernel.org/all/aSd4Xj8rHrh-krjy@4944566b5c925f79/>

Signed-off-by: Clara Engler <cve@cve.cx>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260101125114.2608-1-cve@cve.cx
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:15:50 -08:00
Michal Luczaj
ce5e612dd4 vsock: Make accept()ed sockets use custom setsockopt()
SO_ZEROCOPY handling in vsock_connectible_setsockopt() does not get called
on accept()ed sockets due to a missing flag. Flip it.

Fixes: e0718bd82e ("vsock: enable setting SO_ZEROCOPY")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20251229-vsock-child-sock-custom-sockopt-v2-1-64778d6c4f88@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:14:50 -08:00
Jakub Kicinski
d6f6c6d909 Merge tag 'nf-26-01-02' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:

====================
netfilter: updates for net

The following patchset contains Netfilter fixes for *net*:

1) Fix overlap detection for nf_tables with concatenated ranges.
   There are cases where element could not be added due to a conflict
   with existing range, while kernel reports success to userspace.
2) update selftest to cover this bug.
3) synproxy update path should use READ/WRITE once as we replace
   config struct while packet path might read it in parallel.
   This relies on said config struct to fit sizeof(long).
   From Fernando Fernandez Mancera.
4) Don't return -EEXIST from xtables in module load path, a pending
   patch to module infra will spot a warning if this happens.
   From Daniel Gomez.
5) Fix a memory leak in nf_tables when chain hits 2**32 users
   and rule is to be hw-offloaded, from Zilin Guan.
6) Avoid infinite list growth when insert rate is high in nf_conncount,
   also from Fernando.

* tag 'nf-26-01-02' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_conncount: update last_gc only when GC has been performed
  netfilter: nf_tables: fix memory leak in nf_tables_newrule()
  netfilter: replace -EEXIST with -EBUSY
  netfilter: nft_synproxy: avoid possible data-race on update operation
  selftests: netfilter: nft_concat_range.sh: add check for overlap detection bug
  netfilter: nft_set_pipapo: fix range overlap detection
====================

Link: https://patch.msgid.link/20260102114128.7007-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-04 10:59:59 -08:00
Florian Westphal
2ef02ac38d inet: frags: drop fraglist conntrack references
Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging
leaked skbs/conntrack references more obvious.

syzbot reports this as triggering, and I can also reproduce this via
ip_defrag.sh selftest:

 conntrack cleanup blocked for 60s
 WARNING: net/netfilter/nf_conntrack_core.c:2512
 [..]

conntrack clenups gets stuck because there are skbs with still hold nf_conn
references via their frag_list.

   net.core.skb_defer_max=0 makes the hang disappear.

Eric Dumazet points out that skb_release_head_state() doesn't follow the
fraglist.

ip_defrag.sh can only reproduce this problem since
commit 6471658dc6 ("udp: use skb_attempt_defer_free()"), but AFAICS this
problem could happen with TCP as well if pmtu discovery is off.

The relevant problem path for udp is:
1. netns emits fragmented packets
2. nf_defrag_v6_hook reassembles them (in output hook)
3. reassembled skb is tracked (skb owns nf_conn reference)
4. ip6_output refragments
5. refragmented packets also own nf_conn reference (ip6_fragment
   calls ip6_copy_metadata())
6. on input path, nf_defrag_v6_hook skips defragmentation: the
   fragments already have skb->nf_conn attached
7. skbs are reassembled via ipv6_frag_rcv()
8. skb_consume_udp -> skb_attempt_defer_free() -> skb ends up
   in pcpu freelist, but still has nf_conn reference.

Possible solutions:
 1 let defrag engine drop nf_conn entry, OR
 2 export kick_defer_list_purge() and call it from the conntrack
   netns exit callback, OR
 3 add skb_has_frag_list() check to skb_attempt_defer_free()

2 & 3 also solve ip_defrag.sh hang but share same drawback:

Such reassembled skbs, queued to socket, can prevent conntrack module
removal until userspace has consumed the packet. While both tcp and udp
stack do call nf_reset_ct() before placing skb on socket queue, that
function doesn't iterate frag_list skbs.

Therefore drop nf_conn entries when they are placed in defrag queue.
Keep the nf_conn entry of the first (offset 0) skb so that reassembled
skb retains nf_conn entry for sake of TX path.

Note that fixes tag is incorrect; it points to the commit introducing the
'ip_defrag.sh reproducible problem': no need to backport this patch to
every stable kernel.

Reported-by: syzbot+4393c47753b7808dac7d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/693b0fa7.050a0220.4004e.040d.GAE@google.com/
Fixes: 6471658dc6 ("udp: use skb_attempt_defer_free()")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260102140030.32367-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-04 10:58:04 -08:00