Commit Graph

135728 Commits

Author SHA1 Message Date
Jonas Gorski
e57723fe53 net: dsa: b53: properly bound ARL searches for < 4 ARL bin chips
When iterating over the ARL table we stop at max ARL entries / 2, but
this is only valid if the chip actually returns 2 results at once. For
chips with only one result register we will stop before reaching the end
of the table if it is more than half full.

Fix this by only dividing the maximum results by two if we have a chip
with more than one result register (i.e. those with 4 ARL bins).

Fixes: cd169d799b ("net: dsa: b53: Bound check ARL searches")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251102100758.28352-4-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 16:42:43 -08:00
Jonas Gorski
0be04b5fa6 net: dsa: b53: stop reading ARL entries if search is done
The switch clears the ARL_SRCH_STDN bit when the search is done, i.e. it
finished traversing the ARL table.

This means that there will be no valid result, so we should not attempt
to read and process any further entries.

We only ever check the validity of the entries for 4 ARL bin chips, and
only after having passed the first entry to the b53_fdb_copy().

This means that we always pass an invalid entry at the end to the
b53_fdb_copy(). b53_fdb_copy() does check the validity though before
passing on the entry, so it never gets passed on.

On < 4 ARL bin chips, we will even continue reading invalid entries
until we reach the result limit.

Fixes: 1da6df85c6 ("net: dsa: b53: Implement ARL add/del/dump operations")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251102100758.28352-3-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 16:42:42 -08:00
Jonas Gorski
c264294624 net: dsa: b53: fix enabling ip multicast
In the New Control register bit 1 is either reserved, or has a different
function:

    Out of Range Error Discard

    When enabled, the ingress port discards any frames
    if the Length field is between 1500 and 1536
    (excluding 1500 and 1536) and with good CRC.

The actual bit for enabling IP multicast is bit 0, which was only
explicitly enabled for BCM5325 so far.

For older switch chips, this bit defaults to 0, so we want to enable it
as well, while newer switch chips default to 1, and their documentation
says "It is illegal to set this bit to zero."

So drop the wrong B53_IPMC_FWD_EN define, enable the IP multicast bit
also for other switch chips. While at it, rename it to (B53_)IP_MC as
that is how it is called in Broadcom code.

Fixes: 63cc54a6f0 ("net: dsa: b53: Fix egress flooding settings")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251102100758.28352-2-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 16:42:42 -08:00
Jonas Gorski
3e4ebdc160 net: dsa: b53: fix bcm63xx RGMII port link adjustment
BCM63XX's switch does not support MDIO scanning of external phys, so its
MACs needs to be manually configured for autonegotiated link speeds.

So b53_force_port_config() and b53_force_link() accordingly also when
mode is MLO_AN_PHY for those ports.

Fixes lower speeds than 1000/full on rgmii ports 4 - 7.

This aligns the behaviour with the old bcm63xx_enetsw driver for those
ports.

Fixes: 967dd82ffc ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251101132807.50419-3-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 16:40:35 -08:00
Jonas Gorski
b6a8a5477f net: dsa: b53: fix resetting speed and pause on forced link
There is no guarantee that the port state override registers have their
default values, as not all switches support being reset via register or
have a reset GPIO.

So when forcing port config, we need to make sure to clear all fields,
which we currently do not do for the speed and flow control
configuration. This can cause flow control stay enabled, or in the case
of speed becoming an illegal value, e.g. configured for 1G (0x2), then
setting 100M (0x1), results in 0x3 which is invalid.

For PORT_OVERRIDE_SPEED_2000M we need to make sure to only clear it on
supported chips, as the bit can have different meanings on other chips,
e.g. for BCM5389 this controls scanning PHYs for link/speed
configuration.

Fixes: 5e004460f8 ("net: dsa: b53: Add helper to set link parameters")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251101132807.50419-2-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 16:40:35 -08:00
Alok Tiwari
b2b526c2cf net: mdio: Check regmap pointer returned by device_node_to_regmap()
The call to device_node_to_regmap() in airoha_mdio_probe() can return
an ERR_PTR() if regmap initialization fails. Currently, the driver
stores the pointer without validation, which could lead to a crash
if it is later dereferenced.

Add an IS_ERR() check and return the corresponding error code to make
the probe path more robust.

Fixes: 67e3ba9783 ("net: mdio: Add MDIO bus controller for Airoha AN7583")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251031161607.58581-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 16:20:03 -08:00
Gustavo Luiz Duarte
d7d2fcf7ae netconsole: Acquire su_mutex before navigating configs hierarchy
There is a race between operations that iterate over the userdata
cg_children list and concurrent add/remove of userdata items through
configfs. The update_userdata() function iterates over the
nt->userdata_group.cg_children list, and count_extradata_entries() also
iterates over this same list to count nodes.

Quoting from Documentation/filesystems/configfs.rst:
> A subsystem can navigate the cg_children list and the ci_parent pointer
> to see the tree created by the subsystem.  This can race with configfs'
> management of the hierarchy, so configfs uses the subsystem mutex to
> protect modifications.  Whenever a subsystem wants to navigate the
> hierarchy, it must do so under the protection of the subsystem
> mutex.

Without proper locking, if a userdata item is added or removed
concurrently while these functions are iterating, the list can be
accessed in an inconsistent state. For example, the list_for_each() loop
can reach a node that is being removed from the list by list_del_init()
which sets the nodes' .next pointer to point to itself, so the loop will
never end (or reach the WARN_ON_ONCE in update_userdata() ).

Fix this by holding the configfs subsystem mutex (su_mutex) during all
operations that iterate over cg_children.
This includes:
- userdatum_value_store() which calls update_userdata() to iterate over
  cg_children
- All sysdata_*_enabled_store() functions which call
  count_extradata_entries() to iterate over cg_children

The su_mutex must be acquired before dynamic_netconsole_mutex to avoid
potential lock ordering issues, as configfs operations may already hold
su_mutex when calling into our code.

Fixes: df03f830d0 ("net: netconsole: cache userdata formatted string in netconsole_target")
Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com>
Link: https://patch.msgid.link/20251029-netconsole-fix-warn-v1-1-0d0dd4622f48@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 17:45:06 -07:00
Tim Hostetler
329d050bbe gve: Implement settime64 with -EOPNOTSUPP
ptp_clock_settime() assumes every ptp_clock has implemented settime64().
Stub it with -EOPNOTSUPP to prevent a NULL dereference.

Fixes: acd1638052 ("gve: Add initial PTP device support")
Reported-by: syzbot+a546141ca6d53b90aba3@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a546141ca6d53b90aba3
Signed-off-by: Tim Hostetler <thostet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20251029184555.3852952-3-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 15:55:32 -07:00
Tim Hostetler
6ab753b5d8 gve: Implement gettimex64 with -EOPNOTSUPP
gve implemented a ptp_clock for sole use of do_aux_work at this time.
ptp_clock_gettime() and ptp_sys_offset() assume every ptp_clock has
implemented either gettimex64 or gettime64. Stub gettimex64 and return
-EOPNOTSUPP to prevent NULL dereferencing.

Fixes: acd1638052 ("gve: Add initial PTP device support")
Reported-by: syzbot+c8c0e7ccabd456541612@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c8c0e7ccabd456541612
Signed-off-by: Tim Hostetler <thostet@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20251029184555.3852952-2-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 15:55:32 -07:00
Jakub Kicinski
b7904323e7 Merge tag 'wireless-2025-10-30' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Couple of new fixes:

 - ath10k: revert a patch that had caused issues on some devices
 - cfg80211/mac80211: use hrtimers for some things where the
                      precise timing matters
 - zd1211rw: fix a long-standing potential leak

* tag 'wireless-2025-10-30' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: zd1211rw: fix potential memory leak in __zd_usb_enable_rx()
  wifi: mac80211: use wiphy_hrtimer_work for csa.switch_work
  wifi: mac80211: use wiphy_hrtimer_work for ml_reconf_work
  wifi: mac80211: use wiphy_hrtimer_work for ttlm_work
  wifi: cfg80211: add an hrtimer based delayed work item
  Revert "wifi: ath10k: avoid unnecessary wait for service ready message"
====================

Link: https://patch.msgid.link/20251030104919.12871-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 12:30:33 -07:00
Abdun Nihaal
70e8335485 wifi: zd1211rw: fix potential memory leak in __zd_usb_enable_rx()
The memory allocated for urbs with kcalloc() is not freed on any error
path. Fix that by freeing it in the error path.

Fixes: e85d0918b5 ("[PATCH] ZyDAS ZD1211 USB-WLAN driver")
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Link: https://patch.msgid.link/20251028174341.139134-1-nihaal@cse.iitm.ac.in
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-30 08:41:10 +01:00
Rohan G Thomas
48b2e323c0 net: stmmac: est: Fix GCL bounds checks
Fix the bounds checks for the hw supported maximum GCL entry
count and gate interval time.

Fixes: b60189e039 ("net: stmmac: Integrate EST with TAPRIO scheduler API")
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com>
Link: https://patch.msgid.link/20251028-qbv-fixes-v4-3-26481c7634e3@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 18:49:24 -07:00
Rohan G Thomas
ded9813d17 net: stmmac: Consider Tx VLAN offload tag length for maxSDU
Queue maxSDU requirement of 802.1 Qbv standard requires mac to drop
packets that exceeds maxSDU length and maxSDU doesn't include
preamble, destination and source address, or FCS but includes
ethernet type and VLAN header.

On hardware with Tx VLAN offload enabled, VLAN header length is not
included in the skb->len, when Tx VLAN offload is requested. This
leads to incorrect length checks and allows transmission of
oversized packets. Add the VLAN_HLEN to the skb->len before checking
the Qbv maxSDU if Tx VLAN offload is requested for the packet.

Fixes: c5c3e1bfc9 ("net: stmmac: Offload queueMaxSDU from tc-taprio")
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com>
Link: https://patch.msgid.link/20251028-qbv-fixes-v4-2-26481c7634e3@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 18:49:24 -07:00
Rohan G Thomas
c657f86106 net: stmmac: vlan: Disable 802.1AD tag insertion offload
The DWMAC IP's VLAN tag insertion offload does not support inserting
STAG (802.1AD) and CTAG (802.1Q) types in bytes 13 and 14 using the
same MAC_VLAN_Incl and MAC_VLAN_Inner_Incl register configurations.

Currently, MAC_VLAN_Incl is configured to offload only STAG type
insertion. However, the DWMAC IP inserts a CTAG type when the inner
VLAN ID field of the descriptor is not configured, and a STAG type
when it is configured. This behavior is not documented and leads to
inconsistent double VLAN tagging.

Additionally, an unexpected CTAG with VLAN ID 0 is inserted, resulting
in frames like:

Frame 1: 110 bytes on wire (880 bits), 110 bytes captured (880 bits)
Ethernet II, Src: <src> (<src>), Dst: <dst> (<dst>)
IEEE 802.1ad, ID: 100
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 0 (unexpected)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 200
Internet Protocol Version 4, Src: 192.168.4.10, Dst: 192.168.4.11
Internet Control Message Protocol

To avoid this undocumented and incorrect behavior, disable 802.1AD tag
insertion offload. Also, don't set CSVL bit. As per the data book,
when this bit is set, S-VLAN type (0x88A8) is inserted in the 13th and
14th bytes of transmitted packets and when this bit is reset, C-VLAN
type (0x8100) is inserted in the 13th and 14th bytes of transmitted
packets.

Fixes: 30d932279d ("net: stmmac: Add support for VLAN Insertion Offload")
Fixes: e94e3f3b51 ("net: stmmac: Add support for VLAN Insertion Offload in GMAC4+")
Fixes: 1d2c7a5fee ("net: stmmac: Refactor VLAN implementation")
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Boon Khai Ng <boon.khai.ng@altera.com>
Link: https://patch.msgid.link/20251028-qbv-fixes-v4-1-26481c7634e3@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 18:49:24 -07:00
Shahar Shitrit
426e9da3b2 net/mlx5e: kTLS, Cancel RX async resync request in error flows
When device loses track of TLS records, it attempts to resync by
monitoring records and requests an asynchronous resynchronization
from software for this TLS connection.

The TLS module handles such device RX resync requests by logging record
headers and comparing them with the record tcp_sn when provided by the
device. It also increments rcd_delta to track how far the current
record tcp_sn is from the tcp_sn of the original resync request.
If the device later responds with a matching tcp_sn, the TLS module
approves the tcp_sn for resync.

However, the device response may be delayed or never arrive,
particularly due to traffic-related issues such as packet drops or
reordering. In such cases, the TLS module remains unaware that resync
will not complete, and continues performing unnecessary work by logging
headers and incrementing rcd_delta, which can eventually exceed the
threshold and trigger a WARN(). For example, this was observed when the
device got out of tracking, causing
mlx5e_ktls_handle_get_psv_completion() to fail and ultimately leading
to the rcd_delta warning.

To address this, call tls_offload_rx_resync_async_request_cancel()
to cancel the resync request and stop resync tracking in such error
cases. Also, increment the tls_resync_req_skip counter to track these
cancellations.

Fixes: 0419d8c9d8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761508983-937977-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 18:32:18 -07:00
Shahar Shitrit
34892cfec0 net: tls: Change async resync helpers argument
Update tls_offload_rx_resync_async_request_start() and
tls_offload_rx_resync_async_request_end() to get a struct
tls_offload_resync_async parameter directly, rather than
extracting it from struct sock.

This change aligns the function signatures with the upcoming
tls_offload_rx_resync_async_request_cancel() helper, which
will be introduced in a subsequent patch.

Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761508983-937977-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 18:32:17 -07:00
Thanh Quan
298574936a net: phy: dp83869: fix STRAP_OPMODE bitmask
According to the TI DP83869HM datasheet Revision D (June 2025), section
7.6.1.41 STRAP_STS Register, the STRAP_OPMODE bitmask is bit [11:9].
Fix this.

In case the PHY is auto-detected via PHY ID registers, or not described
in DT, or, in case the PHY is described in DT but the optional DT property
"ti,op-mode" is not present, then the driver reads out the PHY functional
mode (RGMII, SGMII, ...) from hardware straps.

Currently, all upstream users of this PHY specify both DT compatible string
"ethernet-phy-id2000.a0f1" and ti,op-mode = <DP83869_RGMII_COPPER_ETHERNET>
property, therefore it seems no upstream users are affected by this bug.

The driver currently interprets bits [2:0] of STRAP_STS register as PHY
functional mode. Those bits are controlled by ANEG_DIS, ANEGSEL_0 straps
and an always-zero reserved bit. Systems that use RGMII-to-Copper functional
mode are unlikely to disable auto-negotiation via ANEG_DIS strap, or change
auto-negotiation behavior via ANEGSEL_0 strap. Therefore, even with this bug
in place, the STRAP_STS register content is likely going to be interpreted
by the driver as RGMII-to-Copper mode.

However, for a system with PHY functional mode strapping set to other mode
than RGMII-to-Copper, the driver is likely to misinterpret the strapping
as RGMII-to-Copper and misconfigure the PHY.

For example, on a system with SGMII-to-Copper strapping, the STRAP_STS
register reads as 0x0c20, but the PHY ends up being configured for
incompatible RGMII-to-Copper mode.

Fixes: 0eaf8ccf20 ("net: phy: dp83869: Set opmode from straps")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thanh Quan <thanh.quan.xn@renesas.com>
Signed-off-by: Hai Pham <hai.pham.ud@renesas.com>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> # Port from U-Boot to Linux
Link: https://patch.msgid.link/20251027140320.8996-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:59:09 -07:00
Jinliang Wang
da2522df3f net: mctp: Fix tx queue stall
The tx queue can become permanently stuck in a stopped state due to a
race condition between the URB submission path and its completion
callback.

The URB completion callback can run immediately after usb_submit_urb()
returns, before the submitting function calls netif_stop_queue(). If
this occurs, the queue state management becomes desynchronized, leading
to a stall where the queue is never woken.

Fix this by moving the netif_stop_queue() call to before submitting the
URB. This closes the race window by ensuring the network stack is aware
the queue is stopped before the URB completion can possibly run.

Fixes: 0791c0327a ("net: mctp: Add MCTP USB transport driver")
Signed-off-by: Jinliang Wang <jinliangw@google.com>
Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://patch.msgid.link/20251027065530.2045724-1-jinliangw@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:55:14 -07:00
Cosmin Ratiu
53110232c9 net/mlx5: Don't zero user_count when destroying FDB tables
esw->user_count tracks how many TC rules are added on an esw via
mlx5e_configure_flower -> mlx5_esw_get -> atomic64_inc(&esw->user_count)

esw.user_count was unconditionally set to 0 in
esw_destroy_legacy_fdb_table and esw_destroy_offloads_fdb_tables.

These two together can lead to the following sequence of events:
1. echo 1 > /sys/class/net/eth2/device/sriov_numvfs
  - mlx5_core_sriov_configure -...-> esw_create_legacy_table ->
    atomic64_set(&esw->user_count, 0)
2. tc qdisc add dev eth2 ingress && \
   tc filter replace dev eth2 pref 1 protocol ip chain 0 ingress \
       handle 1 flower action ct nat zone 64000 pipe
  - mlx5e_configure_flower -> mlx5_esw_get ->
    atomic64_inc(&esw->user_count)
3. echo 0 > /sys/class/net/eth2/device/sriov_numvfs
  - mlx5_core_sriov_configure -..-> esw_destroy_legacy_fdb_table
    -> atomic64_set(&esw->user_count, 0)
4. devlink dev eswitch set pci/0000:08:00.0 mode switchdev
  - mlx5_devlink_eswitch_mode_set -> mlx5_esw_try_lock ->
    atomic64_read(&esw->user_count) == 0
  - then proceed to a WARN_ON in:
  esw_offloads_start -> mlx5_eswitch_enable_locke -> esw_offloads_enable
  -> mlx5_esw_offloads_rep_load -> mlx5e_vport_rep_load ->
  mlx5e_netdev_change_profile -> mlx5e_detach_netdev ->
  mlx5e_cleanup_nic_rx -> mlx5e_tc_nic_cleanup ->
  mlx5e_mod_hdr_tbl_destroy

Fix this by not clearing out the user_count when destroying FDB tables,
so that the check in mlx5_esw_try_lock can prevent the mode change when
there are TC rules configured, as originally intended.

Fixes: 2318b8bb94 ("net/mlx5: E-switch, Destroy legacy fdb table when needed")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1761510019-938772-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:53:37 -07:00
Miaoqian Lin
dc89548c69 net: usb: asix_devices: Check return value of usbnet_get_endpoints
The code did not check the return value of usbnet_get_endpoints.
Add checks and return the error if it fails to transfer the error.

Found via static anlaysis and this is similar to
commit 07161b2416 ("sr9800: Add check for usbnet_get_endpoints").

Fixes: 933a27d39e ("USB: asix - Add AX88178 support and many other changes")
Fixes: 2e55cc7210 ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://patch.msgid.link/20251026164318.57624-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:51:48 -07:00
Gustavo Luiz Duarte
00764aa5c9 netconsole: Fix race condition in between reader and writer of userdata
The update_userdata() function constructs the complete userdata string
in nt->extradata_complete and updates nt->userdata_length. This data
is then read by write_msg() and write_ext_msg() when sending netconsole
messages. However, update_userdata() was not holding target_list_lock
during this process, allowing concurrent message transmission to read
partially updated userdata.

This race condition could result in netconsole messages containing
incomplete or inconsistent userdata - for example, reading the old
userdata_length with new extradata_complete content, or vice versa,
leading to truncated or corrupted output.

Fix this by acquiring target_list_lock with spin_lock_irqsave() before
updating extradata_complete and userdata_length, and releasing it after
both fields are fully updated. This ensures that readers see a
consistent view of the userdata, preventing corruption during concurrent
access.

The fix aligns with the existing locking pattern used throughout the
netconsole code, where target_list_lock protects access to target
fields including buf[] and msgcounter that are accessed during message
transmission.

Also get rid of the unnecessary variable complete_idx, which makes it
easier to bail out of update_userdata().

Fixes: df03f830d0 ("net: netconsole: cache userdata formatted string in netconsole_target")
Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com>
Link: https://patch.msgid.link/20251028-netconsole-fix-race-v4-1-63560b0ae1a0@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:41:00 -07:00
Abdun Nihaal
a4384d786e nfp: xsk: fix memory leak in nfp_net_alloc()
In nfp_net_alloc(), the memory allocated for xsk_pools is not freed in
the subsequent error paths, leading to a memory leak. Fix that by
freeing it in the error path.

Fixes: 6402528b7a ("nfp: xsk: add AF_XDP zero-copy Rx and Tx support")
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Link: https://patch.msgid.link/20251028160845.126919-1-nihaal@cse.iitm.ac.in
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:38:23 -07:00
Jakub Kicinski
f99c579211 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2025-10-28 (ice, ixgbe, igb, igc)

For ice, Grzegorz fixes setting of PHY lane number and logical PF ID for
E82x devices. He also corrects access of CGU (Clock Generation Unit) on
dual complex devices.

Kohei Enju resolves issues with error path cleanup for probe when in
recovery mode on ixgbe and ensures PHY is powered on for link testing
on igc. Lastly, he converts incorrect use of -ENOTSUPP to -EOPNOTSUPP
on igb, igc, and ixgbe.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ixgbe: use EOPNOTSUPP instead of ENOTSUPP in ixgbe_ptp_feature_enable()
  igc: use EOPNOTSUPP instead of ENOTSUPP in igc_ethtool_get_sset_count()
  igb: use EOPNOTSUPP instead of ENOTSUPP in igb_get_sset_count()
  igc: power up the PHY before the link test
  ixgbe: fix memory leak and use-after-free in ixgbe_recovery_probe()
  ice: fix usage of logical PF id
  ice: fix destination CGU for dual complex E825
  ice: fix lane number calculation
====================

Link: https://patch.msgid.link/20251028202515.675129-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:23:11 -07:00
Miaoqing Pan
9222582ec5 Revert "wifi: ath12k: Fix missing station power save configuration"
This reverts commit 4b66d18918.

In [1], Ross Brown reports poor performance of WCN7850 after enabling
power save. Temporarily revert the fix; it will be re-enabled once
the issue is resolved.

Tested-on: WCN7850 hw2.0 PCI WLAN.IOE_HMT.1.1-00011-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1

Fixes: 4b66d18918 ("wifi: ath12k: Fix missing station power save configuration")
Reported-by: Ross Brown <true.robot.ross@gmail.com>
Closes: https://lore.kernel.org/all/CAMn66qZENLhDOcVJuwUZ3ir89PVtVnQRq9DkV5xjJn1p6BKB9w@mail.gmail.com/ # [1]
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20251028060744.897198-1-miaoqing.pan@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-10-29 08:12:14 -07:00
Jijie Shao
7e2958aee5 net: hibmcge: fix the inappropriate netif_device_detach()
current, driver will call netif_device_detach() in
pci_error_handlers.error_detected() and do reset in
pci_error_handlers.slot_reset().
However, if pci_error_handlers.slot_reset() is not called
after pci_error_handlers.error_detected(),
driver will be detached and unable to recover.

drivers/pci/pcie/err.c/report_error_detected() says:
  If any device in the subtree does not have an error_detected
  callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
  error callbacks of any device in the subtree, and will
  exit in the disconnected error state.

Therefore, when the hibmcge device and other devices that do not
support the error_detected callback are under the same subtree,
hibmcge will be unable to do slot_reset even for non-fatal errors.

This path move netif_device_detach() from error_detected() to slot_reset(),
ensuring that detach and reset are always executed together.

Fixes: fd394a334b ("net: hibmcge: Add support for abnormal irq handling feature")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20251025014642.265259-4-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28 19:11:04 -07:00
Jijie Shao
71eb8d1e07 net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy.
hibmcge driver uses fixed_phy to configure scenarios without PHY,
where the driver is always in a linked state. However,
there might be no link in hardware, so the np_link error
is detected in hbg_hw_adjust_link(), which can cause abnormal logs.

Therefore, in scenarios without a PHY, the driver no longer
checks the np_link status.

Fixes: 1d7cd7a9c6 ("net: hibmcge: support scenario without PHY")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20251025014642.265259-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28 19:11:04 -07:00
Jijie Shao
12d2303db8 net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
irq initialized with the macro HBG_ERR_IRQ_I will automatically
be re-enabled, whereas those initialized with the macro HBG_IRQ_I
will not be re-enabled.

Since the rx buf avl irq is initialized using the macro HBG_IRQ_I,
it needs to be actively re-enabled;
otherwise priv->stats.rx_fifo_less_empty_thrsld_cnt cannot be
correctly incremented.

Fixes: fd394a334b ("net: hibmcge: Add support for abnormal irq handling feature")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251025014642.265259-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28 19:11:04 -07:00
Pavel Zhigulin
d8d2b1f815 net: cxgb4/ch_ipsec: fix potential use-after-free in ch_ipsec_xfrm_add_state() callback
In ch_ipsec_xfrm_add_state() there is not check of try_module_get
return value. It is very unlikely, but try_module_get() could return
false value, which could cause use-after-free error.
Conditions: The module count must be zero, and a module unload in
progress. The thread doing the unload is blocked somewhere.
Another thread makes a callback into the module for some request
that (for instance) would need to create a kernel thread.
It tries to get a reference for the thread.
So try_module_get(THIS_MODULE) is the right call - and will fail here.

This fix adds checking the result of try_module_get call

Fixes: 6dad4e8ab3 ("chcr: Add support for Inline IPSec")
Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com>
Link: https://patch.msgid.link/20251024161304.724436-1-Pavel.Zhigulin@kaspersky.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28 18:40:20 -07:00
Kohei Enju
f82acf6fb4 ixgbe: use EOPNOTSUPP instead of ENOTSUPP in ixgbe_ptp_feature_enable()
When the requested PTP feature is not supported,
ixgbe_ptp_feature_enable() returns -ENOTSUPP, causing userland programs
to get "Unknown error 524".

Since EOPNOTSUPP should be used when error is propagated to userland,
return -EOPNOTSUPP instead of -ENOTSUPP.

Fixes: 3a6a4edaa5 ("ixgbe: Hardware Timestamping + PTP Hardware Clock (PHC)")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Kohei Enju
21d08d1c4c igc: use EOPNOTSUPP instead of ENOTSUPP in igc_ethtool_get_sset_count()
igc_ethtool_get_sset_count() returns -ENOTSUPP when a given stringset is
not supported, causing userland programs to get "Unknown error 524".

Since EOPNOTSUPP should be used when error is propagated to userland,
return -EOPNOTSUPP instead of -ENOTSUPP.

Fixes: 36b9fea609 ("igc: Add support for statistics")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Kohei Enju
bc73c5885c igb: use EOPNOTSUPP instead of ENOTSUPP in igb_get_sset_count()
igb_get_sset_count() returns -ENOTSUPP when a given stringset is not
supported, causing userland programs to get "Unknown error 524".

Since EOPNOTSUPP should be used when error is propagated to userland,
return -EOPNOTSUPP instead of -ENOTSUPP.

Fixes: 9d5c824399 ("igb: PCI-Express 82575 Gigabit Ethernet driver")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Kohei Enju
81fb1fe75c igc: power up the PHY before the link test
The current implementation of the igc driver doesn't power up the PHY
before the link test in igc_ethtool_diag_test(), causing the link test
to always report FAIL when admin state is down and the PHY is
consequently powered down.

To test the link state regardless of admin state, power up the PHY
before the link test in the offline test path. After the link test, the
original PHY state is restored by igc_reset(), so additional code which
explicitly restores the original state is not necessary.

Note that this change is applied only for the offline test path. This is
because in the online path we shouldn't interrupt normal networking
operation and powering up the PHY and restoring the original state would
interrupt that.

This implementation also uses igc_power_up_phy_copper() without checking
the media type, since igc devices are currently only copper devices and
the function is called in other places without checking the media type.

Furthermore, the powering up is on a best-effort basis, that is, we
don't handle failures of powering up (e.g. bus error) and just let the
test report FAIL.

Tested on Intel Corporation Ethernet Controller I226-V (rev 04) with
cable connected and link available.

Set device down and do ethtool test.
  # ip link set dev enp0s5 down

Without patch:
  # ethtool --test enp0s5
  The test result is FAIL
  The test extra info:
  Register test  (offline)         0
  Eeprom test    (offline)         0
  Interrupt test (offline)         0
  Loopback test  (offline)         0
  Link test   (on/offline)         1

With patch:
  # ethtool --test enp0s5
  The test result is PASS
  The test extra info:
  Register test  (offline)         0
  Eeprom test    (offline)         0
  Interrupt test (offline)         0
  Loopback test  (offline)         0
  Link test   (on/offline)         0

Fixes: f026d8ca29 ("igc: add support to eeprom, registers and link self-tests")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Kohei Enju
85308d999c ixgbe: fix memory leak and use-after-free in ixgbe_recovery_probe()
The error path of ixgbe_recovery_probe() has two memory bugs.

For non-E610 adapters, the function jumps to clean_up_probe without
calling devlink_free(), leaking the devlink instance and its embedded
adapter structure.

For E610 adapters, devlink_free() is called at shutdown_aci, but
clean_up_probe then accesses adapter->state, sometimes triggering
use-after-free because adapter is embedded in devlink. This UAF is
similar to the one recently reported in ixgbe_remove(). (Link)

Fix both issues by moving devlink_free() after adapter->state access,
aligning with the cleanup order in ixgbe_probe().

Link: https://lore.kernel.org/intel-wired-lan/20250828020558.1450422-1-den@valinux.co.jp/
Fixes: 29cb3b8d95 ("ixgbe: add E610 implementation of FW recovery mode")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Grzegorz Nitka
9a0f81fc64 ice: fix usage of logical PF id
In some devices, the function numbers used are non-contiguous. For
example, here is such configuration for E825 device:

root@/home/root# lspci -v | grep Eth
0a:00.0 Ethernet controller: Intel Corporation Ethernet Connection
E825-C for backplane (rev 04)
0a:00.1 Ethernet controller: Intel Corporation Ethernet Connection
E825-C for backplane (rev 04)
0a:00.4 Ethernet controller: Intel Corporation Ethernet Connection
E825-C 10GbE (rev 04)
0a:00.5 Ethernet controller: Intel Corporation Ethernet Connection
E825-C 10GbE (rev 04)

When distributing RSS and FDIR masks, which are global resources across
the active devices, it is required to have a contiguous PF id, which can
be described as a logical PF id. In the case above, function 0 would
have a logical PF id of 0, function 1 would have a logical PF id
of 1, and functions 4 and 5 would have a logical PF ids 2 and 3
respectively.
Using logical PF id can properly describe which slice of resources can
be used by a particular PF.

The 'function id' to 'logical id' mapping has been introduced with the
commit 015307754a ("ice: Support VF queue rate limit and quanta size
configuration"). However, the usage of 'logical_pf_id' field was
unintentionally skipped for profile mask configuration.
Fix it by using 'logical_pf_id' instead of 'pf_id' value when configuring
masks.

Without that patch, wrong indexes, i.e. out of range for given PF, can
be used while configuring resources masks, which might lead to memory
corruption and undefined driver behavior.
The call trace below is one of the examples of such error:

[  +0.000008] WARNING: CPU: 39 PID: 3830 at drivers/base/devres.c:1095
devm_kfree+0x70/0xa0
[  +0.000002] RIP: 0010:devm_kfree+0x70/0xa0
[  +0.000001] Call Trace:
[  +0.000002]  <TASK>
[  +0.000002]  ice_free_hw_tbls+0x183/0x710 [ice]
[  +0.000106]  ice_deinit_hw+0x67/0x90 [ice]
[  +0.000091]  ice_deinit+0x20d/0x2f0 [ice]
[  +0.000076]  ice_remove+0x1fa/0x6a0 [ice]
[  +0.000075]  pci_device_remove+0xa7/0x1d0
[  +0.000010]  device_release_driver_internal+0x365/0x530
[  +0.000006]  driver_detach+0xbb/0x170
[  +0.000003]  bus_remove_driver+0x117/0x290
[  +0.000007]  pci_unregister_driver+0x26/0x250

Fixes: 015307754a ("ice: Support VF queue rate limit and quanta size configuration")
Suggested-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Grzegorz Nitka
4507641306 ice: fix destination CGU for dual complex E825
On dual complex E825, only complex 0 has functional CGU (Clock
Generation Unit), powering all the PHYs.
SBQ (Side Band Queue) destination device 'cgu' in current implementation
points to CGU on current complex and, in order to access primary CGU
from the secondary complex, the driver should use 'cgu_peer' as
a destination device in read/write CGU registers operations.

Define new 'cgu_peer' (15) as RDA (Remote Device Access) client over
SB-IOSF interface and use it as device target when accessing CGU from
secondary complex.

This problem has been identified when working on recovery clock
enablement [1]. In existing implementation for E825 devices, only PF0,
which is clock owner, is involved in CGU configuration, thus the
problem was not exposed to the user.

[1] https://lore.kernel.org/intel-wired-lan/20250905150947.871566-1-grzegorz.nitka@intel.com/

Fixes: e2193f9f9e ("ice: enable timesync operation on 2xNAC E825 devices")
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Arkadiusz Kubalewski <Arkadiusz.kubalewski@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Grzegorz Nitka
e984046131 ice: fix lane number calculation
E82X adapters do not have sequential IDs, lane number is PF ID.

Add check for ICE_MAC_GENERIC and skip checking port options.

Also, adjust logical port number for specific E825 device with external
PHY support (PCI device id 0x579F). For this particular device,
with 2x25G (PHY0) and 2x10G (PHY1) port configuration, modification of
pf_id -> lane_number mapping is required. PF IDs on the 2nd PHY start
from 4 in such scenario. Otherwise, the lane number cannot be
determined correctly, leading to PTP init errors during PF initialization.

Fixes: 258f5f9058 ("ice: Add correct PHY lane assignment")
Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Milena Olech <milena.olech@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-10-28 12:49:59 -07:00
Abdun Nihaal
46a499aaf8 sfc: fix potential memory leak in efx_mae_process_mport()
In efx_mae_enumerate_mports(), memory allocated for mae_mport_desc is
passed as a argument to efx_mae_process_mport(), but when the error path
in efx_mae_process_mport() gets executed, the memory allocated for desc
gets leaked.

Fix that by freeing the memory allocation before returning error.

Fixes: a6a15aca42 ("sfc: enumerate mports in ef100")
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Link: https://patch.msgid.link/20251023141844.25847-1-nihaal@cse.iitm.ac.in
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-27 17:55:41 -07:00
Jijie Shao
03ca7c8c42 net: hns3: return error code when function fails
Currently, in hclge_mii_ioctl(), the operation to
read the PHY register (SIOCGMIIREG) always returns 0.

This patch changes the return type of hclge_read_phy_reg(),
returning an error code when the function fails.

Fixes: 024712f51e ("net: hns3: add ioctl support for imp-controlled PHYs")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251023131338.2642520-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-27 17:52:53 -07:00
Baochen Qiang
2469bb6a6a Revert "wifi: ath10k: avoid unnecessary wait for service ready message"
This reverts commit 51a73f1b2e.

Although this commit benefits QCA6174, it breaks QCA988x and
QCA9984 [1][2]. Since it is not likely to root cause/fix this
issue in a short time, revert it to get those chips back.

Compile tested only.

Fixes: 51a73f1b2e ("wifi: ath10k: avoid unnecessary wait for service ready message")
Link: https://lore.kernel.org/ath10k/6d41bc00602c33ffbf68781f563ff2e6c6915a3e.camel@gmail.com # [1]
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220671 # [2]
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251027-ath10k-revert-polling-first-change-v1-1-89aaf3bcbfa1@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-10-27 07:35:24 -07:00
Emanuele Ghidoli
84a905290c net: phy: dp83867: Disable EEE support as not implemented
While the DP83867 PHYs report EEE capability through their feature
registers, the actual hardware does not support EEE (see Links).
When the connected MAC enables EEE, it causes link instability and
communication failures.

The issue is reproducible with a iMX8MP and relevant stmmac ethernet port.
Since the introduction of phylink-managed EEE support in the stmmac driver,
EEE is now enabled by default, leading to issues on systems using the
DP83867 PHY.

Call phy_disable_eee during phy initialization to prevent EEE from being
enabled on DP83867 PHYs.

Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/1445244/dp83867ir-dp83867-disable-eee-lpi
Link: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/658638/dp83867ir-eee-energy-efficient-ethernet
Fixes: 2a10154abc ("net: phy: dp83867: Add TI dp83867 phy")
Cc: stable@vger.kernel.org
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251023144857.529566-1-ghidoliemanuele@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-24 19:10:03 -07:00
Bui Quang Minh
1ab6658174 virtio-net: drop the multi-buffer XDP packet in zerocopy
In virtio-net, we have not yet supported multi-buffer XDP packet in
zerocopy mode when there is a binding XDP program. However, in that
case, when receiving multi-buffer XDP packet, we skip the XDP program
and return XDP_PASS. As a result, the packet is passed to normal network
stack which is an incorrect behavior (e.g. a XDP program for packet
count is installed, multi-buffer XDP packet arrives and does go through
XDP program. As a result, the packet count does not increase but the
packet is still received from network stack).This commit instead returns
XDP_ABORTED in that case.

Fixes: 99c861b44e ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Link: https://patch.msgid.link/20251022155630.49272-1-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 17:30:40 -07:00
Lizhi Xu
420c84c330 usbnet: Prevents free active kevent
The root cause of this issue are:
1. When probing the usbnet device, executing usbnet_link_change(dev, 0, 0);
put the kevent work in global workqueue. However, the kevent has not yet
been scheduled when the usbnet device is unregistered. Therefore, executing
free_netdev() results in the "free active object (kevent)" error reported
here.

2. Another factor is that when calling usbnet_disconnect()->unregister_netdev(),
if the usbnet device is up, ndo_stop() is executed to cancel the kevent.
However, because the device is not up, ndo_stop() is not executed.

The solution to this problem is to cancel the kevent before executing
free_netdev().

Fixes: a69e617e53 ("usbnet: Fix linkwatch use-after-free on disconnect")
Reported-by: Sam Sun <samsun1006219@gmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=8bfd7bcc98f7300afb84
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Link: https://patch.msgid.link/20251022024007.1831898-1-lizhi.xu@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 17:27:44 -07:00
Jakub Kicinski
a83155cc4e Merge tag 'wireless-2025-10-23' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
First set of fixes:
 - brcmfmac: long-standing crash when used w/o P2P
 - iwlwifi: fix for a use-after-free bug
 - mac80211: key tailroom accounting bug could leave
             allocation overhead and cause a warning
 - ath11k: add a missing platform,
           fix key flag operations
 - bcma: skip devices disabled in OF/DT
 - various (potential) memory leaks

* tag 'wireless-2025-10-23' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: nl80211: call kfree without a NULL check
  wifi: mac80211: fix key tailroom accounting leak
  wifi: brcmfmac: fix crash while sending Action Frames in standalone AP Mode
  MAINTAINERS: wcn36xx: Add linux-wireless list
  bcma: don't register devices disabled in OF
  wifi: mac80211: reset FILS discovery and unsol probe resp intervals
  wifi: iwlwifi: fix potential use after free in iwl_mld_remove_link()
  wifi: ath11k: avoid bit operation on key flags
  wifi: ath12k: free skb during idr cleanup callback
  wifi: ath11k: Add missing platform IDs for quirk table
  wifi: ath10k: Fix memory leak on unsupported WMI command
====================

Link: https://patch.msgid.link/20251023180604.626946-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 17:15:48 -07:00
Patrisious Haddad
664f76be38 net/mlx5: Fix IPsec cleanup over MPV device
When we do mlx5e_detach_netdev() we eventually disable blocking events
notifier, among those events are IPsec MPV events from IB to core.

So before disabling those blocking events, make sure to also unregister
the devcom device and mark all this device operations as complete,
in order to prevent the other device from using invalid netdev
during future devcom events which could cause the trace below.

BUG: kernel NULL pointer dereference, address: 0000000000000010
PGD 146427067 P4D 146427067 PUD 146488067 PMD 0
Oops: Oops: 0000 [#1] SMP
CPU: 1 UID: 0 PID: 7735 Comm: devlink Tainted: GW 6.12.0-rc6_for_upstream_min_debug_2024_11_08_00_46 #1
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
Code: 00 01 48 83 05 23 32 1e 00 01 41 b8 ed ff ff ff e9 60 ff ff ff 48 83 05 00 32 1e 00 01 eb e3 66 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 47 10 48 83 05 5f 32 1e 00 01 48 8b 50 40 48 85 d2 74 05 40
RSP: 0018:ffff88811a5c35f8 EFLAGS: 00010206
RAX: ffff888106e8ab80 RBX: ffff888107d7e200 RCX: ffff88810d6f0a00
RDX: ffff88810d6f0a00 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff88811a17e620 R08: 0000000000000040 R09: 0000000000000000
R10: ffff88811a5c3618 R11: 0000000de85d51bd R12: ffff88811a17e600
R13: ffff88810d6f0a00 R14: 0000000000000000 R15: ffff8881034bda80
FS:  00007f27bdf89180(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000010f159005 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? __die+0x20/0x60
 ? page_fault_oops+0x150/0x3e0
 ? exc_page_fault+0x74/0x130
 ? asm_exc_page_fault+0x22/0x30
 ? mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
 mlx5e_devcom_event_mpv+0x42/0x60 [mlx5_core]
 mlx5_devcom_send_event+0x8c/0x170 [mlx5_core]
 blocking_event+0x17b/0x230 [mlx5_core]
 notifier_call_chain+0x35/0xa0
 blocking_notifier_call_chain+0x3d/0x60
 mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core]
 mlx5_core_mp_event_replay+0x12/0x20 [mlx5_core]
 mlx5_ib_bind_slave_port+0x228/0x2c0 [mlx5_ib]
 mlx5_ib_stage_init_init+0x664/0x9d0 [mlx5_ib]
 ? idr_alloc_cyclic+0x50/0xb0
 ? __kmalloc_cache_noprof+0x167/0x340
 ? __kmalloc_noprof+0x1a7/0x430
 __mlx5_ib_add+0x34/0xd0 [mlx5_ib]
 mlx5r_probe+0xe9/0x310 [mlx5_ib]
 ? kernfs_add_one+0x107/0x150
 ? __mlx5_ib_add+0xd0/0xd0 [mlx5_ib]
 auxiliary_bus_probe+0x3e/0x90
 really_probe+0xc5/0x3a0
 ? driver_probe_device+0x90/0x90
 __driver_probe_device+0x80/0x160
 driver_probe_device+0x1e/0x90
 __device_attach_driver+0x7d/0x100
 bus_for_each_drv+0x80/0xd0
 __device_attach+0xbc/0x1f0
 bus_probe_device+0x86/0xa0
 device_add+0x62d/0x830
 __auxiliary_device_add+0x3b/0xa0
 ? auxiliary_device_init+0x41/0x90
 add_adev+0xd1/0x150 [mlx5_core]
 mlx5_rescan_drivers_locked+0x21c/0x300 [mlx5_core]
 esw_mode_change+0x6c/0xc0 [mlx5_core]
 mlx5_devlink_eswitch_mode_set+0x21e/0x640 [mlx5_core]
 devlink_nl_eswitch_set_doit+0x60/0xe0
 genl_family_rcv_msg_doit+0xd0/0x120
 genl_rcv_msg+0x180/0x2b0
 ? devlink_get_from_attrs_lock+0x170/0x170
 ? devlink_nl_eswitch_get_doit+0x290/0x290
 ? devlink_nl_pre_doit_port_optional+0x50/0x50
 ? genl_family_rcv_msg_dumpit+0xf0/0xf0
 netlink_rcv_skb+0x54/0x100
 genl_rcv+0x24/0x40
 netlink_unicast+0x1fc/0x2d0
 netlink_sendmsg+0x1e4/0x410
 __sock_sendmsg+0x38/0x60
 ? sockfd_lookup_light+0x12/0x60
 __sys_sendto+0x105/0x160
 ? __sys_recvmsg+0x4e/0x90
 __x64_sys_sendto+0x20/0x30
 do_syscall_64+0x4c/0x100
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f27bc91b13a
Code: bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 fa 96 2c 00 45 89 c9 4c 63 d1 48 63 ff 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 f3 c3 0f 1f 40 00 41 55 41 54 4d 89 c5 55
RSP: 002b:00007fff369557e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000009c54b10 RCX: 00007f27bc91b13a
RDX: 0000000000000038 RSI: 0000000009c54b10 RDI: 0000000000000006
RBP: 0000000009c54920 R08: 00007f27bd0030e0 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
 </TASK>
Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_fwctl mlx5_ib ib_uverbs ib_core mlx5_core
CR2: 0000000000000010

Fixes: 82f9378c44 ("net/mlx5: Handle IPsec steering upon master unbind/bind")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761136182-918470-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 07:14:33 -07:00
Patrisious Haddad
8f82f89550 net/mlx5: Refactor devcom to return NULL on failure
Devcom device and component registration isn't always critical to the
functionality of the caller, hence the registration can fail and we can
continue working with an ERR_PTR value saved inside a variable.

In order to avoid that make sure all devcom failures return NULL.

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761136182-918470-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 07:14:33 -07:00
Alexei Lazar
d58a9a917a net/mlx5e: Skip PPHCR register query if not supported by the device
Check the PCAM supported registers mask before querying the PPHCR
register, as it is not supported in older devices.

Fixes: 44907e7c8f ("net/mlx5e: Add logic to read RS-FEC histogram bin ranges from PPHCR")
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761136182-918470-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 07:14:33 -07:00
Robert Marko
399d109347 net: phy: micrel: always set shared->phydev for LAN8814
Currently, during the LAN8814 PTP probe shared->phydev is only set if PTP
clock gets actually set, otherwise the function will return before setting
it.

This is an issue as shared->phydev is unconditionally being used when IRQ
is being handled, especially in lan8814_gpio_process_cap and since it was
not set it will cause a NULL pointer exception and crash the kernel.

So, simply always set shared->phydev to avoid the NULL pointer exception.

Fixes: b3f1a08fcf ("net: phy: micrel: Add support for PTP_PF_EXTTS for lan8814")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://patch.msgid.link/20251021132034.983936-1-robert.marko@sartura.hr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 07:11:48 -07:00
Ralf Lici
efd729408b ovpn: use datagram_poll_queue for socket readiness in TCP
openvpn TCP encapsulation uses a custom queue to deliver packets to
userspace. Currently it relies on datagram_poll, which checks
sk_receive_queue, leading to false readiness signals when that queue
contains non-userspace packets.

Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's
user_queue, ensuring poll only signals readiness when userspace data is
actually available. Also refactor ovpn_tcp_poll in order to enforce the
assumption we can make on the lifetime of ovpn_sock and peer.

Fixes: 11851cbd60 ("ovpn: implement TCP transport")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251021100942.195010-4-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 15:46:04 +02:00
Tonghao Zhang
10843e1492 net: bonding: fix possible peer notify event loss or dup issue
If the send_peer_notif counter and the peer event notify are not synchronized.
It may cause problems such as the loss or dup of peer notify event.

Before this patch:
- If should_notify_peers is true and the lock for send_peer_notif-- fails, peer
  event may be sent again in next mii_monitor loop, because should_notify_peers
  is still true.
- If should_notify_peers is true and the lock for send_peer_notif-- succeeded,
  but the lock for peer event fails, the peer event will be lost.

This patch locks the RTNL for send_peer_notif, events, and commit simultaneously.

Fixes: 07a4ddec3c ("bonding: add an option to specify a delay between peer notifications")
Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Vincent Bernat <vincent@bernat.ch>
Cc: <stable@vger.kernel.org>
Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20251021050933.46412-1-tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-23 13:08:53 +02:00
Lad Prabhakar
706136c572 net: ravb: Ensure memory write completes before ringing TX doorbell
Add a final dma_wmb() barrier before triggering the transmit request
(TCCR_TSRQ) to ensure all descriptor and buffer writes are visible to
the DMA engine.

According to the hardware manual, a read-back operation is required
before writing to the doorbell register to guarantee completion of
previous writes. Instead of performing a dummy read, a dma_wmb() is
used to both enforce the same ordering semantics on the CPU side and
also to ensure completion of writes.

Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Cc: stable@vger.kernel.org
Co-developed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251017151830.171062-5-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-22 18:15:14 -07:00