Commit Graph

54993 Commits

Author SHA1 Message Date
Cosmin Ratiu
64ad6470c8 net/mlx5: Move SF dev table notifier registration outside the PF devlink lock
This completes the previous patches by moving notifier registration for
SF dev tables outside the devlink locked critical section in
mlx5_init_one() / mlx5_uninit_one() and into the mlx5_mdev_init() /
mlx5_mdev_uninit() functions.

This is only done for non-SFs, since SFs do not have a SF HW table
themselves.

After this patch, notifiers can grab the PF devlink lock (soon to be
necessary) without creating a locking cycle.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763325940-1231508-7-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:32:28 -08:00
Cosmin Ratiu
d4a0acbd94 net/mlx5: Move the SF table notifiers outside the devlink lock
Move the SF table notifiers registration/unregistration outside of
mlx5_init_one() / mlx5_uninit_one() and into the mlx5_mdev_init() /
mlx5_mdev_uninit() functions.

This is only done for non-SFs, since SFs do not have a SF table
themselves and thus don't need notifiers.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763325940-1231508-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:32:28 -08:00
Cosmin Ratiu
e63c9c5f0a net/mlx5: Move the SF HW table notifier outside the devlink lock
Move the SF HW table notifier registration/unregistration outside of
mlx5_init_one() / mlx5_uninit_one() and into the mlx5_mdev_init() /
mlx5_mdev_uninit() functions.

This is only done for non-SFs, since SFs do not have a SF HW table
themselves.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763325940-1231508-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:32:27 -08:00
Cosmin Ratiu
d3a356db85 net/mlx5: Move the vhca event notifier outside of the devlink lock
The vhca event notifier consists of an atomic notifier for vhca state
changes (used for SF events), multiple workqueues and a blocking
notifier chain for delivering the vhca state change events for further
processing.

This patch moves the vhca notifier head outside of mlx5_init_one() /
mlx5_uninit_one() and into the mlx5_mdev_init() / mlx5_mdev_uninit()
functions.

This allows called notifiers to grab the PF devlink lock which was
previously impossible because it would create a circular lock
dependency.

mlx5_vhca_event_stop() is now called earlier in the cleanup phase and
flushes the workqueues to ensure that after the call, there are no
pending events. This simplifies the cleanup flow for vhca event
consumers.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763325940-1231508-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:32:27 -08:00
Cosmin Ratiu
3fee828789 net/mlx5: Move the esw mode notifier chain outside the devlink lock
The esw mode change notifier chain is initialized/cleaned up in
mlx5_init_one() / mlx5_uninit_one() with the devlink lock held.

Move the notifier head from the eswitch struct into mlx5_priv directly,
and initialize it outside the critical section. This will allow notifier
registration to happen earlier in the init procedure in subsequent
patches.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763325940-1231508-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:32:27 -08:00
Cosmin Ratiu
b6b03097f9 net/mlx5: Initialize events outside devlink lock
Move event init/cleanup outside of mlx5_init_one() / mlx5_uninit_one()
and into the mlx5_mdev_init() / mlx5_mdev_uninit() functions.

By doing this, we avoid the events being reinitialized on devlink reload
and, more importantly, the events->sw_nh notifier chain becomes
available earlier in the init procedure, which will be used in
subsequent patches. This makes sense because the events struct is pure
software, independent of any HW details.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763325940-1231508-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:32:27 -08:00
Aditya Garg
45120304e8 net: mana: Drop TX skb on post_work_request failure and unmap resources
Drop TX packets when posting the work request fails and ensure DMA
mappings are always cleaned up.

Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763464269-10431-3-git-send-email-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:11:57 -08:00
Aditya Garg
934fa943b5 net: mana: Handle SKB if TX SGEs exceed hardware limit
The MANA hardware supports a maximum of 30 scatter-gather entries (SGEs)
per TX WQE. Exceeding this limit can cause TX failures.
Add ndo_features_check() callback to validate SKB layout before
transmission. For GSO SKBs that would exceed the hardware SGE limit, clear
NETIF_F_GSO_MASK to enforce software segmentation in the stack.
Add a fallback in mana_start_xmit() to linearize non-GSO SKBs that still
exceed the SGE limit.

Also, Add ethtool counter for SKBs linearized

Co-developed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763464269-10431-2-git-send-email-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:11:57 -08:00
Anshumali Gaur
929ca3bcea octeontx2-af: Skip TM tree print for disabled SQs
Currently, the TM tree is printing all SQ topology including those
which are not enabled, this results in redundant output for SQs
which are not active. This patch adds a check in print_tm_tree()
to skip printing the TM tree hierarchy if the SQ is not enabled.

Signed-off-by: Anshumali Gaur <agaur@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251118054235.1599714-1-agaur@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:06:22 -08:00
Russell King (Oracle)
bf351bbec5 net: stmmac: stmmac_is_jumbo_frm() returns boolean
stmmac_is_jumbo_frm() returns whether the driver considers the frame
size to be a jumbo frame, and thus returns 0/1 values. This is boolean,
so convert it to return a boolean and use false/true instead. Also
convert stmmac_xmit()'s is_jumbo to be bool, which causes several
variables to be repositioned to keep it in reverse Christmas-tree
order.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vLIWW-0000000Ewkl-21Ia@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 08:33:49 -08:00
Russell King (Oracle)
b5adada61e net: stmmac: stmmac_is_jumbo_frm() len should be unsigned
stmmac_is_jumbo_frm() and the is_jumbo_frm() methods take skb->len
which is an unsigned int. Avoid an implicit cast to "int" via the
method parameter and then incorrectly doing signed comparisons on
this unsigned value.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vLIWR-0000000Ewkf-1Tdx@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 08:33:49 -08:00
Russell King (Oracle)
7ac60a14d3 net: stmmac: convert priv->sph* to boolean and rename
priv->sph* only have 'true' and 'false' used with them, yet they are an
int. Change their type to a bool, and rename to make their usage more
clear.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vLIDN-0000000Evur-2NLU@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 08:32:38 -08:00
Tariq Toukan
70ca239b61 net/mlx5: Use EOPNOTSUPP instead of ENOTSUPP
Per Documentation/dev-tools/checkpatch.rst, ENOTSUPP is not a standard
error code and should be avoided. EOPNOTSUPP should be used instead.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/1763415729-1238421-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 18:53:34 -08:00
Saeed Mahameed
fbb9933666 net/mlx5: Abort new commands if all command slots are stalled
In case of a FW issue, FW might be not responding to FW commands,
causing kernel lockout for a long period of time, e.g. rtnl_lock held
while ethtool is trying to collect stats waiting for FW to respond to
multiple commands, when all of them will timeout.

While there's no immediate indication of the FW lockout, we can safely
assume that something is wrong when all command slots are busy and in
a timeout state and no FW completion was received on any of them.

In such case, start immediately failing new commands.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763415729-1238421-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 18:53:34 -08:00
Carolina Jubran
ea3270351c net/mlx5: Remove redundant bw_share minimal value assignment
Remove unnecessary logic that sets bw_share to minimal value, when
parent has bw_share configured but nodes don't have min_rate.

This check is redundant because the parent bandwidth acts as the upper
bound regardless, and the firmware always enforces the topmost
bandwidth constraint.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763415729-1238421-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 18:53:33 -08:00
Carolina Jubran
391dad2e68 net/mlx5e: Recover SQ on excessive PTP TX timestamp delta
Extend the TX timestamp handler to recover the SQ when the difference
between the port and CQE TX timestamps is abnormally large.

The current logic aborts timestamp delivery if the delta exceeds
1/128 seconds, which matches the maximum expected packet interval in
ptp4l. A larger delta makes the timestamps unreliable.

This change adds recovery if the delta exceeds 0.5 seconds. Such a
large gap should not occur in normal operation and indicates that
firmware is stuck or metadata tracking is out of sync, leading to stale
or mismatched timestamps. Recovering the SQ ensures forward progress
and avoids silently dropping invalid timestamps.

The timestamp handler now takes mlx5e_ptpsq directly to access both CQ
stats and the recovery state.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763415729-1238421-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 18:53:33 -08:00
Gal Pressman
2e4c44b12f net/mlx5: Refactor EEPROM query error handling to return status separately
Matthew and Jakub reported [1] issues where inventory automation tools
are calling EEPROM query repeatedly on a port that doesn't have an SFP
connected, resulting in millions of error prints.

Move MCIA register status extraction from the query functions to the
callers, allowing use of extack reporting instead of a dmesg print when
using the netlink API.

[1] https://lore.kernel.org/netdev/20251028194011.39877-1-mattc@purestorage.com/

Cc: Matthew W Carlis <mattc@purestorage.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763415729-1238421-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 18:53:33 -08:00
Tim Hostetler
1b42e07af1 gve: Add Rx HWTS metadata to AF_XDP ZC mode
By overlaying the struct gve_xdp_buff on top of the struct xdp_buff_xsk
that AF_XDP utilizes, the driver records the 32 bit timestamp via the
completion descriptor and the cached 64 bit NIC timestamp via gve_priv.

The driver's implementation of xmo_rx_timestamp extends the timestamp to
the full and up to date 64 bit timestamp and returns it to the user.

gve_rx_xsk_dqo is modified to accept a pointer to the completion
descriptor and no longer takes a buf_len explicitly as it can be pulled
out of the descriptor.

With this patch gve now supports bpf_xdp_metadata_rx_timestamp.

Signed-off-by: Tim Hostetler <thostet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20251114211146.292068-5-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 15:52:43 +01:00
Tim Hostetler
66adaf1021 gve: Prepare bpf_xdp_metadata_rx_timestamp support
Support populating XDP RX metadata with hardware RX timestamps. This
patch utilizes the same underlying logic to calculate hardware
timestamps as the regular RX path.

xdp_metadata_ops is registered with the net_device in a future patch.

gve_rx_calculate_hwtstamp was pulled out so as to not duplicate logic
between gve_xdp_rx_timestamp and gve_rx_hwtstamp.

Signed-off-by: Tim Hostetler <thostet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20251114211146.292068-4-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 15:52:43 +01:00
Tim Hostetler
f356a66b87 gve: Wrap struct xdp_buff
RX timestamping will need to keep track of extra temporary information
per-packet. In preparation for this, introduce gve_xdp_buff to wrap the
xdp_buff. This is similar in function to stmmac_xdp_buff and
ice_xdp_buff.

Signed-off-by: Tim Hostetler <thostet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20251114211146.292068-3-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 15:52:42 +01:00
Tim Hostetler
46e7860ef9 gve: Move ptp_schedule_worker to gve_init_clock
Previously, gve had been only initializing ptp aux work when
hardware timestamping was initialized through ndo_hwtsatmp_set. As this
patch series introduces XDP hardware timestamp metadata which will
require the ptp aux work, the work can't be gated on the
kernel_hwtstamp_config being set and must be initialized elsewhere.

For simplicity, ptp_schedule_worker is invoked right after the ptp_clock
is registered with the kernel (which happens during gve_probe or
following reset). The worker is scheduled in GVE_NIC_TS_SYNC_INTERVAL_MS
as the synchronous call to gve_clock_nic_ts_read makes the worker
redundant if scheduled immediately.

If gve cannot read the device clock immediately, it errors out of
gve_init_clock.

Signed-off-by: Tim Hostetler <thostet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20251114211146.292068-2-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 15:52:42 +01:00
Ovidiu Panait
c171e679ee net: stmmac: Disable EEE RX clock stop when VLAN is enabled
On the Renesas RZ/V2H EVK platform, where the stmmac MAC is connected to a
Microchip KSZ9131RNXI PHY, creating or deleting VLAN interfaces may fail
with timeouts:

    # ip link add link end1 name end1.5 type vlan id 5
    15c40000.ethernet end1: Timeout accessing MAC_VLAN_Tag_Filter
    RTNETLINK answers: Device or resource busy

Disabling EEE at runtime avoids the problem:

    # ethtool --set-eee end1 eee off
    # ip link add link end1 name end1.5 type vlan id 5
    # ip link del end1.5

The stmmac hardware requires the receive clock to be running when writing
certain registers, such as those used for MAC address configuration or
VLAN filtering. However, by default the driver enables Energy Efficient
Ethernet (EEE) and allows the PHY to stop the receive clock when the link
is idle. As a result, the RX clock might be stopped when attempting to
access these registers, leading to timeouts and other issues.

Commit dd557266cf ("net: stmmac: block PHY RXC clock-stop")
addressed this issue for most register accesses by wrapping them in
phylink_rx_clk_stop_block()/phylink_rx_clk_stop_unblock() calls.
However, VLAN add/delete operations may be invoked with bottom halves
disabled, where sleeping is not allowed, so using these helpers is not
possible.

Therefore, to fix this, disable the RX clock stop feature in the phylink
configuration if VLAN features are set. This ensures the RX clock remains
active and register accesses succeed during VLAN operations.

Signed-off-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251113112721.70500-3-ovidiu.panait.rb@renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 10:43:41 +01:00
Ovidiu Panait
d9db257236 net: stmmac: Fix VLAN 0 deletion in vlan_del_hw_rx_fltr()
When the "rx-vlan-filter" feature is enabled on a network device, the 8021q
module automatically adds a VLAN 0 hardware filter when the device is
brought administratively up.

For stmmac, this causes vlan_add_hw_rx_fltr() to create a new entry for
VID 0 in the mac_device_info->vlan_filter array, in the following format:

    VLAN_TAG_DATA_ETV | VLAN_TAG_DATA_VEN | vid

Here, VLAN_TAG_DATA_VEN indicates that the hardware filter is enabled for
that VID.

However, on the delete path, vlan_del_hw_rx_fltr() searches the vlan_filter
array by VID only, without verifying whether a VLAN entry is enabled. As a
result, when the 8021q module attempts to remove VLAN 0, the function may
mistakenly match a zero-initialized slot rather than the actual VLAN 0
entry, causing incorrect deletions and leaving stale entries in the
hardware table.

Fix this by verifying that the VLAN entry's enable bit (VLAN_TAG_DATA_VEN)
is set before matching and deleting by VID. This ensures only active VLAN
entries are removed and avoids leaving stale entries in the VLAN filter
table, particularly for VLAN ID 0.

Fixes: ed64639bc1 ("net: stmmac: Add support for VLAN Rx filtering")
Signed-off-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20251113112721.70500-2-ovidiu.panait.rb@renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 10:43:41 +01:00
Heiner Kallweit
28c0074fd4 r8169: bail out from probe if fiber mode is detected on RTL8127AF
It was reported that on a card with RTL8127AF (SFP + DAC) link-up isn't
detected. Realtek hides the SFP behind the internal PHY, which isn't
behaving fully compliance with clause 22 any longer in fiber mode.
Due to not having access to chip documentation there isn't much I can
do for now. Instead of silently failing to detect link-up in fiber mode,
inform the user that fiber mode isn't support and bail out.

The logic to detect fiber mode is borrowed from Realtek's r8127 driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/fab6605a-54e2-4f54-b194-11c2b9caaaa9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:59:00 -08:00
Inochi Amaoto
db37c6e510 net: stmmac: dwmac-sophgo: Add phy interface filter
As the SG2042 has an internal rx delay, the delay should be removed
when initializing the mac, otherwise the phy will be misconfigurated.

Fixes: 543009e2d4 ("net: stmmac: dwmac-sophgo: Add support for Sophgo SG2042 SoC")
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Tested-by: Han Gao <rabenda.cn@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251114003805.494387-4-inochiama@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:57:38 -08:00
Erni Sri Satya Vennela
be4f1d67ec net: mana: Add standard counter rx_missed_errors
Report standard counter stats->rx_missed_errors
using hc_rx_discards_no_wqe from the hardware.

Add a global workqueue to periodically run
mana_query_gf_stats every 2 seconds to get the latest
info in eth_stats and define a driver capability flag
to notify hardware of the periodic queries.

To avoid repeated failures and log flooding, the workqueue
is not rescheduled if mana_query_gf_stats fails on HWC timeout
error and the stats are reset to 0. Other errors are transient
which will not need a VF reset for recovery.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763120599-6331-3-git-send-email-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:52:30 -08:00
Erni Sri Satya Vennela
e275d9091c net: mana: Move hardware counter stats from per-port to per-VF context
Move hardware counter (HC) statistics from mana_port_context to
mana_context to enable sharing stats across multiple network ports
on the same MANA VF. Previously, each network port queried
hardware counters independently using MANA_QUERY_GF_STAT command
(GF = Generic Function stats from GDMA hardware), resulting in
redundant queries when multiple ports existed on the same device.

Isolate hardware counter stats by introducing mana_ethtool_hc_stats
in mana_context and update the code to ensure all stats are properly
reported via ethtool -S <interface>, maintaining consistency with
previous behavior.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763120599-6331-2-git-send-email-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:52:30 -08:00
Russell King (Oracle)
6409249ccc net: stmmac: remove unnecessary .pkt_route queue initialisation
PCI drivers explicitly set .pkt_route to zero. However, as the struct
is allocated using devm_kzalloc(), all members default to zero unless
explicitly initialised. Thus, explicitly setting these to zero is
unnecessary. Remove these. This leaves only stmmac_platform.c where
this is explicitly initialised depending on DT properties.

$ grep '\.pkt_route =' *.c
dwmac-intel.c: plat->rx_queues_cfg[0].pkt_route = 0x0;
dwmac-intel.c: plat->rx_queues_cfg[i].pkt_route = 0x0;
dwmac-loongson.c: plat->rx_queues_cfg[0].pkt_route = 0x0;
stmmac_main.c: if (priv->plat->rx_queues_cfg[queue].pkt_route == 0x0)
stmmac_pci.c: plat->rx_queues_cfg[0].pkt_route = 0x0;
stmmac_pci.c: plat->rx_queues_cfg[i].pkt_route = 0x0;
stmmac_platform.c: plat->rx_queues_cfg[queue].pkt_route = PACKET_AVCPQ;
stmmac_platform.c: plat->rx_queues_cfg[queue].pkt_route = PACKET_PTPQ;
stmmac_platform.c: plat->rx_queues_cfg[queue].pkt_route = PACKET_DCBCPQ;
stmmac_platform.c: plat->rx_queues_cfg[queue].pkt_route = PACKET_UPQ;
stmmac_platform.c: plat->rx_queues_cfg[queue].pkt_route = PACKET_MCBCQ;
stmmac_platform.c: plat->rx_queues_cfg[queue].pkt_route = 0x0;

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvjf-0000000EVkO-1ZaO@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:58 -08:00
Russell King (Oracle)
0a20999ed4 net: stmmac: remove unnecessary .prio queue initialisation
stmmac_platform.c explicitly sets .prio to zero if the snps,priority
property is not present in DT for the queue. However, as the struct
is allocated using devm_kzalloc(), all members default to zero unless
explicitly initialised, and of_property_read_u32() will not write to
its argument if the property is not found. Thus, explicitly setting
these to zero is unnecessary. Remove these.

$ grep '\.prio =' *.c
stmmac_platform.c: plat->rx_queues_cfg[queue].prio = 0;
stmmac_platform.c: plat->tx_queues_cfg[queue].prio = 0;

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvja-0000000EVkI-0zUH@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:58 -08:00
Russell King (Oracle)
c03101cb1b net: stmmac: remove unnecessary .use_prio queue initialisation
Several drivers (see below) explicitly set the queue .use_prio
configuration to false. However, as this structure is allocated using
devm_kzalloc(), all members default to zero unless otherwise explicitly
initialised. .use_prio isn't, so defaults to false. Remove these
unnecessary initialisations, leaving stmmac_platform.c as the only
file that .use_prio is set true.

$ grep 'use_prio =' *.c
dwmac-intel.c: plat->tx_queues_cfg[0].use_prio = false;
dwmac-intel.c: plat->rx_queues_cfg[0].use_prio = false;
dwmac-intel.c: plat->rx_queues_cfg[i].use_prio = false;
dwmac-intel.c: plat->tx_queues_cfg[i].use_prio = false;
dwmac-loongson.c: plat->tx_queues_cfg[0].use_prio = false;
dwmac-loongson.c: plat->rx_queues_cfg[0].use_prio = false;
stmmac_pci.c: plat->tx_queues_cfg[0].use_prio = false;
stmmac_pci.c: plat->rx_queues_cfg[0].use_prio = false;
stmmac_pci.c: plat->tx_queues_cfg[i].use_prio = false;
stmmac_pci.c: plat->rx_queues_cfg[i].use_prio = false;
stmmac_platform.c: plat->rx_queues_cfg[queue].use_prio = false;
stmmac_platform.c: plat->rx_queues_cfg[queue].use_prio = true;
stmmac_platform.c: plat->tx_queues_cfg[queue].use_prio = false;
stmmac_platform.c: plat->tx_queues_cfg[queue].use_prio = true;

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvjV-0000000EVkC-0WAV@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:58 -08:00
Russell King (Oracle)
b6d013b326 net: stmmac: setup default RX channel map in stmmac_plat_dat_alloc()
Setup the default 1:1 RX channel map in stmmac_plat_dat_alloc() and
remove 1:1 initialisations from platform glue drivers.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvjQ-0000000EVk6-05z7@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:58 -08:00
Russell King (Oracle)
d5e788e86f net: stmmac: move initialisation of queues_to_use to stmmac_plat_dat_alloc()
Move the default initialisation of plat_dat->tx_queues_to_use and
plat_dat->rx_queues_to_use to 1 to stmmac_plat_dat_alloc(). This means
platform glue only needs to override this if different.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvjK-0000000EVk0-3qb2@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:58 -08:00
Russell King (Oracle)
bcb145c696 net: stmmac: move initialisation of unicast_filter_entries to stmmac_plat_dat_alloc()
Move the default initialisation of plat_dat->unicast_filter_entries to
1 to stmmac_plat_dat_alloc(). This means platform glue only needs to
override this if different.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvjF-0000000EVju-3LfS@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:58 -08:00
Russell King (Oracle)
07cedb9eed net: stmmac: move initialisation of multicast_filter_bins to stmmac_plat_dat_alloc()
Move the default initialisation of plat_dat->multicast_filter_bins to
HASH_TABLE_SIZE to stmmac_plat_dat_alloc(). This means platform glue
only needs to override this if different.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvjA-0000000EVjo-2qVn@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:57 -08:00
Russell King (Oracle)
528478a746 net: stmmac: move initialisation of maxmtu to stmmac_plat_dat_alloc()
Move the default initialisation of plat_dat->maxmtu to JUMBO_LEN to
stmmac_plat_dat_alloc().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvj5-0000000EVji-2EYA@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:57 -08:00
Russell King (Oracle)
ae4f29712b net: stmmac: move initialisation of clk_csr to stmmac_plat_dat_alloc()
Move the default initialisation of plat_dat->clk_csr to
stmmac_plat_dat_alloc().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJvj0-0000000EVjb-1jDh@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:57 -08:00
Russell King (Oracle)
99e6ddaabd net: stmmac: move initialisation of phy_addr to stmmac_plat_dat_alloc()
Move the default initialisation of plat_dat->phy_addr to
stmmac_plat_dat_alloc().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJviv-0000000EVjV-1CLF@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:57 -08:00
Russell King (Oracle)
511171e47f net: stmmac: add stmmac_plat_dat_alloc()
Add a function to allocate and initialise the plat_stmmacenet_data
structure with default values.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJviq-0000000EVjP-0c0l@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:47:57 -08:00
Vadim Fedorenko
89ae72f21b qede: convert to use ndo_hwtstamp callbacks
The driver implemented SIOCSHWTSTAMP ioctl cmd only, but it stores
configuration in private structure, so it can be reported back to users.
Implement both ndo_hwtstamp_set and ndo_hwtstamp_set callbacks.
ndo_hwtstamp_set implements a check of unsupported 1-step timestamping
and qede_ptp_cfg_filters() becomes void as it cannot fail anymore.

Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251116094610.3932005-3-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:30:12 -08:00
Vadim Fedorenko
889e6af877 bnx2x: convert to use ndo_hwtstamp callbacks
The driver implemented SIOCSHWTSTAMP ioctl command only, but at the same
time it has configuration stored in a private structure. Implement both
ndo_hwtstamp_set and ndo_hwtstamp_get callback using stored info.
ndo_hwtstamp_set callback implements a check for unsupported 1-step
timestamping. The same check is removed from bnx2x_configure_ptp_filters
function as it's not needed anymore. Another call site of
bnx2x_configure_ptp_filters has hwtstamp_ioctl_called guard.

Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251116094610.3932005-2-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:30:12 -08:00
Mohsin Bashir
0135333914 eth: fbnic: Configure RDE settings for pause frame
fbnic supports pause frames. When pause frames are enabled presumably
user expects lossless operation from the NIC. Make sure we configure
RDE (Rx DMA Engine) to DROP_NEVER mode to avoid discards due to delays
in fetching Rx descriptors from the host.

While at it enable DROP_NEVER when NIC only has a single queue
configured. In this case the NIC acts as a FIFO so there's no risk
of head-of-line blocking other queues by making RDE wait. If pause
is disabled this just moves the packet loss from the DMA engine to
the Rx buffer.

Remove redundant call to fbnic_config_drop_mode_rcq(), introduced by
commit 0cb4c0a137 ("eth: fbnic: Implement Rx queue
alloc/start/stop/free"). This call does not add value as
fbnic_enable_rcq(), which is called immediately afterward, already
handles this.

Although we do not support autoneg at this time, preserve tx_pause in
.mac_link_up instead of fbnic_phylink_get_pauseparam()

Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251113232610.1151712-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 16:57:32 -08:00
Breno Leitao
945499665f mlx5: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.

Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count() for both the mlx5 ethernet and IPoIB drivers.

The ETHTOOL_GRXRINGS handling was previously kept in .get_rxnfc() to
support "ethtool -x" when CONFIG_MLX5_EN_RXNFC=n. With the new
dedicated .get_rx_ring_count() callback, this is no longer necessary.

This simplifies the RX ring count retrieval and aligns mlx5 with the new
ethtool API for querying RX ring parameters.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20251113-mlx_grxrings-v1-2-0017f2af7dd0@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 16:57:32 -08:00
Breno Leitao
467c3f008d mlx4: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.

Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().

This simplifies the RX ring count retrieval and aligns mlx4 with the new
ethtool API for querying RX ring parameters. This is compiled tested
only.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20251113-mlx_grxrings-v1-1-0017f2af7dd0@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 16:57:32 -08:00
Jakub Kicinski
c9dfb92de0 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Tariq Toukan says:

====================
mlx5-next updates 2025-11-13

The following pull-request contains common mlx5 updates

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Expose definition for 1600Gbps link mode
  net/mlx5: fs, set non default device per namespace
  net/mlx5: fs, Add other_eswitch support for steering tables
  net/mlx5: Add OTHER_ESWITCH HW capabilities
  net/mlx5: Add direct ST mode support for RDMA
  PCI/TPH: Expose pcie_tph_get_st_table_loc()
  {rdma,net}/mlx5: Query vports mac address from device
====================

Link: https://patch.msgid.link/1763027252-1168760-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:55:38 -08:00
Russell King (Oracle)
1188741cb5 net: stmmac: rk: use PHY_INTF_SEL_x in functions
Rather than defining one xxx_GMAC_PHY_INTF_SEL_xxx() for each mode,
define xxx_GMAC_PHY_INTF_SEL() which takes the phy_intf_sel value.
Pass the appropriate value into these new macros in the set_to_xxx()
methods.

No change to produced code on aarch64.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vJbPG-0000000EBqb-2cF2@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:33:57 -08:00
Russell King (Oracle)
5e37047f74 net: stmmac: rk: use PHY_INTF_SEL_x constants
The values used in the xxx_GMAC_PHY_INTF_SEL_xxx() macros are the
phy_intf_sel values used for the dwmac core. Use these to define these
constants.

No change to produced code on aarch64.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vJbPB-0000000EBqV-27GS@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:33:57 -08:00
Russell King (Oracle)
764ebe423e net: stmmac: rk: convert all bitfields to GRF_FIELD*()
Convert all bitfields to GRF_FIELD() or GRF_FIELD_CONST(), which makes
the bitfield values more readable, and also allows the aarch64 compiler
to produce better code.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vJbP6-0000000EBqP-1cmm@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:33:57 -08:00
Russell King (Oracle)
ebb07edf97 net: stmmac: rk: replace HIWORD_UPDATE() with GRF_FIELD()
Provide GRF_FIELD() which takes the high/low bit numbers of the field
and field value, generates the mask and passes it to FIELD_PREP_WM16.
Replace all HIWORD_UPDATE() instances with this.

No change to produced code on aarch64.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vJbP1-0000000EBqJ-1AjR@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:33:57 -08:00
Breno Leitao
04ca7a69a3 net: bnx2x: convert to use get_rx_ring_count
Convert the bnx2x driver to use the new .get_rx_ring_count ethtool
operation instead of implementing .get_rxnfc solely for handling
ETHTOOL_GRXRINGS command. This simplifies the code by replacing the
switch statement with a direct return of the queue count.

The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251112-bnx_grxrings-v1-1-1c2cb73979e2@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:04:21 -08:00
Breno Leitao
f455d3f02d net: ixgbe: convert to use .get_rx_ring_count
Convert the ixgbe driver to use the new .get_rx_ring_count ethtool
operation for handling ETHTOOL_GRXRINGS command. This simplifies the
code by extracting the ring count logic into a dedicated callback.

The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.

This was compile-tested only.

Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20251113-ixgbe_gxrings-v2-1-0ecf57808a78@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:02:22 -08:00