Commit Graph

130378 Commits

Author SHA1 Message Date
Furong Xu
ba5f78505f net: stmmac: Drop redundant skb_mark_for_recycle() for SKB frags
After commit df542f6693 ("net: stmmac: Switch to zero-copy in
non-XDP RX path"), SKBs are always marked for recycle, it is redundant
to mark SKBs more than once when new frags are appended.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Link: https://patch.msgid.link/20250117062805.192393-1-0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 12:03:27 -08:00
Xiangqian Zhang
f6f2e946aa net: mii: Fix the Speed display when the network cable is not connected
Two different models of usb card, the drivers are r8152 and asix. If no
network cable is connected, Speed = 10Mb/s. This problem is repeated in
linux 3.10, 4.19, 5.4, 6.12. This problem also exists on the latest
kernel. Both drivers call mii_ethtool_get_link_ksettings,
but the value of cmd->base.speed in this
function can only be SPEED_1000 or SPEED_100 or SPEED_10.
When the network cable is not connected, set cmd->base.speed
=SPEED_UNKNOWN.

Signed-off-by: Xiangqian Zhang <zhangxiangqian@kylinos.cn>
Link: https://patch.msgid.link/20250117094603.4192594-1-zhangxiangqian@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 12:02:38 -08:00
Jakub Kicinski
99d028c634 eth: bnxt: update header sizing defaults
300-400B RPC requests are fairly common. With the current default
of 256B HDS threshold bnxt ends up splitting those, lowering PCIe
bandwidth efficiency and increasing the number of memory allocation.

Increase the HDS threshold to fit 4 buffers in a 4k page.
This works out to 640B as the threshold on a typical kernel confing.
This change increases the performance for a microbenchmark which
receives 400B RPCs and sends empty responses by 4.5%.
Admittedly this is just a single benchmark, but 256B works out to
just 6 (so 2 more) packets per head page, because shinfo size
dominates the headers.

Now that we use page pool for the header pages I was also tempted
to default rx_copybreak to 0, but in synthetic testing the copybreak
size doesn't seem to make much difference.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:58 -08:00
Jakub Kicinski
bee018052d eth: bnxt: allocate enough buffer space to meet HDS threshold
Now that we can configure HDS threshold separately from the rx_copybreak
HDS threshold may be higher than rx_copybreak.

We need to make sure that we have enough space for the headers.

Fixes: 6b43673a25 ("bnxt_en: add support for hds-thresh ethtool command")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:58 -08:00
Jakub Kicinski
928459bbda net: ethtool: populate the default HDS params in the core
The core has the current HDS config, it can pre-populate the values
for the drivers. While at it, remove the zero-setting in netdevsim.
Zero are the default values since the config is zalloc'ed.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:58 -08:00
Jakub Kicinski
e58263e911 eth: bnxt: apply hds_thrs settings correctly
Use the pending config for hds_thrs. Core will only update the "current"
one after we return success. Without this change 2 reconfigs would be
required for the setting to reach the device.

Fixes: 6b43673a25 ("bnxt_en: add support for hds-thresh ethtool command")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:57 -08:00
Jakub Kicinski
3c836451ca net: move HDS config from ethtool state
Separate the HDS config from the ethtool state struct.
The HDS config contains just simple parameters, not state.
Having it as a separate struct will make it easier to clone / copy
and also long term potentially make it per-queue.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:57 -08:00
Aleksander Jan Bajkowski
64ff63aeef net: phy: realtek: HWMON support for standalone versions of RTL8221B and RTL8251
HWMON support has been added for the RTL8221/8251 PHYs integrated together
with the MAC inside the RTL8125/8126 chips. This patch extends temperature
reading support for standalone variants of the mentioned PHYs.

I don't know whether the earlier revisions of the RTL8226 also have a
built-in temperature sensor, so they have been skipped for now.

Tested on RTL8221B-VB-CG.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 12:30:56 +00:00
Russell King (Oracle)
af10e092b7 net: phylink: always do a major config when attaching a SFP PHY
Background: https://lore.kernel.org/r/20250107123615.161095-1-ericwouds@gmail.com

Since adding negotiation of in-band capabilities, it is no longer
sufficient to just look at the MLO_AN_xxx mode and PHY interface to
decide whether to do a major configuration, since the result now
depends on the capabilities of the attaching PHY.

Always trigger a major configuration in this case.

Testing log: https://lore.kernel.org/r/f20c9744-3953-40e7-a9c9-5534b25d2e2a@gmail.com

Reported-by: Eric Woudstra <ericwouds@gmail.com>
Tested-by: Eric Woudstra <ericwouds@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 12:17:35 +00:00
Roger Quadros
4395a44acb net: ethernet: ti: am65-cpsw: fix freeing IRQ in am65_cpsw_nuss_remove_tx_chns()
When getting the IRQ we use k3_udma_glue_tx_get_irq() which returns
negative error value on error. So not NULL check is not sufficient
to deteremine if IRQ is valid. Check that IRQ is greater then zero
to ensure it is valid.

There is no issue at probe time but at runtime user can invoke
.set_channels which results in the following call chain.
am65_cpsw_set_channels()
 am65_cpsw_nuss_update_tx_rx_chns()
  am65_cpsw_nuss_remove_tx_chns()
  am65_cpsw_nuss_init_tx_chns()

At this point if am65_cpsw_nuss_init_tx_chns() fails due to
k3_udma_glue_tx_get_irq() then tx_chn->irq will be set to a
negative value.

Then, at subsequent .set_channels with higher channel count we
will attempt to free an invalid IRQ in am65_cpsw_nuss_remove_tx_chns()
leading to a kernel warning.

The issue is present in the original commit that introduced this driver,
although there, am65_cpsw_nuss_update_tx_rx_chns() existed as
am65_cpsw_nuss_update_tx_chns().

Fixes: 93a7653031 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 10:02:29 +00:00
Krzysztof Kozlowski
544c939406 dsa: Use str_enable_disable-like helpers
Replace ternary (condition ? "enable" : "disable") syntax with helpers
from string_choices.h because:
1. Simple function call with one argument is easier to read.  Ternary
   operator has three arguments and with wrapping might lead to quite
   long code.
2. Is slightly shorter thus also easier to read.
3. It brings uniformity in the text - same string.
4. Allows deduping by the linker, which results in a smaller binary
   file.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 09:23:25 +00:00
Jakub Kicinski
66cc61a25c Merge tag 'wireless-next-2025-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:

====================
wireless-next patches for v6.14

Most likely the last "new features" pull request for v6.14 and this is
a bigger one. Multi-Link Operation (MLO) work continues both in stack
in drivers. Few new devices supported and usual fixes all over.

Major changes:

cfg80211
 * Emergency Preparedness Communication Services (EPCS) station mode support

mac80211
 * an option to filter a sta from being flushed
 * some support for RX Operating Mode Indication (OMI) power saving
 * support for adding and removing station links for MLO

iwlwifi
 * new device ids
 * rework firmware error handling and restart

rtw88
 * RTL8812A: RFE type 2 support
 * LED support

rtw89
 * variant info to support RTL8922AE-VS

mt76
 * mt7996: single wiphy multiband support (preparation for MLO)
 * mt7996: support for more variants
 * mt792x: P2P_DEVICE support
 * mt7921u: TP-Link TXE50UH support

ath12k
 * enable MLO for QCN9274 (although it seems to be broken with dual
   band devices)
 * MLO radar detection support
 * debugfs: transmit buffer OFDMA, AST entry and puncture stats

* tag 'wireless-next-2025-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (322 commits)
  wifi: brcmfmac: fix NULL pointer dereference in brcmf_txfinalize()
  wifi: rtw88: add RTW88_LEDS depends on LEDS_CLASS to Kconfig
  wifi: wilc1000: unregister wiphy only after netdev registration
  wifi: cfg80211: adjust allocation of colocated AP data
  wifi: mac80211: fix memory leak in ieee80211_mgd_assoc_ml_reconf()
  wifi: ath12k: fix key cache handling
  wifi: ath12k: Fix uninitialized variable access in ath12k_mac_allocate() function
  wifi: ath12k: Remove ath12k_get_num_hw() helper function
  wifi: ath12k: Refactor the ath12k_hw get helper function argument
  wifi: ath12k: Refactor ath12k_hw set helper function argument
  wifi: mt76: mt7996: add implicit beamforming support for mt7992
  wifi: mt76: mt7996: fix beacon command during disabling
  wifi: mt76: mt7996: fix ldpc setting
  wifi: mt76: mt7996: fix definition of tx descriptor
  wifi: mt76: connac: adjust phy capabilities based on band constraints
  wifi: mt76: mt7996: fix incorrect indexing of MIB FW event
  wifi: mt76: mt7996: fix HE Phy capability
  wifi: mt76: mt7996: fix the capability of reception of EHT MU PPDU
  wifi: mt76: mt7996: add max mpdu len capability
  wifi: mt76: mt7921: avoid undesired changes of the preset regulatory domain
  ...
====================

Link: https://patch.msgid.link/20250117203529.72D45C4CEDD@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:46:54 -08:00
Heiner Kallweit
12d5151be0 net: phy: remove leftovers from switch to linkmode bitmaps
We have some leftovers from the switch to linkmode bitmaps which
- have never been used
- are not used any longer
- have no user outside phy_device.c
So remove them.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/5493b96e-88bb-4230-a911-322659ec5167@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:40:10 -08:00
Jakub Kicinski
17656eb5cf eth: bnxt: fix string truncation warning in FW version
W=1 builds with gcc 14.2.1 report:

drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:4193:32: error: ‘%s’ directive output may be truncated writing up to 31 bytes into a region of size 27 [-Werror=format-truncation=]
 4193 |                          "/pkg %s", buf);

It's upset that we let buf be full length but then we use 5
characters for "/pkg ".

The builds is also clear with clang version 19.1.5 now.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250117183726.1481524-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:32:45 -08:00
Shinas Rasheed
f840399395 octeon_ep_vf: update tx/rx stats locally for persistence
Update tx/rx stats locally, so that ndo_get_stats64()
can use that and not rely on per queue resources to obtain statistics.
The latter used to cause race conditions when the device stopped.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Link: https://patch.msgid.link/20250117094653.2588578-5-srasheed@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:26:39 -08:00
Shinas Rasheed
cc0e510cc8 octeon_ep_vf: remove firmware stats fetch in ndo_get_stats64
The firmware stats fetch call that happens in ndo_get_stats64()
is currently not required, and causes a warning to issue.

The corresponding warn log for the PF is given below:

[  123.316837] ------------[ cut here ]------------
[  123.316840] Voluntary context switch within RCU read-side critical section!
[  123.316917] pc : rcu_note_context_switch+0x2e4/0x300
[  123.316919] lr : rcu_note_context_switch+0x2e4/0x300
[  123.316947] Call trace:
[  123.316949]  rcu_note_context_switch+0x2e4/0x300
[  123.316952]  __schedule+0x84/0x584
[  123.316955]  schedule+0x38/0x90
[  123.316956]  schedule_timeout+0xa0/0x1d4
[  123.316959]  octep_send_mbox_req+0x190/0x230 [octeon_ep]
[  123.316966]  octep_ctrl_net_get_if_stats+0x78/0x100 [octeon_ep]
[  123.316970]  octep_get_stats64+0xd4/0xf0 [octeon_ep]
[  123.316975]  dev_get_stats+0x4c/0x114
[  123.316977]  dev_seq_printf_stats+0x3c/0x11c
[  123.316980]  dev_seq_show+0x1c/0x40
[  123.316982]  seq_read_iter+0x3cc/0x4e0
[  123.316985]  seq_read+0xc8/0x110
[  123.316987]  proc_reg_read+0x9c/0xec
[  123.316990]  vfs_read+0xc8/0x2ec
[  123.316993]  ksys_read+0x70/0x100
[  123.316995]  __arm64_sys_read+0x20/0x30
[  123.316997]  invoke_syscall.constprop.0+0x7c/0xd0
[  123.317000]  do_el0_svc+0xb4/0xd0
[  123.317002]  el0_svc+0xe8/0x1f4
[  123.317005]  el0t_64_sync_handler+0x134/0x150
[  123.317006]  el0t_64_sync+0x17c/0x180
[  123.317008] ---[ end trace 63399811432ab69b ]---

Fixes: c3fad23cdc ("octeon_ep_vf: add support for ndo ops")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Link: https://patch.msgid.link/20250117094653.2588578-4-srasheed@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:26:39 -08:00
Shinas Rasheed
10fad79846 octeon_ep: update tx/rx stats locally for persistence
Update tx/rx stats locally, so that ndo_get_stats64()
can use that and not rely on per queue resources to obtain statistics.
The latter used to cause race conditions when the device stopped.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Link: https://patch.msgid.link/20250117094653.2588578-3-srasheed@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:26:39 -08:00
Shinas Rasheed
1f64255bb7 octeon_ep: remove firmware stats fetch in ndo_get_stats64
The firmware stats fetch call that happens in ndo_get_stats64()
is currently not required, and causes a warning to issue.

The warn log is given below:

[  123.316837] ------------[ cut here ]------------
[  123.316840] Voluntary context switch within RCU read-side critical section!
[  123.316917] pc : rcu_note_context_switch+0x2e4/0x300
[  123.316919] lr : rcu_note_context_switch+0x2e4/0x300
[  123.316947] Call trace:
[  123.316949]  rcu_note_context_switch+0x2e4/0x300
[  123.316952]  __schedule+0x84/0x584
[  123.316955]  schedule+0x38/0x90
[  123.316956]  schedule_timeout+0xa0/0x1d4
[  123.316959]  octep_send_mbox_req+0x190/0x230 [octeon_ep]
[  123.316966]  octep_ctrl_net_get_if_stats+0x78/0x100 [octeon_ep]
[  123.316970]  octep_get_stats64+0xd4/0xf0 [octeon_ep]
[  123.316975]  dev_get_stats+0x4c/0x114
[  123.316977]  dev_seq_printf_stats+0x3c/0x11c
[  123.316980]  dev_seq_show+0x1c/0x40
[  123.316982]  seq_read_iter+0x3cc/0x4e0
[  123.316985]  seq_read+0xc8/0x110
[  123.316987]  proc_reg_read+0x9c/0xec
[  123.316990]  vfs_read+0xc8/0x2ec
[  123.316993]  ksys_read+0x70/0x100
[  123.316995]  __arm64_sys_read+0x20/0x30
[  123.316997]  invoke_syscall.constprop.0+0x7c/0xd0
[  123.317000]  do_el0_svc+0xb4/0xd0
[  123.317002]  el0_svc+0xe8/0x1f4
[  123.317005]  el0t_64_sync_handler+0x134/0x150
[  123.317006]  el0t_64_sync+0x17c/0x180
[  123.317008] ---[ end trace 63399811432ab69b ]---

Fixes: 6a610a46ba ("octeon_ep: add support for ndo ops")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Link: https://patch.msgid.link/20250117094653.2588578-2-srasheed@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:26:38 -08:00
Sean Anderson
9d301a53a5 net: xilinx: axienet: Report an error for bad coalesce settings
Instead of silently ignoring invalid/unsupported settings, report an
error. Additionally, relax the check for non-zero usecs to apply only
when it will be used (i.e. when frames != 1).

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20250116232954.2696930-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:18:53 -08:00
Sean Anderson
5cff9d1756 net: xilinx: axienet: Add some symbolic constants for IRQ delay timer
Instead of using literals, add some symbolic constants for the IRQ delay
timer calculation.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20250116232954.2696930-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:18:53 -08:00
Philipp Stanner
dfa2f4d5f9 PCI: Remove devres from pci_intx()
pci_intx() is a hybrid function which can sometimes be managed through
devres. This hybrid nature is undesirable.

Since all users of pci_intx() have by now been ported either to
always-managed pcim_intx() or never-managed pci_intx_unmanaged(), the
devres functionality can be removed from pci_intx().

Consequently, pci_intx_unmanaged() is now redundant, because pci_intx()
itself is now unmanaged.

Remove the devres functionality from pci_intx(). Have all users of
pci_intx_unmanaged() call pci_intx(). Remove pci_intx_unmanaged().

Link: https://lore.kernel.org/r/20241209130632.132074-13-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
2025-01-18 14:38:49 -06:00
Philipp Stanner
41400bc533 net/ethernet: Use never-managed version of pci_intx()
pci_intx() is a hybrid function which can sometimes be managed through
devres. To remove this hybrid nature from pci_intx(), it is necessary to
port users to either an always-managed or a never-managed version.

broadcom/bnx2x and brocade/bna enable their PCI devices with
pci_enable_device(). Thus, they need the never-managed version.

Replace pci_intx() with pci_intx_unmanaged().

Link: https://lore.kernel.org/r/20241209130632.132074-5-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
2025-01-18 14:38:49 -06:00
Philipp Stanner
71cf28c649 wifi: qtnfmac: use always-managed version of pcim_intx()
pci_intx() is a hybrid function which can sometimes be managed through
devres. To remove this hybrid nature from pci_intx(), it is necessary to
port users to either an always-managed or a never-managed version.

qtnfmac enables its PCI device with pcim_enable_device(). Thus, it needs
the always-managed version.

Replace pci_intx() with pcim_intx().

Link: https://lore.kernel.org/r/20241209130632.132074-11-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
2025-01-18 14:38:49 -06:00
Vladimir Oltean
e777a4b39b net: dsa: felix: report timestamping stats from the ocelot library
Make the linkage between the DSA user port ethtool_ops :: get_ts_info
and the implementation from the Ocelot switch library.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250116104628.123555-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 20:01:10 -08:00
Vladimir Oltean
8fbd24f3d1 net: mscc: ocelot: add TX timestamping statistics
Add an u64 hardware timestamping statistics structure for each ocelot
port. Export a function from the common switch library for reporting
them to ethtool. This is called by the ocelot switchdev front-end for
now.

Note that for the switchdev driver, we report the one-step PTP packets
as unconfirmed, even though in principle, for some transmission
mechanisms like FDMA, we may be able to confirm transmission and bump
the "pkts" counter in ocelot_fdma_tx_cleanup() instead. I don't have
access to hardware which uses the switchdev front-end, and I've kept the
implementation simple.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250116104628.123555-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 20:01:10 -08:00
Amit Cohen
448269fa05 mlxsw: Do not store Tx header length as driver parameter
Tx header handling was moved to PCI code, as there is no several drivers
which configure Tx header differently. Tx header length is stored as driver
parameter, this is not really necessary as it always stores the same value.
Remove this field and use the macro MLXSW_TXHDR_LEN explicitly.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/1fb7b3f007de4d311e559c8a954b673d0895d5e9.1737044384.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:58:00 -08:00
Amit Cohen
6ce1aac748 mlxsw: Move Tx header handling to PCI driver
Tx header should be added to all packets transmitted from the CPU to
Spectrum ASICs. Historically, handling this header was added as a driver
function, as Tx header is different between Spectrum and Switch-X. See
SwitchX implementation in commit 31557f0f97 ("mlxsw: Introduce
Mellanox SwitchX-2 ASIC support"). From May 2021, there is no support
for SwitchX-2 ASIC, and all the relevant code was removed.

For now, there is no justification to handle Tx header as part of
spectrum.c, we can handle this as part of PCI, in skb_transmit().

A future patch set will add support for XDP in mlxsw driver, to support
XDP_TX and XDP_REDIRECT actions, Tx header should be added before
transmitting the packet. As preparation for this, move Tx header handling
to PCI driver, so then XDP code will not have to call API from spectrum.c.
This also improves the code as now Tx header is pushed just before
transmitting, so it is not done from many flows which might miss something.

Note that for PTP, we should configure Tx header differently, use the
fields from mlxsw_txhdr_info to configure the packets correctly in PCI
driver. Handle VLAN tagging in switch driver, verify that packet which
should be transmitted as data is tagged, otherwise, tag it.

Remove the calls for thxdr_construct() functions, as now this is done as
part of skb_transmit().

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/293a81e6f7d59a8ec9f9592edb7745536649ff11.1737044384.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:58:00 -08:00
Amit Cohen
c89d9c3d0a mlxsw: Define Tx header fields in txheader.h
The next patch will move Tx header constructing to pci.c. As preparation,
move the definitions of Tx header fields from spectrum.c to txheader.h,
so pci.c will include this header and can access the fields.

Remove 'etclass' which is not used.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/2250b5cb3998ab4850fc8251c3a0f5926d32e194.1737044384.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:58:00 -08:00
Amit Cohen
e8e08279d3 mlxsw: Initialize txhdr_info according to PTP operations
A next patch will construct Tx header as part of pci.c. The switch driver
(mlxsw_spectrum.ko) should encapsulate all the differences between the
different ASICs and the bus driver (mlxsw_pci.ko) should remain unaware.

As preparation, add the relevant info as part of mlxsw_txhdr_info
structure, so later bus driver will merely construct the Tx header based on
information passed from the switch driver.

Most of the packets are transmitted as control packets, but PTP packets in
Spectrum-2 and Spectrum-3 should be handled differently. The driver
transmits them as data packets, and the default VLAN tag (4095) is added if
the packet is not already tagged.

Extend PTP operations to store a boolean which indicates whether packets
should be transmitted as data packets. Set it for Spectrum-2 and
Spectrum-3 only. Extend mlxsw_txhdr_info to store fields which will be used
later to construct Tx header. Initialize such fields according to the new
boolean which is stored in PTP operations.

Note that for now, mlxsw_txhdr_info structure is initialized, but not used,
a next patch will use it.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/efcaacd4bedef524e840a0c29f96cebf2c4bc0e0.1737044384.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:57:59 -08:00
Amit Cohen
3498566555 mlxsw: Add mlxsw_txhdr_info structure
mlxsw_tx_info structure is used to store information that is needed to
process Tx completions when Tx time stamps are requested. A next patch
will move Tx header handling from spectrum.c to pci.c. For that, some
additional fields which are related to Tx should be passed to pci driver.

As preparation, create an extended structure, called mlxsw_txhdr_info,
and store mlxsw_tx_info inside. The new fields should not be added to
mlxsw_tx_info structure as it is stored in the SKB control block which is
of limited size.

The next patch will extend the new structure with some fields which are
needed in order to construct Tx header.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/93aed1961f046f79f46869bab37a3faa5027751d.1737044384.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:57:59 -08:00
Colin Ian King
41c5d104f3 net/mlx5: fix unintentional sign extension on shift of dest_attr->vport.vhca_id
Shifting dest_attr->vport.vhca_id << 16 results in a promotion from an
unsigned 16 bit integer to a 32 bit signed integer, this is then sign
extended to a 64 bit unsigned long on 64 bitarchitectures. If vhca_id is
greater than 0x7fff then this leads to a sign extended result where all
the upper 32 bits of idx are set to 1. Fix this by casting vhca_id
to the same type as idx before performing the shift.

Fixes: 8e2e08a6d1 ("net/mlx5: fs, add support for dest vport HWS action")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Moshe Shemesh <moshe@nvidia.com>
Link: https://patch.msgid.link/20250116181700.96437-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:52:34 -08:00
Jakub Kicinski
ba0209bd18 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
ice: support FW Recovery Mode

Konrad Knitter says:

Enable update of card in FW Recovery Mode

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: support FW Recovery Mode
  devlink: add devl guard
  pldmfw: enable selected component update
====================

Link: https://patch.msgid.link/20250116212059.1254349-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:45:35 -08:00
Thorsten Blum
3df22e7510 hv_netvsc: Replace one-element array with flexible array member
Replace the deprecated one-element array with a modern flexible array
member in the struct nvsp_1_message_send_receive_buffer_complete.

Use struct_size_t(,,1) instead of sizeof() to maintain the same size.

Compile-tested only.

Link: https://github.com/KSPP/linux/issues/79
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Tested-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Roman Kisel <romank@linux.microsoft.com>
Link: https://patch.msgid.link/20250116211932.139564-2-thorsten.blum@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:07:48 -08:00
Guillaume Nault
2ce7289f18 gtp: Prepare ip4_route_output_gtp() to .flowi4_tos conversion.
Use inet_sk_dscp() to get the socket DSCP value as dscp_t, instead of
ip_sock_rt_tos() which returns a __u8. This will ease the conversion
of fl4->flowi4_tos to dscp_t, which now just becomes a matter of
dropping the inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/06bdb310a075355ff059cd32da2efc29a63981c9.1737034675.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:01:33 -08:00
Maher Sanalla
70d81f25cc net/mlxfw: Drop hard coded max FW flash image size
Currently, mlxfw kernel module limits FW flash image size to be
10MB at most, preventing the ability to burn recent BlueField-3
FW that exceeds the said size limit.

Thus, drop the hard coded limit. Instead, rely on FW's
max_component_size threshold that is reported in MCQI register
as the size limit for FW image.

Fixes: 410ed13cae ("Add the mlxfw module for Mellanox firmware flash process")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/1737030796-1441634-1-git-send-email-moshe@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 19:01:07 -08:00
Daniel Golle
d3eb585498 net: phy: realtek: always clear NBase-T lpa
Clear NBase-T link partner advertisement before calling
rtlgen_read_status() to avoid phy_resolve_aneg_linkmode() wrongly
setting speed and duplex.

This fixes bogus 2.5G/5G/10G link partner advertisement and thus
speed and duplex being set by phy_resolve_aneg_linkmode() due to stale
NBase-T lpa.

Fixes: 68d5cd09e8 ("net: phy: realtek: change order of calls in C22 read_status()")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-17 12:01:40 +00:00
Daniel Golle
ea8318cb33 net: phy: realtek: clear master_slave_state if link is down
rtlgen_decode_physr() which sets master_slave_state isn't called in case
the link is down and other than rtlgen_read_status(),
rtl822x_c45_read_status() doesn't implicitely clear master_slave_state.

Avoid stale master_slave_state by always setting it to
MASTER_SLAVE_STATE_UNKNOWN in rtl822x_c45_read_status() in case the link
is down.

Fixes: 081c9c0265 ("net: phy: realtek: read duplex and gbit master from PHYSR register")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-17 12:01:40 +00:00
Daniel Golle
34d5a86ff7 net: phy: realtek: clear 1000Base-T lpa if link is down
Only read 1000Base-T link partner advertisement if autonegotiation has
completed and otherwise 1000Base-T link partner advertisement bits.

This fixes bogus 1000Base-T link partner advertisement after link goes
down (eg. by disconnecting the wire).
Fixes: 5cb409b396 ("net: phy: realtek: clear 1000Base-T link partner advertisement")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-17 12:01:40 +00:00
Divya Koppera
93359197f2 net: phy: microchip_rds_ptp : Add PEROUT feature library for RDS PTP supported Microchip phys
Adds PEROUT feature for RDS PTP supported phys where
we can generate periodic output signal on supported
pin out

Signed-off-by: Divya Koppera <divya.koppera@microchip.com>
Link: https://patch.msgid.link/20250115090634.12941-4-divya.koppera@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:27:57 -08:00
Divya Koppera
8541fc12ed net: phy: microchip_t1: Enable pin out specific to lan887x phy for PEROUT signal
Adds support for enabling pin out that is required
to generate periodic output signal on lan887x phy.

Signed-off-by: Divya Koppera <divya.koppera@microchip.com>
Link: https://patch.msgid.link/20250115090634.12941-3-divya.koppera@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:27:56 -08:00
Divya Koppera
bf356a6df7 net: phy: microchip_rds_ptp: Header file library changes for PEROUT
This ptp header file library changes will cover PEROUT
macros that are required to generate periodic output
from pin out

Signed-off-by: Divya Koppera <divya.koppera@microchip.com>
Link: https://patch.msgid.link/20250115090634.12941-2-divya.koppera@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:27:56 -08:00
Russell King (Oracle)
4218647d45 net: stmmac: convert to phylink managed EEE support
Convert stmmac to use phylink managed EEE support rather than delving
into phylib:

1. Move the stmmac_eee_init() calls out of mac_link_down() and
   mac_link_up() methods into the new mac_{enable,disable}_lpi()
   methods. We leave the calls to stmmac_set_eee_pls() in place as
   these change bits which tell the EEE hardware when the link came
   up or down, and is used for a separate hardware timer. However,
   symmetrically conditionalise this with priv->dma_cap.eee.

2. Update the current LPI timer each time LPI is enabled - which we
   need for software-timed LPI.

3. With phylink managed EEE, phylink manages the receive clock stop
   configuration via phylink_config.eee_rx_clk_stop_enable. Set this
   appropriately which makes the call to phy_eee_rx_clock_stop()
   redundant.

4. From what I can work out, all supported interfaces support LPI
   signalling on stmmac (there's no restriction implemented.) It
   also appears to support LPI at all full duplex speeds at or over
   100M. Set these capabilities.

5. The default timer appears to be derived from a module parameter.
   Set this the same, although we keep code that reconfigures the
   timer in stmmac_init_phy().

6. Remove the direct call to phy_support_eee(), which phylink will do
   on the drivers behalf if phylink_config.eee_enabled_default is set.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYAEG-0014QH-9O@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:23:00 -08:00
Russell King (Oracle)
bd691d5ca9 net: lan743x: convert to phylink managed EEE
Convert lan743x to phylink managed EEE:
- Set the lpi_capabilties.
- Move the call to lan743x_mac_eee_enable() into the enable/disable
  tx_lpi functions.
- Ensure that EEEEN is clear during probe.
- Move the setting of the LPI timer into mac_enable_tx_lpi().
- Move reading of LPI timer to phylink initialisation to set the
  default timer value.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYAEB-0014QB-4s@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:23:00 -08:00
Russell King (Oracle)
a66447966f net: lan743x: use netdev in lan743x_phylink_mac_link_down()
Use the netdev that we already have in lan743x_phylink_mac_link_down().

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYAE5-0014Q5-Up@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:23:00 -08:00
Russell King (Oracle)
b53b14786e net: mvpp2: add EEE implementation
Add EEE support for mvpp2, using phylink's EEE implementation, which
means we just need to implement the two methods for LPI control, and
with the initial configuration. Only SGMII mode is supported, so only
100M and 1G speeds.

Disabling LPI requires clearing a single bit. Enabling LPI needs a full
configuration of several values, as the timer values are dependent on
the MAC operating speed.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYAE0-0014Pz-R9@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:23:00 -08:00
Russell King (Oracle)
ac79927dc8 net: mvneta: convert to phylink EEE implementation
Convert mvneta to use phylink's EEE implementation by implementing the
two LPI control methods, and adding the initial configuration and
capabilities.

Although disabling LPI requires clearing a single bit, for safety we
clear the manual mode and force bits to ensure that auto mode will be
used.

Enabling LPI needs a full configuration of several values, as the timer
values are dependent on the MAC operating speed, as per the original
code.

As Armada 388 states that EEE is only supported in "SGMII" modes, mark
this in lpi_interfaces. Testing with RGMII on the Clearfog platform
indicates that the receive path fails to detect LPI over RGMII.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYADv-0014Pt-NO@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:22:59 -08:00
Russell King (Oracle)
03abf2a7c6 net: phylink: add EEE management
Add EEE management to phylink, making use of the phylib implementation.
This will only be used where a MAC driver populates the methods and
capabilities bitfield, otherwise we keep our old behaviour.

Phylink will keep track of the EEE configuration, including the clock
stop abilities at each end of the MAC to PHY link, programming the PHY
appropriately and preserving the LPI configuration should the PHY go
away.

Phylink will call into the MAC driver when LPI needs to be enabled or
disabled, with the requirement that the MAC have LPI disabled prior
to the netdev being brought up (in other words, it will only call
mac_disable_tx_lpi() if it has already called mac_enable_tx_lpi().)

Support for phylink managed EEE is enabled by populating both tx_lpi
MAC operations method pointers, and filling in both LPI interfaces
and capabilities. If the methods are provided but the LPI interfaces
or capabilities remain empty, this indicates to phylink that EEE is
implemented by the driver but the hardware it is driving does not
support EEE, and thus the ethtool set_eee() and get_eee() methods will
return EOPNOTSUPP.

No validation of the LPI timer value is performed by this patch.

For interface modes which do not support LPI, we make no attempt to
manipulate the phylib EEE advertisement, but instead refuse to
activate LPI at the MAC, noting it at debug message level.

We also restrict the advertisement and reported userspace support
linkmode masks according to the lpi_capabilities provided to
phylink by the MAC driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYADq-0014Pn-J1@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:22:59 -08:00
Russell King (Oracle)
a17ceec62f net: phylink: add phylink_link_is_up() helper
Add a helper to determine whether the link is up or down. Currently
this is only used in one location, but becomes necessary to test
when reconfiguring EEE.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYADl-0014Ph-EV@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:22:59 -08:00
Russell King (Oracle)
a00e0d34c0 net: phy: add support for querying PHY clock stop capability
Add support for querying whether the PHY allows the transmit xMII clock
to be stopped while in LPI mode. This will be used by phylink to pass
to the MAC driver so it can configure the generation of the xMII clock
appropriately.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E1tYADg-0014Pb-AJ@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-16 17:22:59 -08:00
Konrad Knitter
1de25c6b98 ice: support FW Recovery Mode
Recovery Mode is intended to recover from a fatal failure scenario in
which the device is not accessible to the host, meaning the firmware is
non-responsive.

The purpose of the Firmware Recovery Mode is to enable software tools to
update firmware and/or device configuration so the fatal error can be
resolved.

Recovery Mode Firmware supports a limited set of admin commands required
for NVM update.
Recovery Firmware does not support hardware interrupts so a polling mode
is used.

The driver will expose only the minimum set of devlink commands required
for the recovery of the adapter.

Using an appropriate NVM image, the user can recover the adapter using
the devlink flash API.

Prior to 4.20 E810 Adapter Recovery Firmware supports only the update
and erase of the "fw.mgmt" component.

E810 Adapter Recovery Firmware doesn't support selected preservation of
cards settings or identifiers.

The following command can be used to recover the adapter:

$ devlink dev flash <pci-address> <update-image.bin> component fw.mgmt
  overwrite settings overwrite identifier

Newer FW versions (4.20 or newer) supports update of "fw.undi" and
"fw.netlist" components.

$ devlink dev flash <pci-address> <update-image.bin>

Tested on Intel Corporation Ethernet Controller E810-C for SFP
FW revision 3.20 and 4.30.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Konrad Knitter <konrad.knitter@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-16 13:05:06 -08:00