Commit Graph

1427959 Commits

Author SHA1 Message Date
Johan Hovold
f4ac0cc88e net: usb: lan78xx: drop redundant device reference
Driver core holds a reference to the USB interface and its parent USB
device while the interface is bound to a driver and there is no need to
take additional references unless the structures are needed after
disconnect.

Drop the redundant device reference to reduce cargo culting, make it
easier to spot drivers where an extra reference is needed, and reduce
the risk of memory leaks when drivers fail to release it.

Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260305105006.16415-1-johan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 19:19:38 -08:00
Jakub Kicinski
f67ab9d810 Merge branch 'net-ntb_netdev-add-multi-queue-support'
Koichiro Den says:

====================
net: ntb_netdev: Add Multi-queue support

ntb_netdev currently hard-codes a single NTB transport queue pair, which
means the datapath effectively runs as a single-queue netdev regardless
of available CPUs / parallel flows.

The longer-term motivation here is throughput scale-out: allow
ntb_netdev to grow beyond the single-QP bottleneck and make it possible
to spread TX/RX work across multiple queue pairs as link speeds and core
counts keep increasing.

Multi-queue also unlocks the standard networking knobs on top of it. In
particular, once the device exposes multiple TX queues, qdisc/tc can
steer flows/traffic classes into different queues (via
skb->queue_mapping), enabling per-flow/per-class scheduling and QoS in a
familiar way.

Usage
=====

  1. Ensure the NTB device you want to use has multiple Memory Windows.
  2. modprobe ntb_transport on both sides, if it's not built-in.
  3. modprobe ntb_netdev on both sides, if it's not built-in.
  4. Use ethtool -L to configure the desired number of queues.
     The default number of real (combined) queues is 1.

     e.g. ethtool -L eth0 combined 2 # to increase
          ethtool -L eth0 combined 1 # to reduce back to 1

  Note:
    * If the NTB device has only a single Memory Window, ethtool -L eth0
      combined N (N > 1) fails with:
      "netlink error: No space left on device".
    * ethtool -L can be executed while the net_device is up.

Compatibility
=============

  The default remains a single queue, so behavior is unchanged unless
  the user explicitly increases the number of queues.

Kernel base
===========

  ntb-next (latest as of 2026-03-06):
  commit 7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link
                        disable path")

Testing / Results
=================

  Environment / command line:
    - 2x R-Car S4 Spider boards
      "Kernel base" (see above) + this series

  TCP:
    [RC] $ sudo iperf3 -s
    [EP] $ sudo iperf3 -Z -c ${SERVER_IP} -l 65480 -w 512M -P 4
  UDP:
    [RC] $ sudo iperf3 -s
    [EP] $ sudo iperf3 -ub0 -c ${SERVER_IP} -l 65480 -w 512M -P 4

  Without this series:
      TCP / UDP : 589 Mbps / 580 Mbps

  With this series (default single queue):
      TCP / UDP : 583 Mbps / 583 Mbps

  With this series + `ethtool -L eth0 combined 2`:
      TCP / UDP : 576 Mbps / 584 Mbps

  With this series + `ethtool -L eth0 combined 2` + [1], where flows are
  properly distributed across queues:
      TCP / UDP : 1.13 Gbps / 1.16 Gbps (re-measured with v3)

  The 575~590 Mbps variation is run-to-run variance i.e. no measurable
  regression or improvement is observed with a single queue. The key
  point is scaling from ~600 Mbps to ~1.20 Gbps once flows are
  distributed across multiple queues.

  Note: On R-Car S4 Spider, only BAR2 is usable for ntb_transport MW.
  For testing, BAR2 was expanded from 1 MiB to 2 MiB and split into two
  Memory Windows. A follow-up series is planned to add split BAR support
  for vNTB. On platforms where multiple BARs can be used for the
  datapath, this series should allow >=2 queues without additional
  changes.

  [1] [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
      https://lore.kernel.org/linux-pci/20260227084955.3184017-1-den@valinux.co.jp/
      (subject was accidentally incorrect in the original posting)
====================

Link: https://patch.msgid.link/20260305155639.1885517-1-den@valinux.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 19:15:24 -08:00
Koichiro Den
24d9e73c7e net: ntb_netdev: Support ethtool channels for multi-queue
Support dynamic queue pair addition/removal via ethtool channels.
Use the combined channel count to control the number of netdev TX/RX
queues, each corresponding to a ntb_transport queue pair.

When the number of queues is reduced, tear down and free the removed
ntb_transport queue pairs (not just deactivate them) so other
ntb_transport clients can reuse the freed resources.

When the number of queues is increased, create additional queue pairs up
to NTB_NETDEV_MAX_QUEUES (=64). The effective limit is determined by the
underlying ntb_transport implementation and NTB hardware resources (the
number of MWs), so set_channels may return -ENOSPC if no more QPs can be
allocated.

Keep the default at one queue pair to preserve the previous behavior.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
Link: https://patch.msgid.link/20260305155639.1885517-5-den@valinux.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 19:15:20 -08:00
Koichiro Den
b83bf617dc net: ntb_netdev: Factor out multi-queue helpers
Implementing .set_channels will otherwise duplicate the same multi-queue
operations at multiple call sites. Factor out the following helpers:

  - ntb_netdev_update_carrier(): carrier is switched on when at least
                                 one QP link is up
  - ntb_netdev_queue_rx_drain(): drain and free all queued RX packets
                                 for one QP
  - ntb_netdev_queue_rx_fill():  prefill RX ring for one QP

No functional change.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
Link: https://patch.msgid.link/20260305155639.1885517-4-den@valinux.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 19:15:20 -08:00
Koichiro Den
304132b7a5 net: ntb_netdev: Gate subqueue stop/wake by transport link
When ntb_netdev is extended to multiple ntb_transport queue pairs, the
netdev carrier can be up as long as at least one QP link is up. In that
setup, a given QP may be link-down while the carrier remains on.

Make the link event handler start/stop the corresponding netdev TX
subqueue and drive carrier state based on whether any QP link is up.
Also guard subqueue wake/start points in the TX completion and timer
paths so a subqueue is not restarted while its QP link is down.

Stop all queues in ndo_open() and let the link event handler wake each
subqueue once ntb_transport link negotiation succeeds.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
Link: https://patch.msgid.link/20260305155639.1885517-3-den@valinux.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 19:15:20 -08:00
Koichiro Den
ee970634c7 net: ntb_netdev: Introduce per-queue context
Prepare ntb_netdev for multi-queue operation by moving queue-pair state
out of struct ntb_netdev.

Introduce struct ntb_netdev_queue to carry the ntb_transport_qp pointer,
the per-QP TX timer and queue id. Pass this object as the callback
context and convert the RX/TX handlers and link event path accordingly.

The probe path allocates a fixed upper bound for netdev queues while
instantiating only a single ntb_transport queue pair, preserving the
previous behavior. Also store client_dev for future queue pair
creation/removal via the ntb_transport API.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
Link: https://patch.msgid.link/20260305155639.1885517-2-den@valinux.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 19:15:20 -08:00
Vivian Wang
70eba59f92 net: spacemit: Remove unused buff_addr fields
These were never used. Just remove them.

No functional change intended.

Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Link: https://patch.msgid.link/20260305-k1-ethernet-cleanup-buff_addr-v1-1-e978ef119231@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 18:58:15 -08:00
Jakub Kicinski
00e3228702 Merge branch 'nfc-drop-redundant-usb-device-references'
Johan Hovold says:

====================
nfc: drop redundant USB device references

Driver core holds a reference to the USB interface and its parent USB
device while the interface is bound to a driver and there is no need to
take additional references unless the structures are needed after
disconnect.

Drop redundant device references to reduce cargo culting, make it easier
to spot drivers where an extra reference is needed, and reduce the risk
of memory leaks when drivers fail to release them.
====================

Link: https://patch.msgid.link/20260305111019.18030-1-johan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 18:57:47 -08:00
Johan Hovold
1a2d4bfa04 nfc: port100: drop redundant device reference
Driver core holds a reference to the USB interface and its parent USB
device while the interface is bound to a driver and there is no need to
take additional references unless the structures are needed after
disconnect.

Drop the redundant device reference to reduce cargo culting, make it
easier to spot drivers where an extra reference is needed, and reduce
the risk of memory leaks when drivers fail to release it.

Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260305111019.18030-3-johan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 18:57:42 -08:00
Johan Hovold
6c7f710325 nfc: pn533: drop redundant device reference
Driver core holds a reference to the USB interface and its parent USB
device while the interface is bound to a driver and there is no need to
take additional references unless the structures are needed after
disconnect.

Drop the redundant device reference to reduce cargo culting, make it
easier to spot drivers where an extra reference is needed, and reduce
the risk of memory leaks when drivers fail to release it.

Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260305111019.18030-2-johan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 18:57:42 -08:00
Bui Quang Minh
e3f8800aa2 virtio-net: xsk: Support wakeup on RX side
When XDP_USE_NEED_WAKEUP is used and the fill ring is empty so no buffer
is allocated on RX side, allow RX NAPI to be descheduled. This avoids
wasting CPU cycles on polling. Users will be notified and they need to
make a wakeup call after refilling the ring.

Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Link: https://patch.msgid.link/20260304154317.7506-1-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 17:38:23 -08:00
Aleksei Oladko
21c0dc7cdd selftests: net: forwarding: fix IPv6 address leak in cleanup
Several forwarding tests (e.g., gre_multipath.sh) initialize both IPv4
and IPv6 addresses using simple_if_init, but only clean up IPv4
in simple_if_fini. This leaves stale IPv6 addresses on the interfaces,
which causes subsequent tests to fail when they encounter unexpected
address configuration.

The issue can be reproduced by running tests in sequence:
  # run_kselftest.sh -t net/forwarding:ipip_hier_gre.sh
  # run_kselftest.sh -t net/forwarding:min_max_mtu.sh
  TAP version 13
  1..1
  # timeout set to 0
  # selftests: net/forwarding: min_max_mtu.sh
  # TEST: ping                                                          [ OK ]
  # TEST: ping6                                                         [ OK ]
  # TEST: Test maximum MTU configuration                                [ OK ]
  # TEST: Test traffic, packet size is maximum MTU                      [FAIL]
  #       Ping6, packet size: 65487 succeeded, but should have failed
  # TEST: Test minimum MTU configuration                                [ OK ]
  # TEST: Test traffic, packet size is minimum MTU                      [ OK ]
  not ok 1 selftests: net/forwarding: min_max_mtu.sh # exit=1

Fix this by removing the unused IPv6 argument from simple_if_init in
tests that don't use IPv6 (gre_multipath.sh, ipip_lib.sh), and by
adding the missing IPv6 argument to simple_if_fini in tests that
use IPv6 (gre_multipath_nh.sh, gre_multipath_nh_res.sh).

Signed-off-by: Aleksei Oladko <aleksey.oladko@virtuozzo.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260305211000.515301-1-aleksey.oladko@virtuozzo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 17:15:33 -08:00
Jiayuan Chen
ad4c955960 net: annotate data races around sk->sk_prot
inet_sendmsg() and inet_recvmsg() access sk->sk_prot without
lock_sock() or any other synchronization.

sock_replace_proto() (used by sockmap), TLS and MPTCP can change
sk->sk_prot under us, so these functions need READ_ONCE() to avoid
load tearing.

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260304064253.16955-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 16:55:36 -08:00
Heiner Kallweit
260d27b3ae net: phy: remove phy_attach
378e6523eb ("net: bcmgenet: remove unused platform code") removed
the last user of phy_attach(). So remove this function.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/8812176a-e319-4e9f-815d-99ea339df8b2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 16:47:56 -08:00
Eric Dumazet
c698f5cc94 inet_diag: report delayed ack timer information
inet_sk_diag_fill() populates r->idiag_timer with the following
precedence order:

1 - Retransmit timer.
4 - Probe0 timer.
2 - Keepalive timer.

This patch adds a new value, last in the list, if other timers
are not active.

5 - Delayed ACK timer.

A corresponding iproute2 patch will follow to replace "unknown"
with "delack":

ESTAB 10     0   [2002:a05:6830:1f86::]:12875 [2002:a05:6830:1f85::]:50438

    timer:(unknown,003ms,0) ino:152178 sk:3004 cgroup:unreachable:189 <->

    skmem:(r1344,rb12780520,t0,tb262144,f2752,w0,o250,bl0,d0) ts usec_ts
    ...

Also add the following enum in uapi/linux/inet_diag.h
as suggested by David Ahern.

enum {
	IDIAG_TIMER_OFF,
	IDIAG_TIMER_ON,
	IDIAG_TIMER_KEEPALIVE,
	IDIAG_TIMER_TIMEWAIT,
	IDIAG_TIMER_PROBE0,
	IDIAG_TIMER_DELACK,
};

Neal Cardwell suggested to test for ICSK_ACK_TIMER:
inet_csk_clear_xmit_timer() does not call sk_stop_timer()
because INET_CSK_CLEAR_TIMERS is unset.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260305114829.2163276-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 16:32:26 -08:00
Jakub Kicinski
db29970799 Merge branch 'net-stmmac-mdio-related-cleanups'
Russell King says:

====================
net: stmmac: mdio related cleanups

The first four patches clean up the MDC clock divisor selection code,
turning the three different ways we choose a divisor into tabular form,
rather than doing the selection purely in code.

Convert MDIO to use field_prep() which allows a non-constant mask to be
used when preparing fields.

Then use u32 and the associated typed GENMASK for MDIO register field
definitions.

Finally, an extra couple of patches that use appropriate types in
struct mdio_bus_data.
====================

Link: https://patch.msgid.link/aald--qJquWGIvmO@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:13 -08:00
Russell King (Oracle)
e4fd855c52 net: stmmac: make pcs_mask and phy_mask u32
The PCS and PHY masks are passed to the mdio bus layer as phy_mask
to prevent bus addresses between 0 and 31 inclusive being scanned,
and this is declared as u32. Also declare these as u32 in stmmac
for type consistency.

Since this is a u32, use BIT_U32() rather than BIT() to generate
values for these fields.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy6AY-0000000BtxJ-3smT@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:10 -08:00
Russell King (Oracle)
3cd963fa91 net: stmmac: mdio_bus_data->default_an_inband is boolean
default_an_inband is declared as an unsigned int, but is set to true/
false and is assigned to phylink_config's member of the same name
which is a bool. Declare this also as a bool for consistency.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy6AT-0000000BtxD-2qm7@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:10 -08:00
Russell King (Oracle)
a64d927aec net: stmmac: use GENMASK_U32() for mdio bitfields
Rather than using hex numbers, use GENMASK() for mdio bitfields.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy6AO-0000000Btx7-2NDV@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:10 -08:00
Russell King (Oracle)
df388b4d39 net: stmmac: use u32 for MDIO register field masks
MDIO registers are 32-bit, so use u32 to describe the masks for these
registers. Convert the GENMASK() initialisers to GENMASK_U32() for
type compatibility.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy6AJ-0000000Btx1-1teC@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:09 -08:00
Russell King (Oracle)
58bd003900 net: stmmac: mdio: convert field prep to use field_prep()
Convert the MDIO field preparation to use field_prep(), which removes
the need to store separate mask and shifts. Also convert the clk_csr
value using __ffs() to do the shift as we need to detect overflows
for this.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy6AE-0000000Btwv-1LM4@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:09 -08:00
Russell King (Oracle)
506f78f43c net: stmmac: mdio: simplify MDC clock divisor lookup
As each lookup now iterates over each table in the same way, simplfy
the code to select the table, and then walk that table.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy6A9-0000000Btwp-0lxY@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:09 -08:00
Russell King (Oracle)
b6687ef976 net: stmmac: mdio: use same test for MDC clock divisor lookups
Use the same frequency test for all clk_csr value lookups (clock
rate > table rate). This has the side effect that the standard rate
table results in the divider being used for the maximum frequency
for the divider rather than the next higher divider. This still
allows MDC to meet the IEE 802.3 specification, but at a rate closer
to 2.5MHz for these frequencies.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy6A4-0000000Btwj-0ATB@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:09 -08:00
Russell King (Oracle)
4c7e0e0818 net: stmmac: mdio: convert MDC clock divisor selection to tables
Convert the MDC clock divisor selection to tabular format.

Note that there is a change for 300MHz, but this is not a problem,
as the MDC clock remains within the useable ranges, which are:

	STMMAC_CSR_500_800M	/324 1.54 - 2.47MHz
	STMMAC_CSR_300_500M	/204 1.47 - 2.45MHz
	STMMAC_CSR_250_300M	/124 2.02 - 2.42MHz
	STMMAC_CSR_150_250M	/102 1.47 - 2.45MHz
	STMMAC_CSR_100_150M	/62  1.61 - 2.42MHz
	STMMAC_CSR_60_100M	/42  1.43 - 2.38MHz
	STMMAC_CSR_35_60M	/26  1.35 - 2.31MHz
	STMMAC_CSR_20_35M	/16  1.25 - 2.19MHz

Thus, with the change of divisor for exactly 300MHz, MDC temporarily
changes from 2.42MHz to 1.47MHz for the sake of consistency.

The databook does not specify whether the frequency limits for the
CSR divider are inclusive or exclusive.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vy69y-0000000Btwd-3oq7@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:39:09 -08:00
Dragos Tatulea
507ccb668f selftests: drv-net: iou-zcrx: wait for memory cleanup of probe run
The large chunks test does a probe run of iou-zcrx before it runs the
actual test. After the probe run finishes, the context will still exist
until the deferred io_uring teardown. When running iou-zcrx the second
time, io_uring_register_ifq() can return -EEXIST due to the existence of
the old context.

The fix is simple: wait for the context teardown using the new
mp_clear_wait() utility before running the second instance of iou-zcrx.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://patch.msgid.link/20260305080446.897628-2-dtatulea@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:36:37 -08:00
Heiner Kallweit
1a9940317c Revert "net: phy: improve mdiobus_stats_acct"
This reverts commit 1afccc5a20.

As reported by Marek the change causes a warning on non-PREEMPT_RT
32 bit systems.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/c3a1aba9-3fae-4c4b-bcb1-fb620fb7a309@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:19:33 -08:00
Jakub Kicinski
8e235bc433 docs: netdev: refine netdevsim testing guidance
The library to create tests for both NIC HW and netdevsim has existed
for almost a year. netdevsim-only tests we get increasingly feel like
a waste, we should try to write tests that work both on netdevsim and
real HW. Refine the guidance accordingly.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260304151647.2770466-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:18:04 -08:00
Jakub Kicinski
13f0dd7ed1 Merge branch 'selftests-net-add-netkit-container-env-and-test'
David Wei says:

====================
selftests/net: add netkit container env and test

Add a new Python selftest env NetDrvContEnv that sets up a pair of
netkit netdevs, with one inside of a netns, and a bpf prog that forwards
skbs from NETIF to the netkit inside the netns.

  NETIF           = "eth0"
  LOCAL_V6        = "2001:db8:1::1"
  REMOTE_V6       = "2001:db8:1::2"
  LOCAL_PREFIX_V6 = "2001:db8:2::0/64"

          +-----------------------------+        +------------------------------+
  dst     | INIT NS                     |        | TEST NS                      |
  2001:   | +---------------+           |        |                              |
  db8:2::2| | NETIF         |           |  bpf   |                              |
      +---|>| 2001:db8:1::1 |           |redirect| +-------------------------+  |
      |   | |               |-----------|--------|>| Netkit                  |  |
      |   | +---------------+           | _peer  | | nk_guest                |  |
      |   | +-------------+ Netkit pair |        | | fe80::2/64              |  |
      |   | | Netkit      |.............|........|>| 2001:db8:2::2/64        |  |
      |   | | nk_host     |             |        | +-------------------------+  |
      |   | | fe80::1/64  |             |        |                              |
      |   | +-------------+             |        | route:                       |
      |   |                             |        |   default                    |
      |   | route:                      |        |     via fe80::1 dev nk_guest |
      |   |   2001:db8:2::2/128         |        +------------------------------+
      |   |     via fe80::2 dev nk_host |
      |   +-----------------------------+
      |
      |   +---------------+
      |   | REMOTE        |
      +---| 2001:db8:1::2 |
          +---------------+

I will use this series for queue leasing selftests. Include a basic ping
test in this series as demonstration.
====================

Link: https://patch.msgid.link/20260305181803.2912736-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:17:10 -08:00
David Wei
37d24994a7 selftests/net: Add netkit container ping test
Add a basic ping test using NetDrvContEnv that sets up a netkit pair,
with one end in a netns. Use LOCAL_PREFIX_V6 and nk_forward BPF program
to ping from a remote host to the netkit in netns.

Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260305181803.2912736-5-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:16:40 -08:00
David Wei
3f74d5bb80 selftests/net: Add env for container based tests
Add an env NetDrvContEnv for container based selftests. This automates
the setup of a netns, netkit pair with one inside the netns, and a BPF
program that forwards skbs from the NETIF host inside the container.

Currently only netkit is used, but other virtual netdevs e.g. veth can
be used too.

Expect netkit container datapath selftests to have a publicly routable
IP prefix to assign to netkit in a container, such that packets will
land on eth0. The BPF skb forward program will then forward such packets
from the host netns to the container netns.

Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260305181803.2912736-4-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:11:17 -08:00
David Wei
68ab8ba927 selftests/net: Export Netlink class via lib.py
Making rtnl newlink calls requires constants defined in Netlink class in
pyynl. Export it.

Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20260305181803.2912736-3-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:11:17 -08:00
David Wei
01748996f6 selftests/net: Add bpf skb forwarding program
Add nk_forward.bpf.c, a BPF program that forwards skbs matching some IPv6
prefix received on eth0 ifindex to a specified netkit ifindex. This will
be needed by netkit container tests.

Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260305181803.2912736-2-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:11:17 -08:00
Antonio Quartulli
e5e09233e8 tools: ynl: add uns-admin-perm to genetlink
GENL_UNS_ADMIN_PERM may be required by protocols using
the `genetlink` family, however, this flag is currently
only allowed in `genetlink-legacy`.

Add it to the list of possible values in genetlink.yaml too.

Cc: Simon Horman <horms@kernel.org>
Cc: Donald Hunter <donald.hunter@gmail.com>
Link: https://github.com/OpenVPN/ovpn-net-next/issues/33
Suggested-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Link: https://patch.msgid.link/20260304141020.23270-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 19:07:03 -08:00
Lorenzo Bianconi
7600fb3b41 net: airoha: Rely __field_prep for non-constant masks
Rely on __field_prep macros for non-constant masks preparing the values
for register updates instead of open-coding.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260304-airoha-__field_prep-v1-1-b185facc4e2f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 19:05:52 -08:00
Jakub Kicinski
8844de1037 Merge branch 'net-cadence-macb-add-ieee-802-3az-eee-support'
Nicolai Buchwitz says:

====================
net: cadence: macb: add IEEE 802.3az EEE support

Add Energy Efficient Ethernet (IEEE 802.3az) support to the Cadence GEM
(macb) driver using phylink's managed EEE framework. The GEM MAC has
hardware LPI registers but no built-in idle timer, so the driver
implements software-managed TX LPI using a delayed_work timer while
delegating EEE negotiation and ethtool state to phylink.

The series is structured as follows:

  1. LPI statistics: Expose the four hardware EEE counters (RX/TX LPI
     transitions and time) through ethtool -S, accumulated in software
     since they are clear-on-read. Adds register offset definitions
     GEM_RXLPI/RXLPITIME/TXLPI/TXLPITIME (0x270-0x27c).

  2. TX LPI engine: Introduces GEM_TXLPIEN (NCR bit 19) and
     MACB_CAPS_EEE alongside the implementation that uses them.
     phylink mac_enable_tx_lpi / mac_disable_tx_lpi callbacks with a
     delayed_work-based idle timer. LPI entry is deferred 1 second
     after link-up per IEEE 802.3az. Wake before transmit with a
     conservative 50us PHY wake delay (IEEE 802.3az Tw_sys_tx).

  3. ethtool EEE ops: get_eee/set_eee delegating to phylink for PHY
     negotiation and timer management.

  4. RP1 enablement: Set MACB_CAPS_EEE for the Raspberry Pi 5's RP1
     southbridge (Cadence GEM_GXL rev 0x00070109 + BCM54213PE PHY).

  5. EyeQ5 enablement: Set MACB_CAPS_EEE for the Mobileye EyeQ5 GEM
     instance, verified with a hardware loopback by Théo Lebrun.

Tested on Raspberry Pi 5 (1000BASE-T, BCM54213PE PHY, 250ms LPI timer):

  iperf3 throughput (no regression):
    TCP TX: 937.8 Mbit/s (EEE on) vs 937.0 Mbit/s (EEE off)
    TCP RX: 936.5 Mbit/s both

  Latency (ping RTT, small expected increase from LPI wake):
    1s interval:  0.273 ms (EEE on) vs 0.181 ms (EEE off)
    10ms interval: 0.206 ms (EEE on) vs 0.168 ms (EEE off)
    flood ping:   0.200 ms (EEE on) vs 0.156 ms (EEE off)

  LPI counters (ethtool -S, 1s-interval ping, EEE on):
    tx_lpi_transitions: 112
    tx_lpi_time: 15574651

  Zero packet loss across all tests. Also verified with
  ethtool --show-eee / --set-eee and cable unplug/replug cycling.
====================

Link: https://patch.msgid.link/20260304105432.631186-1-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:56:54 -08:00
Nicolai Buchwitz
48575b6e16 net: cadence: macb: enable EEE for Mobileye EyeQ5
Set MACB_CAPS_EEE for the Mobileye EyeQ5 GEM instance. EEE has been
verified on EyeQ5 hardware using a loopback setup with ethtool
--show-eee confirming EEE active on both ends at 100baseT/Full and
1000baseT/Full.

Tested-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260304105432.631186-6-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:56:49 -08:00
Nicolai Buchwitz
92ba330743 net: cadence: macb: enable EEE for Raspberry Pi RP1
Set MACB_CAPS_EEE for the Raspberry Pi 5 RP1 southbridge
(Cadence GEM_GXL rev 0x00070109 paired with BCM54213PE PHY).

EEE has been verified on RP1 hardware: the LPI counter registers
at 0x270-0x27c return valid data, the TXLPIEN bit in NCR (bit 19)
controls LPI transmission correctly, and ethtool --show-eee reports
the negotiated state after link-up.

Other GEM variants that share the same LPI register layout (SAMA5D2,
SAME70, PIC32CZ) can be enabled by adding MACB_CAPS_EEE to their
respective config entries once tested.

Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260304105432.631186-5-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:56:48 -08:00
Nicolai Buchwitz
61332b7876 net: cadence: macb: add ethtool EEE support
Implement get_eee and set_eee ethtool ops for GEM as simple passthroughs
to phylink_ethtool_get_eee() and phylink_ethtool_set_eee().

No MACB_CAPS_EEE guard is needed: phylink returns -EOPNOTSUPP from both
ops when mac_supports_eee is false, which is the case when
lpi_capabilities and lpi_interfaces are not populated. Those fields are
only set when MACB_CAPS_EEE is present (previous patch), so phylink
already handles the unsupported case correctly.

Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260304105432.631186-4-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:56:48 -08:00
Nicolai Buchwitz
0cc425f18f net: cadence: macb: implement EEE TX LPI support
The GEM MAC has hardware LPI registers (NCR bit 19: TXLPIEN) but no
built-in idle timer, so asserting TXLPIEN blocks all TX immediately
with no automatic wake. A software idle timer is required, as noted
in Microchip documentation (section 40.6.19): "It is best to use
firmware to control LPI."

Implement phylink managed EEE using the mac_enable_tx_lpi and
mac_disable_tx_lpi callbacks:

- macb_tx_lpi_set(): sets or clears TXLPIEN; requires bp->lock to be
  held by the caller (asserted with lockdep_assert_held). Returns bool
  indicating whether the register actually changed, avoiding redundant
  writes and unnecessary udelay on the xmit fast path.

- macb_tx_lpi_work_fn(): delayed_work handler that enters LPI if all
  TX queues are idle and EEE is still active. Takes bp->lock with
  irqsave before calling macb_tx_lpi_set().

- macb_tx_lpi_schedule(): arms the work timer using the LPI timer
  value provided by phylink (default 250 ms). Called from
  macb_tx_complete() after each TX drain so the idle countdown
  restarts whenever the ring goes quiet.

- macb_tx_lpi_wake(): called from macb_start_xmit() under bp->lock,
  immediately before TSTART. Returns early if eee_active is false to
  avoid a register read on the common path when EEE is disabled.
  Clears TXLPIEN and applies a 50 us udelay for PHY wake (IEEE
  802.3az Tw_sys_tx is 16.5 us for 1000BASE-T / 30 us for
  100BASE-TX; GEM has no hardware enforcement). Only delays when
  TXLPIEN was actually set. The delay is placed after tx_head is
  advanced so the work_fn's queue-idle check sees a non-empty ring
  and cannot race back into LPI before the frame is transmitted.

- mac_enable_tx_lpi: stores the timer and sets eee_active under
  bp->lock, then defers the first LPI entry by 1 second per IEEE
  802.3az section 22.7a.

- mac_disable_tx_lpi: cancels the work (sync, without the lock to
  avoid deadlock with the work_fn), then takes bp->lock to clear
  eee_active and deassert TXLPIEN.

Populate phylink_config lpi_interfaces (MII, GMII, RGMII variants)
and lpi_capabilities (MAC_100FD | MAC_1000FD) so phylink can
negotiate EEE with the PHY and call the callbacks appropriately.
Set lpi_timer_default to 250000 us and eee_enabled_default to true.

Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260304105432.631186-3-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:56:48 -08:00
Nicolai Buchwitz
237577e603 net: cadence: macb: add EEE LPI statistics counters
The GEM MAC provides four read-only, clear-on-read LPI statistics
registers at offsets 0x270-0x27c:

  GEM_RXLPI     (0x270): RX LPI transition count (16-bit)
  GEM_RXLPITIME (0x274): cumulative RX LPI time (24-bit)
  GEM_TXLPI     (0x278): TX LPI transition count (16-bit)
  GEM_TXLPITIME (0x27c): cumulative TX LPI time (24-bit)

Add register offset definitions, extend struct gem_stats with
corresponding u64 software accumulators, and register the four
counters in gem_statistics[] so they appear in ethtool -S output.
Because the hardware counters clear on read, the existing
macb_update_stats() path accumulates them into the u64 fields on
every stats poll, preventing loss between userspace reads.

These registers are present on SAMA5D2, SAME70, PIC32CZ, and RP1
variants of the Cadence GEM IP and have been confirmed on RP1 via
devmem reads.

Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260304105432.631186-2-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:56:47 -08:00
Kuniyuki Iwashima
d4d8c6e6fd tcp: Initialise ehash secrets during connect() and listen().
inet_ehashfn() and inet6_ehashfn() initialise random secrets
on the first call by net_get_random_once().

While the init part is patched out using static keys, with
CONFIG_STACKPROTECTOR_STRONG=y, this causes a compiler to
generate a stack canary due to an automatic variable,
unsigned long ___flags, in the DO_ONCE() macro being passed
to __do_once_start().

With FDO, this is visible in __inet_lookup_established() and
__inet6_lookup_established() too.

Let's initialise the secrets by get_random_sleepable_once()
in the slow paths: inet_hash() for listen(), and
inet_hash_connect() and inet6_hash_connect() for connect().

Note that IPv6 listener will initialise both IPv4 & IPv6 secrets
in inet_hash() for IPv4-mapped IPv6 address.

With the patch, the stack size is reduced by 16 bytes (___flags
 + a stack canary) and NOPs for the static key go away.

Before: __inet6_lookup_established()

       ...
       push   %rbx
       sub    $0x38,%rsp                # stack is 56 bytes
       mov    %edx,%ebx                 # sport
       mov    %gs:0x299419f(%rip),%rax  # load stack canary
       mov    %rax,0x30(%rsp)              and store it onto stack
       mov    0x440(%rdi),%r15          # net->ipv4.tcp_death_row.hashinfo
       nop
 32:   mov    %r8d,%ebp                 # hnum
       shl    $0x10,%ebp                # hnum << 16
       nop
 3d:   mov    0x70(%rsp),%r14d          # sdif
       or     %ebx,%ebp                 # INET_COMBINED_PORTS(sport, hnum)
       mov    0x11a8382(%rip),%eax      # inet6_ehashfn() ...

After: __inet6_lookup_established()

       ...
       push   %rbx
       sub    $0x28,%rsp                # stack is 40 bytes
       mov    0x60(%rsp),%ebp           # sdif
       mov    %r8d,%r14d                # hnum
       shl    $0x10,%r14d               # hnum << 16
       or     %edx,%r14d                # INET_COMBINED_PORTS(sport, hnum)
       mov    0x440(%rdi),%rax          # net->ipv4.tcp_death_row.hashinfo
       mov    0x1194f09(%rip),%r10d     # inet6_ehashfn() ...

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260303235424.3877267-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:50:05 -08:00
Jakub Kicinski
edad504ce1 Merge branch 'doc-netlink-expand-nftables-specification'
Remy D. Farley says:

====================
doc/netlink: Expand nftables specification

Getting out some changes I've accumulated while making nftables work
with Rust netlink-bindings. Hopefully, this will be useful upstream.
====================

Link: https://patch.msgid.link/20260303195638.381642-1-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:49:11 -08:00
Remy D. Farley
568b370f12 doc/netlink: nftables: Fill out operation attributes
Filled out operation attributes:
- newtable
- gettable
- deltable
- destroytable
- newchain
- getchain
- delchain
- destroychain
- newrule
- getrule
- getrule-reset
- delrule
- destroyrule
- newset
- getset
- delset
- destroyset
- newsetelem
- getsetelem
- getsetelem-reset
- delsetelem
- destroysetelem
- getgen
- newobj
- getobj
- delobj
- destroyobj
- newflowtable
- getflowtable
- delflowtable
- destroyflowtable

Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-6-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:49:08 -08:00
Remy D. Farley
27c7ee6d26 doc/netlink: nftables: Add sub-messages
New sub-messsages:
- log
- match
- numgen
- range

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-5-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:49:08 -08:00
Remy D. Farley
482da27d52 doc/netlink: nftables: Update attribute sets
New attribute sets:
- log-attrs
- numgen-attrs
- range-attrs
- compat-target-attrs
- compat-match-attrs
- compat-attrs

Added missing attributes:
- table-attrs (pad, owner)
- set-attrs (type, count)

Added missing checks:
- range-attrs
- expr-bitwise-attrs
- compat-target-attrs
- compat-match-attrs
- compat-attrs

Annotated doc comment or associated enum:
- batch-attrs
- verdict-attrs
- expr-payload-attrs

Fixed byte order:
- nft-counter-attrs
- expr-counter-attrs
- rule-compat-attrs

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-4-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:49:08 -08:00
Remy D. Farley
a3a54ba4ef doc/netlink: nftables: Add definitions
New enums/flags:
- payload-base
- range-ops
- registers
- numgen-types
- log-level
- log-flags

Added missing enumerations:
- bitwise-ops

Annotated doc comment or associated enum:
- bitwise-ops

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-3-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:49:08 -08:00
Remy D. Farley
bf5a54bc0e doc/netlink: netlink-raw: Add max check
Add definitions for max check and len-or-limit type, the same as in other
specifications.

Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260303195638.381642-2-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:49:08 -08:00
Jakub Kicinski
2a13309fa8 Merge branch 'net-stmmac-qcom-ethqos-further-serdes-reorganisation'
Russell King says:

====================
net: stmmac: qcom-ethqos: further serdes reorganisation

This is part 2 of the qcom-ethqos series, part 1 and patch 2 of part 2
has now been merged.

This part of the series focuses on the generic PHY driver, but these
changes have dependencies on the ethernet driver, hence why
it will need to go via net-next. Furthermore, subsequent changes
depend on these patches.

The underlying ideas here are:

- get rid of the driver using phy_set_speed() with SPEED_1000 and
  SPEED_2500 which makes no sense for an ethernet SerDes due to the
  PCS 8B10B data encoding, which inflates the data rate at the SerDes
  compared to the MAC. This is replaced with phy_set_mode_ext().
- allow phy_power_on() / phy_set_mode*() to be called in any order.

Mohd has tested this series, although not in the resulting merge order.
====================

Link: https://patch.msgid.link/aacD3osfaZkLsGxm@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:43:09 -08:00
Russell King (Oracle)
038a8e8eb9 net: stmmac: qcom-ethqos: remove phy_set_mode_ext() after phy_power_on()
The call to phy_set_mode_ext() after phy_power_on() was a work-around
for the qcom-sgmii-eth SerDes driver that only re-enabled its clocks on
phy_power_on() but did not configure the PHY. Now that the SerDes driver
fully configures the SerDes at phy_power_on(), there is no need to call
phy_set_mode_ext() immediately afterwards.

This also means we no longer need to record the previous operating mode
of the driver - this is up to the SerDes driver. In any case, the only
thing that we care about is the SerDes provides the necessary clocks to
the stmmac core to allow it to reset at this point. The actual mode is
irrelevant at this point as the correct mode will be configured in
ethqos_mac_finish_serdes() just before the network device is brought
online.

Reviewed-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vxS4U-0000000BQXy-1Q1v@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:43:07 -08:00
Russell King (Oracle)
ebe8b48b88 phy: qcom-sgmii-eth: relax order of .power_on() vs .set_mode*()
Allow any order of the .power_on() and .set_mode*() methods as per the
recent discussion. This means phy_power_on() with this SerDes will now
restore the previous setup without requiring a subsequent
phy_set_mode*() call.

Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vxS4P-0000000BQXs-0vGB@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 18:43:07 -08:00