If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
we cannot suspend the PHY. Although the WOL status has been
checked in phy_suspend(), returning -EBUSY(-16) would cause
the Power Management (PM) to fail to suspend. Since
phy_suspend() is an exported symbol (EXPORT_SYMBOL),
timely error reporting is needed. Therefore, an additional
check is performed here. If the PHY of the mido bus is enabled
with WOL, we skip calling phy_suspend() to avoid PM failure.
From the following logs, it has been observed that the phydev->attached_dev
is NULL, phydev is "stmmac-0:01", it not attached, but it will affect suspend
and resume.The actually attached "stmmac-0:00" will not dpm_run_callback():
mdio_bus_phy_suspend().
init log:
[ 5.932502] YT8521 Gigabit Ethernet stmmac-0:00: attached PHY driver
(mii_bus:phy_addr=stmmac-0:00, irq=POLL)
[ 5.932512] YT8521 Gigabit Ethernet stmmac-0:01: attached PHY driver
(mii_bus:phy_addr=stmmac-0:01, irq=POLL)
[ 24.566289] YT8521 Gigabit Ethernet stmmac-0:00: yt8521_read_status,
link down, media: UTP
suspend log:
[ 322.631362] OOM killer disabled.
[ 322.631364] Freezing remaining freezable tasks
[ 322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
[ 322.633052] YT8521 Gigabit Ethernet stmmac-0:01:
PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
[ 322.633071] YT8521 Gigabit Ethernet stmmac-0:01:
PM: failed to suspend: error -16
[ 322.669699] PM: Some devices failed to suspend, or early wake event detected
[ 322.669949] OOM killer enabled.
[ 322.669951] Restarting tasks ... done.
[ 322.671008] random: crng reseeded on system resumption
[ 322.671014] PM: suspend exit
Add a function that phylib can inquire of the driver whether WoL
has been enabled at the PHY.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Youwan Wang <youwan@nfschina.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://patch.msgid.link/20240731091537.771391-1-youwan@nfschina.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The current implementation of netpoll in veth devices leads to
suboptimal behavior, as it triggers warnings due to the invocation of
__netif_rx() within a softirq context. This is not compliant with
expected practices, as __netif_rx() has the following statement:
lockdep_assert_once(hardirq_count() | softirq_count());
Given that veth devices typically do not benefit from the
functionalities provided by netpoll, Disable netpoll for veth
interfaces.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240805094012.1843247-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tariq Toukan says:
====================
mlx5 PTM cross timestamping support
This patchset by Rahul and Carolina adds PTM (Precision Time Measurement)
support to the mlx5 driver.
PTM is a PCI extended capability introduced by PCI-SIG for providing an
accurate read of the device clock offset without being impacted by
asymmetric bus transfer rates.
The performance of PTM on ConnectX-7 was evaluated using both real-time
(RTC) and free-running (FRC) clocks under traffic and no traffic
conditions. Tests with phc2sys measured the maximum offset values at a 50Hz
rate, with and without PTM.
Results:
1. No traffic
+-----+--------+--------+
| | No-PTM | PTM |
+-----+--------+--------+
| FRC | 125 ns | <29 ns |
+-----+--------+--------+
| RTC | 248 ns | <34 ns |
+-----+--------+--------+
2. With traffic
+-----+--------+--------+
| | No-PTM | PTM |
+-----+--------+--------+
| FRC | 254 ns | <40 ns |
+-----+--------+--------+
| RTC | 255 ns | <45 ns |
+-----+--------+--------+
v2: https://lore.kernel.org/d1dba3e1-2ecc-4fdf-a23b-7696c4bccf45@gmail.com
====================
Link: https://patch.msgid.link/20240730134055.1835261-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update the MODULE_AUTHOR for netconsole, according to the format, as
stated in module.h:
use "Name <email>" or just "Name"
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit implements downshift feature in vsc73xx family phys.
By default downshift was enabled with maximum tries.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
build_skb() and frag allocations done with GFP_ATOMIC will
fail in real life, when system is under memory pressure,
and there's nothing we can do about that. So no point
printing warnings.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Bianconi says:
====================
Add second QDMA support for EN7581 eth controller
EN7581 SoC supports two independent QDMA controllers to connect the
Ethernet Frame Engine (FE) to the CPU. Introduce support for the second
QDMA controller. This is a preliminary series to support multiple FE ports
(e.g. connected to a second PHY controller).
Changes since v1:
- squash patch 6/9 and 7/9
- move some duplicated code from patch 2/9 in 1/9
- cosmetics
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce support for the DSA built-in switch available on the EN7581
development board. EN7581 support is similar to MT7988 one except
it requires to set MT7530_FORCE_MODE bit in MT753X_PMCR_P register
for on cpu port.
Tested-by: Benjamin Larsson <benjamin.larsson@genexis.eu>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima says:
====================
net: Random cleanup for netns initialisation.
patch 1 & 2 suppress unwanted memory allocation for net->gen->ptr[].
patch 3 ~ 6 move part of netns initialisation to prenet_init() that
do not require pernet_ops_rwsem.
v2:
patch 1 : Removed Fixes: tag
patch 2 : Use XOR for WARN_ON()
v1: https://lore.kernel.org/netdev/20240729210801.16196-1-kuniyu@amazon.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7c3f1875c6 ("net: move somaxconn init from sysctl code")
introduced net_defaults_ops to make sure that net.core sysctl knobs
are always initialised even if CONFIG_SYSCTL is disabled.
Such operations better fit preinit_net() added for a similar purpose
by commit 6e77a5a4af ("net: initialize net->notrefcnt_tracker earlier").
Let's initialise the sysctl defaults in preinit_net().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most initialisations in setup_net() do not require pernet_ops_rwsem
and can be moved to preinit_net().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When initialising the root netns, we call preinit_net() under
pernet_ops_rwsem.
However, the operations in preinit_net() do not require pernet_ops_rwsem.
Also, we don't hold it for preinit_net() when initialising non-root netns.
To be consistent, let's call preinit_net() without pernet_ops_rwsem in
net_ns_init().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When initialising the root netns, we set net->passive in setup_net().
However, we do it twice for non-root netns in copy_net_ns() and
setup_net().
This is because we could bypass setup_net() in copy_net_ns() if
down_read_killable() fails.
preinit_net() is a better place to put such an operation.
Let's initialise net->passive in preinit_net().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can allocate per-netns memory for struct pernet_operations by specifying
id and size.
register_pernet_operations() assigns an id to pernet_operations and later
ops_init() allocates the specified size of memory as net->gen->ptr[id].
If id is missing, no memory is allocated. If size is not specified,
pernet_operations just wastes an entry of net->gen->ptr[] for every netns.
net_generic is available only when both id and size are specified, so let's
ensure that.
While we are at it, we add const to both fields.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and
ppp parts") converted net->gen->ptr[pppol2tp_net_id] in l2tp_ppp.c to
net->gen->ptr[l2tp_net_id] in l2tp_core.c.
Now the leftover wastes one entry of net->gen->ptr[] in each netns.
Let's avoid the unwanted allocation.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DEVID register contains two pieces of information: the device ID in
the upper nibble, and the silicon revision number in the lower nibble.
The driver should work fine with any silicon revision, so let's mask
that out in the device ID check.
Fixes: 20e6d190ff ("net: pse-pd: Add TI TPS23881 PSE controller driver")
Signed-off-by: Kyle Swenson <kyle.swenson@est.tech>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the datasheet, the VSC73xx family's maximum internal MDIO bus
speed is 20 MHz. It also allows disabling the preamble.
This commit sets the MDIO clock prescaler to the minimum value and
disables the preamble to speed up MDIO operations.
It doesn't affect the external bus because it's configured in a different
subblock.
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240731203455.580262-1-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Smatch reports that copying media_name and if_name to name_parts may
overwrite the destination.
.../bearer.c:166 bearer_name_validate() error: strcpy() 'media_name' too large for 'name_parts->media_name' (32 vs 16)
.../bearer.c:167 bearer_name_validate() error: strcpy() 'if_name' too large for 'name_parts->if_name' (1010102 vs 16)
This does seem to be the case so guard against this possibility by using
strscpy() and failing if truncation occurs.
Introduced by commit b97bf3fd8f ("[TIPC] Initial merge")
Compile tested only.
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240801-tipic-overrun-v2-1-c5b869d1f074@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Nick Child says:
====================
ibmveth RR performance
This patchset aims to increase the ibmveth drivers small packet
request response rate.
These 2 patches address:
1. NAPI rescheduling technique
2. Driver-side processing of small packets
Testing over several netperf tcp_rr connections, we saw a
30% increase in transactions per second. No regressions
were observed in other workloads.
====================
Link: https://patch.msgid.link/20240801211215.128101-1-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the length of a packet is under the rx_copybreak threshold, the
buffer is copied into a new skb and sent up the stack. This allows the
dma mapped memory to be recycled back to FW.
Previously, the reuse of the DMA space was handled immediately.
This means that further packet processing has to wait until
h_add_logical_lan finishes for this packet.
Therefore, when reusing a packet, offload the hcall to the replenish
function. As a result, much of the shared logic between the recycle and
replenish functions can be removed.
This change increases TCP_RR packet rate by another 15% (370k to 430k
txns). We can see the ftrace data supports this:
PREV: ibmveth_poll = 8078553.0 us / 190999.0 hits = AVG 42.3 us
NEW: ibmveth_poll = 7632787.0 us / 224060.0 hits = AVG 34.07 us
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240801211215.128101-3-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the ibmveth driver processes less than the budget, it must call
napi_complete_done() to release the instance. This function will
return false if the driver should avoid rearming interrupts.
Previously, the driver was ignoring the return code of
napi_complete_done(). As a result, there were unnecessary calls to
enable the veth irq.
Therefore, use the return code napi_complete_done() to determine if
irq rearm is necessary.
Additionally, in the event that new data is received immediately after
rearming interrupts, rather than just rescheduling napi, also jump
back to the poll processing loop since we are already in the poll
function (and know that we did not expense all of budget).
This slight tweak results in a 15% increase in TCP_RR transaction rate
(320k to 370k txns). We can see the ftrace data supports this:
PREV: ibmveth_poll = 8818014.0 us / 182802.0 hits = AVG 48.24
NEW: ibmveth_poll = 8082398.0 us / 191413.0 hits = AVG 42.22
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240801211215.128101-2-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lorenzo Bianconi says:
====================
Add second QDMA support for EN7581 eth controller
EN7581 SoC supports two independent QDMA controllers to connect the
Ethernet Frame Engine (FE) to the CPU. Introduce support for the second
QDMA controller. This is a preliminary series to support multiple FE ports
(e.g. connected to a second PHY controller).
====================
Link: https://patch.msgid.link/cover.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Print more stack frames and the failing line when check fails.
This helps when tests use helpers to do the checks.
Before:
# At ./ksft/drivers/net/hw/rss_ctx.py line 92:
# Check failed 1037698 >= 396893.0 traffic on other queues:[344612, 462380, 233020, 449174, 342298]
not ok 8 rss_ctx.test_rss_context_queue_reconfigure
After:
# Check| At ./ksft/drivers/net/hw/rss_ctx.py, line 387, in test_rss_context_queue_reconfigure:
# Check| test_rss_queue_reconfigure(cfg, main_ctx=False)
# Check| At ./ksft/drivers/net/hw/rss_ctx.py, line 230, in test_rss_queue_reconfigure:
# Check| _send_traffic_check(cfg, port, ctx_ref, { 'target': (0, 3),
# Check| At ./ksft/drivers/net/hw/rss_ctx.py, line 92, in _send_traffic_check:
# Check| ksft_lt(sum(cnts[i] for i in params['noise']), directed / 2,
# Check failed 1045235 >= 405823.5 traffic on other queues (context 1)':[460068, 351995, 565970, 351579, 127270]
not ok 8 rss_ctx.test_rss_context_queue_reconfigure
Link: https://patch.msgid.link/20240801232317.545577-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>