Commit Graph

1430725 Commits

Author SHA1 Message Date
Fernando Fernandez Mancera
2ce8a41113 net: hsr: emit notification for PRP slave2 changed hw addr on port deletion
On PRP protocol, when deleting the port the MAC address change
notification was missing. In addition to that, make sure to only perform
the MAC address change on slave2 deletion and PRP protocol as the
operation isn't necessary for HSR nor slave1.

Note that the eth_hw_addr_set() is correct on PRP context as the slaves
are either in promiscuous mode or forward offload enabled.

Reported-by: Luka Gejak <luka.gejak@linux.dev>
Closes: https://lore.kernel.org/netdev/DHFCZEM93FTT.1RWFBIE32K7OT@linux.dev/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Felix Maurer <fmaurer@redhat.com>
Link: https://patch.msgid.link/20260403123928.4249-2-fmancera@suse.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 17:06:16 +02:00
Paolo Abeni
97a8355b6a Merge branch 'net-mlx5e-xdp-add-support-for-multi-packet-per-page'
Tariq Toukan says:

====================
net/mlx5e: XDP, Add support for multi-packet per page

This series removes the limitation of having one packet per page in XDP
mode. This has the following implications:

- XDP in Striding RQ mode can now be used on 64K page systems.

- XDP in Legacy RQ mode was using a single packet per page which on 64K
  page systems is quite inefficient. The improvement can be observed
  with an XDP_DROP test when running in Legacy RQ mode on an ARM
  Neoverse-N1 system with a 64K page size:
  +-----------------------------------------------+
  | MTU  | baseline   | this change | improvement |
  |------+------------+-------------+-------------|
  | 1500 | 15.55 Mpps | 18.99 Mpps  | 22.0 %      |
  | 9000 | 15.53 Mpps | 18.24 Mpps  | 17.5 %      |
  +-----------------------------------------------+

After lifting this limitation, the series switches to using fragments
for the side page in non-linear mode. This small improvement is most
visible for XDP_DROP tests with small 64B packets and a large enough MTU
for Striding RQ to be in non-linear mode:
+----------------------------------------------------------------------+
| System               | MTU  | baseline   | this change | improvement |
|----------------------+------+------------+-------------+-------------|
| 4K page x86_64 [1]   | 9000 | 26.30 Mpps | 30.45 Mpps  | 15.80 %     |
| 64K page aarch64 [2] | 9000 | 15.27 Mpps | 20.10 Mpps  | 31.62 %     |
+----------------------------------------------------------------------+

This series does not cover the xsk (AF_XDP) paths for 64K page systems.

[1] https://lore.kernel.org/all/20260324024235.929875-1-kuba@kernel.org/
====================

Link: https://patch.msgid.link/20260403090927.139042-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 13:34:08 +02:00
Dragos Tatulea
25b8c9b6d7 net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode
Currently in XDP multi-buffer mode for striding rq a whole page is
allocated for the linear part of the XDP buffer. This is wasteful,
especially on systems with larger page sizes.

This change splits the page into fixed sized fragments. The page is
replenished when the maximum number of allowed fragments is reached.
A fragment that is not used is simply recycled for the next packet,
which is ideal for XDP_DROP. In the most extreme case (XDP_DROP
everything), there will be 0
fragments used => only one linear page allocation for the lifetime of
the XDP program.

The previous page_pool size increase was too conservative (doubling the
size) and now there are much fewer allocations (1/8 for a 4K page). So
drop the page_pool size extension altogether when the linear side page
is used.
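
The fragment scheme above can be modelled with a small sketch. This is a toy model, not driver code: the class and field names are hypothetical, and the figure of 8 fragments per page is only an assumption read off the "1/8 for a 4K page" remark.

```python
# Toy model of the fragment scheme described above; not mlx5e driver code.
# LinearPage, frags_per_page and handle_packet are hypothetical names.

class LinearPage:
    """One page carved into fixed-size fragments for XDP linear data."""

    def __init__(self, frags_per_page):
        self.frags_per_page = frags_per_page
        self.next_frag = 0      # index of the fragment given to the next packet
        self.page_allocs = 1    # the initial page counts as one allocation

    def handle_packet(self, fragment_consumed):
        """Process one packet and return the fragment index it was given.

        fragment_consumed is False for XDP_DROP (the fragment is reused by
        the next packet) and True when the fragment leaves with the packet.
        """
        if self.next_frag == self.frags_per_page:
            # Every fragment of the current page was handed out: replenish.
            self.page_allocs += 1
            self.next_frag = 0
        frag = self.next_frag
        if fragment_consumed:
            self.next_frag += 1
        return frag


# XDP_DROP everything: the same fragment is recycled for every packet.
drop_all = LinearPage(frags_per_page=8)
for _ in range(1000):
    drop_all.handle_packet(fragment_consumed=False)

# Every packet keeps its fragment: one new page per 8 packets.
keep_all = LinearPage(frags_per_page=8)
for _ in range(64):
    keep_all.handle_packet(fragment_consumed=True)

print(drop_all.page_allocs, keep_all.page_allocs)
```

In this model the drop-everything run never replenishes the page, which matches the "only one linear page allocation" case described in the message.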

This small improvement is most visible for XDP_DROP tests with small
64B packets and a large enough MTU for Striding RQ to be in non-linear
mode:
+----------------------------------------------------------------------+
| System               | MTU  | baseline   | this change | improvement |
|----------------------+------+------------+-------------+-------------|
| 4K page x86_64 [1]   | 9000 | 26.30 Mpps | 30.45 Mpps  | 15.80 %     |
| 64K page aarch64 [2] | 9000 | 15.27 Mpps | 20.10 Mpps  | 31.62 %     |
+----------------------------------------------------------------------+

[1] Intel Xeon Platinum 8580
[2] ARM Neoverse-N1

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260403090927.139042-6-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 13:34:04 +02:00
Dragos Tatulea
ebd4ad29cc net/mlx5e: XDP, Use a single linear page per rq
Currently in striding rq there is one mlx5e_frag_page member per WQE for
the linear page. This linear page is used only in XDP multi-buffer mode.
This is wasteful because only one linear page is needed per rq: the page
gets refreshed on every packet, regardless of WQE. Furthermore, it is
not needed in other modes (non-XDP, XDP single-buffer).

This change moves the linear page into its own structure (struct
mlx5_mpw_linear_info) and allocates it only when necessary.

A special structure is created because an upcoming patch will extend
this structure to support fragmentation of the linear page.

This patch has no functional changes.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260403090927.139042-5-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 13:34:04 +02:00
Dragos Tatulea
2dfaa02387 net/mlx5e: XDP, Remove stride size limitation
Currently XDP mode always uses PAGE_SIZE strides. This limitation
existed because page fragment counting was not implemented when XDP was
added. Furthermore, due to this limitation there were other issues as
well on systems with larger pages (e.g. 64K):

- XDP for Striding RQ was effectively disabled on such systems.

- Legacy RQ allows the configuration but uses a fixed scheme of one XDP
  buffer per page which is inefficient.

As fragment counting was added during the driver conversion to
page_pool and the support for XDP multi-buffer, it is now possible
to remove this stride size limitation. This patch does just that.

Now it is possible to use XDP on systems with higher page sizes (e.g.
64K):

- For Striding RQ, loading the program is no longer blocked.
  Although a 64K page can fit any packet, MTUs that result in
  stride > 8K will still put the RQ in non-linear mode. That's
  because the HW doesn't support a higher than 8K stride.

- For Legacy RQ, the stride size was PAGE_SIZE which was very
  inefficient. Now the stride size will be calculated relative to MTU.
  Legacy RQ will always be in linear mode for larger system pages.

  This can be observed with an XDP_DROP test [1] when running
  in Legacy RQ mode on an ARM Neoverse-N1 system with a 64K
  page size:
  +-----------------------------------------------+
  | MTU  | baseline   | this change | improvement |
  |------+------------+-------------+-------------|
  | 1500 | 15.55 Mpps | 18.99 Mpps  | 22.0 %      |
  | 9000 | 15.53 Mpps | 18.24 Mpps  | 17.5 %      |
  +-----------------------------------------------+

There are performance benefits for Striding RQ mode as well:

- Striding RQ non-linear mode now uses 256B strides, just like
  non-XDP mode.

- Striding RQ linear mode can now fit a number of XDP buffers per page
  that is relative to the MTU size. That means that on 4K page systems
  and a small enough MTU, 2 XDP buffers can fit in one page.

The above benefits for Striding RQ can be observed with an
XDP_DROP test [1] when running on a 4K page x86_64 system
(Intel Xeon Platinum 8580):
  +-----------------------------------------------+
  | MTU  | baseline   | this change | improvement |
  |------+------------+-------------+-------------|
  | 1000 | 28.36 Mpps | 33.98 Mpps  | 19.82 %     |
  | 9000 | 20.76 Mpps | 26.30 Mpps  | 26.70 %     |
  +-----------------------------------------------+
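
The claim that two XDP buffers can fit in one 4K page for a small enough MTU can be checked with rough arithmetic. The sketch below is illustrative only: the headroom, tailroom and Ethernet overhead values are assumptions, and rounding the stride up to a power of two is an assumption about how the stride is derived, not something stated in this message.

```python
# Illustrative arithmetic only; the constants are assumptions for this
# sketch and are not taken from the mlx5e driver.
PAGE_SIZE = 4096
HEADROOM = 256      # assumed XDP headroom in bytes
TAILROOM = 320      # assumed space reserved after the packet data
ETH_OVERHEAD = 14   # assumed Ethernet header on top of the MTU


def roundup_pow_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def buffers_per_page(mtu, page_size=PAGE_SIZE):
    """Number of strides that fit in one page; 0 if one stride is too big."""
    stride = roundup_pow_of_two(HEADROOM + mtu + ETH_OVERHEAD + TAILROOM)
    return page_size // stride


for mtu in (1000, 1500):
    print(mtu, buffers_per_page(mtu))
```

Under these assumptions an MTU of 1000 yields a 2048-byte stride and two buffers per 4K page, while an MTU of 1500 needs the whole page.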

[1] Test description:
- xdp-bench with XDP_DROP
- RX: single queue
- TX: sends 64B packets to saturate CPU on RX side

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260403090927.139042-4-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 13:34:04 +02:00
Dragos Tatulea
833e72645a net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX
When calculating the dma address of the linear part of an XDP frame, the
formula assumes that there is a single XDP buffer per page. Extend the
formula to allow multiple XDP buffers per page by calculating the data
offset in the page.

This is a preparation for the upcoming removal of a single XDP buffer
per page limitation when the formula will no longer be correct.
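
As a rough illustration of the idea, the sketch below contrasts a formula that assumes one buffer per page with one that uses the data offset within the page. The function names, the addresses and the shape of the single-buffer formula are hypothetical; the actual driver code is not shown in this message.

```python
# Hypothetical model: addresses are plain integers, names are invented.
PAGE_SIZE = 4096


def linear_dma_addr_single(page_dma, headroom):
    """Assumes one buffer per page, so data always starts at the headroom."""
    return page_dma + headroom


def linear_dma_addr_multi(page_dma, page_vaddr, data_vaddr):
    """Uses the data offset in the page, so any buffer position works."""
    offset = data_vaddr - page_vaddr
    if not 0 <= offset < PAGE_SIZE:
        raise ValueError("data pointer is outside the page")
    return page_dma + offset


page_dma = 0x10000000          # made-up DMA address of the page
page_vaddr = 0x7F0000000000    # made-up virtual address of the page
stride, headroom = 2048, 256   # two buffers per page in this example

first_dma = linear_dma_addr_multi(page_dma, page_vaddr, page_vaddr + headroom)
second_dma = linear_dma_addr_multi(
    page_dma, page_vaddr, page_vaddr + stride + headroom
)
print(hex(first_dma), hex(second_dma))
```

For the first buffer both forms agree; for the second buffer only the offset-based form points at the right place.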

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260403090927.139042-3-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 13:34:04 +02:00
Dragos Tatulea
1047e14b44 net/mlx5e: XSK, Increase size for chunk_size param
When 64K pages are used, chunk_size can take the 64K value
which doesn't fit in u16. This results in overflows that
are detected in mlx5e_mpwrq_log_wqe_sz().

Increase the type to u32 to fix this.
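
The truncation can be reproduced with plain integer masking. This is a minimal sketch that models the C types with bit masks; the helper names are made up.

```python
# Model fixed-width unsigned integers with masks; helper names are made up.
def as_u16(value):
    return value & 0xFFFF        # what survives in a 16-bit unsigned field


def as_u32(value):
    return value & 0xFFFFFFFF    # what survives in a 32-bit unsigned field


chunk_size = 64 * 1024           # 64K page used as the chunk size

print(as_u16(chunk_size))        # 0
print(as_u32(chunk_size))        # 65536
```

A 64K value stored in a u16 wraps to 0, whereas a u32 holds it unchanged.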

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260403090927.139042-2-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 13:34:04 +02:00
Qingfang Deng
dfecb0c5af selftests: net: add tests for PPP
Add ping and iperf3 tests for ppp_async.c and pppoe.c.

Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev>
Link: https://patch.msgid.link/20260403034908.30017-1-qingfang.deng@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-07 12:08:46 +02:00
Jakub Kicinski
c149d90e26 Merge branch 'mptcp-support-msg_eor-and-small-cleanups'
Matthieu Baerts says:

====================
mptcp: support MSG_EOR and small cleanups

This series contains various unrelated patches:

- Patches 1 & 2: support MSG_EOR instead of ignoring it.

- Patch 3: avoid duplicated code in TCP and MPTCP by using a new helper.

- Patch 4: adapt test to reproduce bug and increase code coverage.
====================

Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-0-b0b33bea3fed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:14:31 -07:00
Matthieu Baerts (NGI0)
c4a5cb2f00 selftests: mptcp: join: recreate signal endp with same ID
In this "delete re-add signal" MPTCP Join subtest, the endpoint linked
to the initial subflow is removed, but re-added once with a different ID.

It appears that there was an issue when reusing the same ID, recently
fixed by commit d191101dee ("mptcp: pm: in-kernel: always set ID as
avail when rm endp"). The test now reuses the same ID the first
time, but continues to use another one (88) the second time.

This should then cover more cases.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/615
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-5-b0b33bea3fed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:14:30 -07:00
Geliang Tang
eb477fdd68 tcp: add recv_should_stop helper
Factor out a new helper tcp_recv_should_stop() from tcp_recvmsg_locked()
and tcp_splice_read() to check whether to stop receiving. And use this
helper in mptcp_recvmsg() and mptcp_splice_read() to reduce redundant code.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-3-b0b33bea3fed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:14:27 -07:00
Gang Yan
7fb2f5f964 mptcp: preserve MSG_EOR semantics in sendmsg path
Extend MPTCP's sendmsg handling to recognize and honor the MSG_EOR flag,
which marks the end of a record for application-level message boundaries.

Data fragments tagged with MSG_EOR are explicitly marked in the
mptcp_data_frag structure and skb context to prevent unintended
coalescing with subsequent data chunks. This ensures the intent of
applications using MSG_EOR is preserved across MPTCP subflows,
maintaining consistent message segmentation behavior.
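
The effect on coalescing can be sketched with a simplified queue model. This is not the kernel data structure: fragments are modelled as `[payload, eor]` pairs and the function name is hypothetical.

```python
# Simplified model of send-queue coalescing; not the kernel implementation.
def queue_fragments(writes):
    """Coalesce writes into fragments, never appending past an EOR mark.

    writes is a list of (payload, eor) tuples; the result is a list of
    [payload, eor] fragments.
    """
    frags = []
    for payload, eor in writes:
        if frags and not frags[-1][1]:
            # Last fragment is still open: append to it and carry the flag.
            frags[-1][0] += payload
            frags[-1][1] = eor
        else:
            # No fragment yet, or the last one ended a record: start anew.
            frags.append([payload, eor])
    return frags


print(queue_fragments([(b"ab", False), (b"cd", True), (b"ef", False)]))
```

In this model data written after an EOR-tagged chunk always starts a new fragment, so the record boundary is preserved.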

Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-2-b0b33bea3fed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:14:26 -07:00
Gang Yan
00d46be3c3 mptcp: reduce 'overhead' from u16 to u8
The 'overhead' in struct mptcp_data_frag can safely use u8, as it
represents 'alignment + sizeof(mptcp_data_frag)'. With a maximum
alignment of 7 ('ALIGN(1, sizeof(long)) - 1'), the overhead is at most
47, well below U8_MAX and validated with BUILD_BUG_ON().

This patch also adds a field named 'unused' for further extensions.
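
The bound can be re-derived with a few lines of arithmetic. The structure size of 40 bytes used below is only inferred from the figures in this message (47 minus 7) and has not been checked against the structure definition; a 64-bit build is assumed.

```python
# Re-derive the bound quoted above. SIZEOF_DATA_FRAG is inferred from the
# message (47 - 7), not read from the kernel sources.
U8_MAX = 255
SIZEOF_LONG = 8          # assumption: 64-bit build
SIZEOF_DATA_FRAG = 40    # assumption, see note above


def align_up(x, a):
    """Round x up to a multiple of a, where a is a power of two."""
    return (x + a - 1) & ~(a - 1)


max_alignment = align_up(1, SIZEOF_LONG) - 1
max_overhead = max_alignment + SIZEOF_DATA_FRAG

print(max_alignment, max_overhead, max_overhead <= U8_MAX)
```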

Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-1-b0b33bea3fed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:14:26 -07:00
Arnd Bergmann
ede3136e56 dpaa2: avoid linking objects into multiple modules
Each object file contains information about which module it gets linked
into, so linking the same file into multiple modules now causes a warning:

scripts/Makefile.build:254: drivers/net/ethernet/freescale/dpaa2/Makefile: dpaa2-mac.o is added to multiple modules: fsl-dpaa2-eth fsl-dpaa2-switch
scripts/Makefile.build:254: drivers/net/ethernet/freescale/dpaa2/Makefile: dpmac.o is added to multiple modules: fsl-dpaa2-eth fsl-dpaa2-switch

Change the way that dpaa2 is built by moving the two common files into a
separate module with exported symbols instead.

Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260402184726.3746487-3-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:03:49 -07:00
Arnd Bergmann
df75bd552a net: ethernet: ti-cpsw: fix linking built-in code to modules
There are six variants of the cpsw driver, sharing various parts of
the code: davinci-emac, cpsw, cpsw-switchdev, netcp, netcp_ethss and
am65-cpsw-nuss.

I noticed that this means some files can be linked into more than
one loadable module, or even part of vmlinux but also linked into
a loadable module, both of which mess up assumptions of the build
system and cause warnings:

scripts/Makefile.build:279: cpsw_ale.o is added to multiple modules: ti-am65-cpsw-nuss ti_cpsw ti_cpsw_new
scripts/Makefile.build:279: cpsw_priv.o is added to multiple modules: ti_cpsw ti_cpsw_new
scripts/Makefile.build:279: cpsw_sl.o is added to multiple modules: ti-am65-cpsw-nuss ti_cpsw ti_cpsw_new
scripts/Makefile.build:279: cpsw_ethtool.o is added to multiple modules: ti_cpsw ti_cpsw_new
scripts/Makefile.build:279: davinci_cpdma.o is added to multiple modules: ti_cpsw ti_cpsw_new ti_davinci_emac

Change this back to having separate modules for each portion that
can be linked standalone, exporting symbols as needed:

 - ti-cpsw-common.ko now contains both cpsw-common.o and
   davinci_cpdma.o as they are always used together

 - ti-cpsw-priv.ko contains cpsw_priv.o, cpsw_sl.o and cpsw_ethtool.o,
   which are the core of the cpsw and cpsw-new drivers.

 - ti-cpsw-sl.ko contains the cpsw-sl.o object and is used by
   ti-am65-cpsw-nuss.ko in addition to the two other cpsw variants.

 - ti-cpsw-ale.o is the one standalone module that is used by all
   except davinci_emac.

Each of these will be built-in if any of its users are built-in, otherwise
it's a loadable module if there is at least one module using it. I did
not bring back the separate Kconfig symbols for this, but just handle
it using Makefile logic.

Note: ideally this is something that Kbuild complains about, but usually
we just notice when something using THIS_MODULE misbehaves in a way that
a user notices.

Fixes: 99f6297182 ("net: ethernet: ti: cpsw: drop TI_DAVINCI_CPDMA config option")
Link: https://lore.kernel.org/lkml/20240417084400.3034104-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260402184726.3746487-2-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:03:49 -07:00
Arnd Bergmann
961f3c5356 net: ethernet: ti-cpsw: rename soft_reset() function
While looking at the global symbols shared between the cpsw drivers,
I noticed that soft_reset() is the only one that is missing a proper
namespace prefix, and will pollute the kernel namespace, so rename
it to be consistent with the other symbols.

Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260402184726.3746487-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 19:03:46 -07:00
Jakub Kicinski
e6b7e1a10c eth: remove the driver for acenic / tigon1&2
The entire git history for this driver looks like tree-wide
and automated cleanups. There's even more coming now with
AI, so let's try to delete it instead.

Acked-by: Jes Sorensen <jes@trained-monkey.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260403220501.2263835-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:52:27 -07:00
Kevin Hao
c321b5676d net: macb: Use netif_napi_add_tx() instead of netif_napi_add() for TX NAPI
The TX NAPI should be registered via netif_napi_add_tx() to avoid
unnecessarily polluting the napi_hash table.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20260403-macb-napi-tx-v1-1-08126a60c65e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:51:57 -07:00
Jakub Kicinski
646dbda284 Merge branch 'nfc-support-for-five-qualcomm-sdm845-phones'
David Heidelberg says:

====================
NFC support for five Qualcomm SDM845 phones

- OnePlus 6 / 6T
- Pixel 3 / 3 XL
- SHIFT 6MQ

Verified with NFC card using neard:

systemctl enable --now neard
nfctool --device nfc0 -1
nfctool -d nfc0 -p
gdbus introspect --system --dest org.neard --object-path /org/neard/nfc0/tag0/record0

or use gNFC:
  https://gitlab.gnome.org/dh/gnfc/

successfully detecting and reading a tag.
====================

Link: https://patch.msgid.link/20260403-oneplus-nfc-v3-0-fbdce57d63c1@ixit.cz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:50:51 -07:00
David Heidelberg
e72058a4be dt-bindings: nfc: nxp,nci: Document PN557 compatible
The PN557 uses the same hardware as the PN553 but ships with
firmware compliant with NCI 2.0.

Document PN557 as a compatible device.

Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260403-oneplus-nfc-v3-1-fbdce57d63c1@ixit.cz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:50:46 -07:00
Yue Haibing
2f60df9e61 ip6_tunnel: use generic for_each_ip_tunnel_rcu macro
Remove the locally defined for_each_ip6_tunnel_rcu macro and use
the generic for_each_ip_tunnel_rcu from linux/if_tunnel.h instead.

This eliminates code duplication and ensures consistency across
the kernel tunnel implementations.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260403084619.4107978-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:41:03 -07:00
Jason Xing
8a4e3ab61d net: advance skb_defer_disable_key check in napi_consume_skb
When net.core.skb_defer_max is adjusted to zero, napi_consume_skb()
shouldn't go any deeper into skb_attempt_defer_free() because that adds
an additional pair of local_bh_enable/disable() calls which is evidently
not needed. Checking the static key earlier saves cycles and benefits
the non-defer case.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://patch.msgid.link/20260402034114.65766-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:32:04 -07:00
Jakub Kicinski
1ef05ed263 Merge branch 'net-dsa-mxl862xx-add-support-for-bridge-offloading'
Daniel Golle says:

====================
net: dsa: mxl862xx: add support for bridge offloading

As a next step to complete the mxl862xx DSA driver, add support for
offloading forwarding between bridged ports to the switch hardware.

This works pretty much without any big surprises, apart from two
subtleties:
 * per-port control over flooding behavior has to be implemented by
   (ab)using 0-rate QoS meters as a stopper for lack of any better
   option.
 * STP state transition unconditionally enables learning on a port
   even if it was previously explicitly disabled (a firmware bug)

Note that as the driver is still lacking all VLAN features (which
are going to be added next), at this point some of the
bridge_vlan_aware.sh tests are failing after applying this series.

This is expected and cannot be avoided without implementing
port_vlan_filtering + port_vlan_add/del. And adding both bridge and
VLAN offloading at the same time would be too much for anyone to
review, so VLAN support is going to be submitted in a follow-up
series immediately after this series has been accepted.

All other relevant selftests (including bridge_vlan_unaware.sh) are
still passing.

Inspired by the comments received from Paolo Abeni as reply to v5
the driver now no longer caches bridge port membership in the
driver, but instead imports an existing helper from yt921x.c to dsa.h
in order to allow the driver to easily iterate over bridge members.
The mapping between DSA bridge num and firmware bridge ID is done
using a simple fixed-size array in mxl862xx_priv.
====================

Link: https://patch.msgid.link/cover.1775049897.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:30:35 -07:00
Daniel Golle
340bdf9846 net: dsa: mxl862xx: implement bridge offloading
Implement joining and leaving bridges as well as add, delete and dump
operations on isolated FDBs, port MDB membership management, and
setting a port's STP state.

The switch supports a maximum of 63 bridges, however, up to 12 may
be used as "single-port bridges" to isolate standalone ports.
Allowing up to 48 bridges to be offloaded seems more than enough on
that hardware, hence that is set as max_num_bridges.

A total of 128 bridge ports are supported in the bridge portmap, and
virtual bridge ports have to be used e.g. for link aggregation, hence
potentially exceeding the number of hardware ports.

The firmware-assigned bridge identifier (FID) for each offloaded bridge
is stored in an array used to map DSA bridge num to firmware bridge ID,
avoiding the need for a driver-private bridge tracking structure.
Bridge member portmaps are rebuilt on join/leave using
dsa_switch_for_each_bridge_member().

As there are now more users of the BRIDGEPORT_CONFIG_SET API and the
state of each port is cached locally, introduce a helper function
mxl862xx_set_bridge_port(struct dsa_switch *ds, int port) which
applies the cached per-port state to hardware. For standalone user
ports (dp->bridge == NULL), it additionally resets the port to
single-port bridge state: CPU-only portmap, learning and flooding
disabled. The CPU port path sets its state explicitly before calling
this helper and is therefore not affected by the reset.

Note that MASK_VLAN_BASED_MAC_LEARNING is intentionally absent from
the firmware write mask. After mxl862xx_reset(), the firmware
initialises all VLAN-based MAC learning fields to 0 (disabled), so
SVL is the active mode by default without having to set it explicitly.

Note that there is no convenient way to control flooding on per-port
level, so the driver is using a 0-rate QoS meter setup as a stopper in
lack of any better option. In order to be perfect the firmware-enforced
minimum bucket size is bypassed by directly writing 0s to the relevant
registers -- without that at least one 64-byte packet could still
pass before the meter would change from 'yellow' into 'red' state.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/dd079180e2098e5f9626fcd149b9bad9a1b5a1b2.1775049897.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:30:33 -07:00
Daniel Golle
4250ff1640 dsa: tag_mxl862xx: set dsa_default_offload_fwd_mark()
The MxL862xx offloads bridge forwarding in hardware, so set
dsa_default_offload_fwd_mark() to avoid duplicate forwarding of
(e.g. flooded) frames arriving at the CPU port.

Link-local frames are directly trapped to the CPU port only, so don't
set dsa_default_offload_fwd_mark() on those.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/e1161c90894ddc519c57dc0224b3a0f6bfa1d2d6.1775049897.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:30:33 -07:00
Daniel Golle
f259e08494 net: dsa: add bridge member iteration macro
Drivers that offload bridges need to iterate over the ports that are
members of a given bridge, for example to rebuild per-port forwarding
bitmaps when membership changes. Currently drivers typically open-code
this by combining dsa_switch_for_each_user_port() with a
dsa_port_offloads_bridge_dev() check, or cache bridge membership
within the driver.

Add dsa_switch_for_each_bridge_member() macro to express this pattern
directly, and use it for the existing dsa_bridge_ports() inline
helper.
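
The pattern being factored out can be sketched as a filter over the switch ports followed by a bitmap build. This is a minimal model, not the DSA API: the `Port` class, its fields and both function names are hypothetical.

```python
# Minimal model of "iterate bridge members, then build a port bitmap".
# Port, for_each_bridge_member and bridge_ports are hypothetical names.
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Port:
    index: int
    is_user: bool
    bridge: Optional[str]   # offloaded bridge name, None when standalone


def for_each_bridge_member(ports: List[Port], bridge: str) -> Iterator[Port]:
    """Yield user ports that are members of the given bridge."""
    for port in ports:
        if port.is_user and port.bridge == bridge:
            yield port


def bridge_ports(ports: List[Port], bridge: str) -> int:
    """Bitmap with one bit set per member port of the bridge."""
    mask = 0
    for port in for_each_bridge_member(ports, bridge):
        mask |= 1 << port.index
    return mask


ports = [
    Port(0, True, "br0"),
    Port(1, True, "br0"),
    Port(2, True, "br1"),
    Port(3, True, None),     # standalone user port
    Port(4, False, None),    # CPU port
]
print(bin(bridge_ports(ports, "br0")), bin(bridge_ports(ports, "br1")))
```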

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/e7136aaa26773f39e805a00fe4ecf13cd2b83fc0.1775049897.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:30:33 -07:00
Daniel Golle
b0a79590d1 net: dsa: move dsa_bridge_ports() helper to dsa.h
The yt921x driver contains a helper to create a bitmap of ports
which are members of a bridge.

Move the helper as a static inline function into dsa.h, so other drivers
can make use of it as well.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/4f8bbfce3e4e3a02064fc4dc366263136c6e0383.1775049897.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:30:33 -07:00
Laurence Rowe
98f28d8d6e vsock: avoid timeout for non-blocking accept() with empty backlog
A common pattern in epoll network servers is to eagerly accept all
pending connections from the non-blocking listening socket after
epoll_wait indicates the socket is ready by calling accept in a loop
until EAGAIN is returned indicating that the backlog is empty.

Scheduling a timeout for a non-blocking accept with an empty backlog
meant AF_VSOCK sockets used by epoll network servers incurred hundreds
of microseconds of additional latency per accept loop compared to
AF_INET or AF_UNIX sockets.
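
The accept loop described above looks roughly like the sketch below. It is a simplified illustration: an AF_UNIX socket stands in for AF_VSOCK so that it runs without a hypervisor, the epoll_wait step is omitted, and the helper name is made up.

```python
# Sketch of the "accept until EAGAIN" pattern on a non-blocking listener.
# AF_UNIX stands in for AF_VSOCK here; accept_all is a made-up helper.
import os
import socket
import tempfile


def accept_all(listener):
    """Accept every pending connection; listener must be non-blocking."""
    conns = []
    while True:
        try:
            conn, _ = listener.accept()
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: the backlog is empty, stop looping.
            return conns
        conns.append(conn)


path = os.path.join(tempfile.mkdtemp(), "demo.sock")
listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind(path)
listener.listen(8)
listener.setblocking(False)

clients = []
for _ in range(3):
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    clients.append(client)

first = accept_all(listener)     # drains the three pending connections
second = accept_all(listener)    # backlog already empty

print(len(first), len(second))

for sock in first + clients + [listener]:
    sock.close()
os.unlink(path)
```

Every pass through such a loop ends with one accept() call on an empty backlog, which is the call this change keeps from scheduling a timeout.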

Signed-off-by: Laurence Rowe <laurencerowe@gmail.com>
Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260402204918.130395-1-laurencerowe@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:29:01 -07:00
Daniel Zahka
c8eee00c0f psp: add missing device stats to get-stats reply attributes
Commit f05d26198c ("psp: add stats from psp spec to driver facing
api") added device statistics (rx-packets, rx-bytes, rx-auth-fail,
rx-error, rx-bad, tx-packets, tx-bytes, tx-error) to the stats
attribute-set but did not add them to the get-stats operation reply
attributes. The kernel reports these attributes in the reply, so
list them in the spec to match.

Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260403-psp-yaml-fix-v1-1-dacee0663903@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:12:34 -07:00
Jeremy Kerr
f32ba09631 net: mctp: defer creation of dst after source-address check
Sashiko reports:

> mctp_dst_from_route() increments the device reference count by calling
> mctp_dev_hold(). When a valid route is found and dst is NULL, the
> structure copy is bypassed and rc is set to 0.

Instead of optimistically creating a dst from the final route (then
releasing it if the saddr is invalid), perform the saddr check first.

This means we don't have an unnecessary hold/release on the dev, which
could leak if the dst pointer is NULL. No caller passes a NULL dst at
present though (so the leak is not possible), but this is an intended
use of mctp_dst_from_route().

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260403-dev-mctp-dst-defer-v1-1-9c2c55faf9e9@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:06:47 -07:00
Jeremy Kerr
70e32aadb5 net: mctp: tests: use actual address when creating dev with addr
Sashiko reports:

> This isn't a bug in the core networking code, but the addr parameter
> appears to be ignored here.

In mctp_test_create_dev_with_addr(), we are ignoring the addr argument
and just using `8`. Use the passed address instead.

All invocations use 8 anyway, so no effective change at present.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@verge.net.au>
Link: https://patch.msgid.link/20260403-dev-mctp-fix-test-addr-v1-1-b7fa789cdd9b@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:04:02 -07:00
Jakub Kicinski
3b45559f6c selftests: net: py: color the basics in the output
Sometimes it's hard to spot the ok / not ok lines in the output.
This is especially true for the GRO tests which retry a lot
so there's a wall of non-fatal output printed.

Try to color the crucial lines green / red / yellow when running
in a terminal.
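
A minimal sketch of the idea follows. The mapping of colors to result lines (green for ok, red for not ok, yellow for skipped) is an assumption, as are the function name and the sample test names; the message only states which colors are used.

```python
# Sketch of coloring result lines only when stdout is a terminal.
# The color-to-result mapping and all names here are assumptions.
import sys

GREEN, RED, YELLOW, RESET = "\033[32m", "\033[31m", "\033[33m", "\033[0m"


def colorize(line, use_color):
    """Wrap a result line in an ANSI color; leave other lines untouched."""
    if not use_color:
        return line
    if line.startswith("not ok"):
        color = RED
    elif line.startswith("ok"):
        color = YELLOW if "# SKIP" in line else GREEN
    else:
        return line
    return f"{color}{line}{RESET}"


use_color = sys.stdout.isatty()   # plain text when piped to a file or CI log
for line in ["ok 1 test_a", "not ok 2 test_b", "ok 3 test_c # SKIP no device"]:
    print(colorize(line, use_color))
```

Checking `isatty()` keeps escape sequences out of redirected output, which is why the coloring is limited to terminal runs.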

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260402215444.1589893-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 17:47:59 -07:00
Jakub Kicinski
3741f8fa00 Merge branch 'dpll-add-frequency-monitoring-feature'
Ivan Vecera says:

====================
dpll: add frequency monitoring feature

This series adds support for monitoring the measured input frequency
of DPLL input pins via the DPLL netlink interface.

Some DPLL devices can measure the actual frequency being received on
input pins. The approach mirrors the existing phase-offset-monitor
feature: a device-level attribute (DPLL_A_FREQUENCY_MONITOR) enables
or disables monitoring, and a per-pin attribute
(DPLL_A_PIN_MEASURED_FREQUENCY) exposes the measured frequency in
millihertz (mHz) when monitoring is enabled.
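
Since the value is reported in millihertz, a consumer has to scale it before display. The sketch below assumes a divider of 1000 (millihertz per hertz); the constant and function names are placeholders and the sample reading is made up.

```python
# Convert a millihertz reading into hertz for display.
# The divider value is an assumption (1000 mHz per Hz); names are made up.
MEASURED_FREQUENCY_DIVIDER = 1000


def split_measured_frequency(value_mhz):
    """Return (whole hertz, millihertz remainder) for a mHz reading."""
    return divmod(value_mhz, MEASURED_FREQUENCY_DIVIDER)


hz, frac = split_measured_frequency(10_000_000_250)   # made-up reading
print(f"{hz}.{frac:03d} Hz")   # 10000000.250 Hz
```

Keeping the value as an integer and splitting it with `divmod` avoids the rounding that a floating-point division could introduce.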

Patch 1 adds the new attributes to the DPLL netlink spec (dpll.yaml),
the DPLL_PIN_MEASURED_FREQUENCY_DIVIDER constant, regenerates the
auto-generated UAPI header and netlink policy, and updates
Documentation/driver-api/dpll.rst.

Patch 2 adds the callback operations (freq_monitor_get/set for
devices, measured_freq_get for pins) and the corresponding netlink
GET/SET handlers in the DPLL core. The core only invokes
measured_freq_get when the frequency monitor is enabled on the parent
device. The freq_monitor_get callback is required when measured_freq_get
is provided.

Patch 3 implements the feature in the ZL3073x driver by extracting
a common measurement latch helper from the existing FFO update path,
adding a frequency measurement function, and wiring up the new
callbacks.
====================

Link: https://patch.msgid.link/20260402184057.1890514-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:48:04 -07:00
Ivan Vecera
bfc923b642 dpll: zl3073x: implement frequency monitoring
Extract common measurement latch logic from zl3073x_ref_ffo_update()
into a new zl3073x_ref_freq_meas_latch() helper and add
zl3073x_ref_freq_meas_update() that uses it to latch and read absolute
input reference frequencies in Hz.

Add meas_freq field to struct zl3073x_ref and the corresponding
zl3073x_ref_meas_freq_get() accessor. The measured frequencies are
updated periodically alongside the existing FFO measurements.

Add freq_monitor boolean to struct zl3073x_dpll and implement the
freq_monitor_set/get device callbacks to enable/disable frequency
monitoring via the DPLL netlink interface.

Implement measured_freq_get pin callback for input pins that returns the
measured input frequency in mHz.

Reviewed-by: Petr Oros <poros@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260402184057.1890514-4-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:48:01 -07:00
Ivan Vecera
15ed91aa84 dpll: add frequency monitoring callback ops
Add new callback operations for a dpll device:
- freq_monitor_get(..) - to obtain current state of frequency monitor
  feature from dpll device,
- freq_monitor_set(..) - to allow feature configuration.

Add new callback operation for a dpll pin:
- measured_freq_get(..) - to obtain the measured frequency in mHz.

Obtain the feature state value using the get callback and provide it to
the user if the device driver implements callbacks. The measured_freq_get
pin callback is only invoked when the frequency monitor is enabled.
The freq_monitor_get device callback is required when measured_freq_get
is provided by the driver.

Execute the set callback upon user requests.
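
The two rules above (measured_freq_get is only called while the monitor
is enabled, and it requires freq_monitor_get) can be sketched in plain C.
The structures, signatures and the toy driver below are hypothetical
simplifications, not the DPLL core's actual types:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the device and pin op tables. */
struct dev_ops {
	int (*freq_monitor_get)(void *priv, bool *enabled);
	int (*freq_monitor_set)(void *priv, bool enabled);
};

struct pin_ops {
	int (*measured_freq_get)(void *priv, uint64_t *freq_mhz);
};

/* A pin reporting a measured frequency needs a device reporting the state. */
static int check_ops(const struct dev_ops *dev, const struct pin_ops *pin)
{
	if (pin->measured_freq_get && !dev->freq_monitor_get)
		return -EINVAL;
	return 0;
}

/* Returns 1 if *freq_mhz was filled, 0 if skipped, negative errno on error. */
static int report_measured_freq(const struct dev_ops *dev,
				const struct pin_ops *pin,
				void *priv, uint64_t *freq_mhz)
{
	bool enabled = false;
	int ret;

	if (!pin->measured_freq_get || !dev->freq_monitor_get)
		return 0;
	ret = dev->freq_monitor_get(priv, &enabled);
	if (ret)
		return ret;
	if (!enabled)
		return 0;	/* monitor off: the pin callback is not called */
	ret = pin->measured_freq_get(priv, freq_mhz);
	return ret ? ret : 1;
}

/* Toy driver state and callbacks, used only to exercise the sketch. */
struct toy {
	bool monitor;
	uint64_t freq_mhz;
	int reads;
};

static int toy_monitor_get(void *priv, bool *enabled)
{
	*enabled = ((struct toy *)priv)->monitor;
	return 0;
}

static int toy_monitor_set(void *priv, bool enabled)
{
	((struct toy *)priv)->monitor = enabled;
	return 0;
}

static int toy_measured_freq_get(void *priv, uint64_t *freq_mhz)
{
	struct toy *t = priv;

	t->reads++;
	*freq_mhz = t->freq_mhz;
	return 0;
}
```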

Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260402184057.1890514-3-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:48:01 -07:00
Ivan Vecera
3fdea79c09 dpll: add frequency monitoring to netlink spec
Add DPLL_A_FREQUENCY_MONITOR device attribute to allow control over
the frequency monitor feature. The attribute uses the existing
dpll_feature_state enum (enable/disable) and is present in both
device-get reply and device-set request.

Add DPLL_A_PIN_MEASURED_FREQUENCY pin attribute to expose the measured
input frequency in millihertz (mHz). The attribute is present in the
pin-get reply. Add DPLL_PIN_MEASURED_FREQUENCY_DIVIDER constant to
allow userspace to extract integer and fractional parts.
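
A minimal userspace sketch of that split, assuming the divider is 1000
(which follows from the value being in mHz, but is not quoted from the
UAPI header). The macro and function names are illustrative only:

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed value: the attribute is in mHz, so 1000 units make 1 Hz. */
#define MEASURED_FREQUENCY_DIVIDER 1000ULL

struct freq_parts {
	uint64_t hz;	/* integer part, in Hz */
	uint64_t mhz;	/* fractional part, in millihertz */
};

/* Split a raw measured-frequency value into integer and fractional parts. */
static struct freq_parts split_measured_freq(uint64_t measured)
{
	struct freq_parts p = {
		.hz  = measured / MEASURED_FREQUENCY_DIVIDER,
		.mhz = measured % MEASURED_FREQUENCY_DIVIDER,
	};

	return p;
}

/* Render the value as "<Hz>.<mHz> Hz" with a zero-padded fraction. */
static int format_measured_freq(char *buf, size_t len, uint64_t measured)
{
	struct freq_parts p = split_measured_freq(measured);

	return snprintf(buf, len, "%llu.%03llu Hz",
			(unsigned long long)p.hz,
			(unsigned long long)p.mhz);
}
```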

Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260402184057.1890514-2-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:48:01 -07:00
Yoshihiro Shimoda
353d8e7989 net: ethernet: ravb: Suspend and resume the transmission flow
The current driver does not follow the latest datasheet: it neither
suspends the flow when stopping DMA nor resumes it when starting DMA.
Update the driver to do so.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
[Niklas: Rebase from BSP and reword commit message]
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20260401183608.1852225-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 16:04:28 -07:00
Jakub Kicinski
48a5e77b49 Merge branch 'net-macb-remove-dedicated-irq-handler-for-wol'
Kevin Hao says:

====================
net: macb: Remove dedicated IRQ handler for WoL

During debugging of a suspend/resume issue, I observed that the macb driver
employs a dedicated IRQ handler for Wake-on-LAN (WoL) support. To my knowledge,
no other Ethernet driver adopts this approach. This implementation unnecessarily
complicates the suspend/resume process without providing any clear benefit.
Instead, we can easily modify the existing IRQ handler to manage WoL events,
avoiding any overhead in the TX/RX hot path.

The net throughput shows no significant difference following these changes.
The following data (net throughput and execution time of macb_interrupt) were
collected from my AMD ZynqMP board using the commands:
  taskset -c 1,2,3 iperf3 -c 192.168.3.4 -t 60 -Z -P 3 -R
  cat /sys/kernel/debug/tracing/trace_stat/function0

Before:
-------
  [SUM]   0.00-60.00  sec  5.99 GBytes   858 Mbits/sec    0             sender
  [SUM]   0.00-60.00  sec  5.99 GBytes   857 Mbits/sec                  receiver

  Function                               Hit    Time            Avg             s^2
  --------                               ---    ----            ---             ---
  macb_interrupt                      217996    678425.2 us     3.112 us        1.446 us

After:
------
  [SUM]   0.00-60.00  sec  6.00 GBytes   858 Mbits/sec    0             sender
  [SUM]   0.00-60.00  sec  5.99 GBytes   857 Mbits/sec                  receiver

  Function                               Hit    Time            Avg             s^2
  --------                               ---    ----            ---             ---
  macb_interrupt                      218212    668107.3 us     3.061 us        1.413 us
====================

Link: https://patch.msgid.link/20260402-macb-irq-v2-0-942d98ab1154@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:51:07 -07:00
Kevin Hao
6637c03f35 net: macb: Remove dedicated IRQ handler for WoL
In the current implementation, the suspend/resume path frees the
existing IRQ handler and sets up a dedicated WoL IRQ handler, then
restores the original handler upon resume. This approach is not used
by any other Ethernet driver and unnecessarily complicates the
suspend/resume process. After adjusting the IRQ handler in the previous
patches, we can now handle WoL interrupts without introducing any
overhead in the TX/RX hot path. Therefore, the dedicated WoL IRQ
handler is removed.

I have verified WoL functionality on my AMD ZynqMP board using the
following steps:
  root@amd-zynqmp:~# ifconfig end0 192.168.3.3
  root@amd-zynqmp:~# ethtool -s end0 wol a
  root@amd-zynqmp:~# echo mem >/sys/power/state

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20260402-macb-irq-v2-4-942d98ab1154@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:51:05 -07:00
Kevin Hao
6d55ce805b net: macb: Factor out the handling of non-hot IRQ events into a separate function
In the current code, the IRQ handler checks each IRQ event sequentially.
Since most IRQ events are related to TX/RX operations, while other
events occur infrequently, this approach introduces unnecessary overhead
in the hot path for TX/RX processing. This patch reduces such overhead
by extracting the handling of all non-TX/RX events into a new function
and consolidating these events under a new flag. As a result, only a
single check is required to determine whether any non-TX/RX events have
occurred. If such events exist, the handler jumps to the new function.
This optimization reduces four conditional checks to one and prevents
the instruction cache from being polluted with rarely used code in the
hot path.
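
The shape of this change can be sketched outside the driver as follows.
The event bits, the counters and the function names are hypothetical and
do not reflect the macb register layout:

```c
#include <stdint.h>

/* Hypothetical event bits; the real driver has its own register layout. */
#define EV_RX		(1u << 0)
#define EV_TX		(1u << 1)
#define EV_OVERRUN	(1u << 2)
#define EV_BUS_ERR	(1u << 3)
#define EV_LINK		(1u << 4)
#define EV_WAKE		(1u << 5)

/* All infrequent events consolidated under one mask. */
#define EV_RARE_MASK	(EV_OVERRUN | EV_BUS_ERR | EV_LINK | EV_WAKE)

struct irq_counts {
	unsigned int rx, tx, rare;
};

/* Cold path: every infrequent event is handled here, out of line. */
static void handle_rare_events(uint32_t status, struct irq_counts *c)
{
	if (status & EV_OVERRUN)
		c->rare++;
	if (status & EV_BUS_ERR)
		c->rare++;
	if (status & EV_LINK)
		c->rare++;
	if (status & EV_WAKE)
		c->rare++;
}

/* Hot path: RX/TX checks plus a single test for anything infrequent. */
static void irq_handler(uint32_t status, struct irq_counts *c)
{
	if (status & EV_RX)
		c->rx++;
	if (status & EV_TX)
		c->tx++;
	if (status & EV_RARE_MASK)
		handle_rare_events(status, c);
}
```

In the common case where only RX/TX bits are set, the handler pays for
one extra mask test instead of four separate ones.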

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20260402-macb-irq-v2-3-942d98ab1154@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:51:04 -07:00
Kevin Hao
5986ff6e41 net: macb: Introduce macb_queue_isr_clear() helper function
The current implementation includes several occurrences of the
following pattern:
	if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
		queue_writel(queue, ISR, value);

Introduce a helper function to consolidate these repeated code
segments. No functional changes are made.
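
A self-contained sketch of such a helper, using stand-in types instead of
the driver's structures. The fake_* names, the capability bit value and
the write-one-to-clear register model are assumptions made for
illustration:

```c
#include <stdint.h>

/* Stand-in capability flag; the value is hypothetical. */
#define CAPS_ISR_CLEAR_ON_WRITE	(1u << 0)

/* Simplified stand-ins for the driver's device and queue structures. */
struct fake_dev {
	uint32_t caps;
};

struct fake_queue {
	struct fake_dev *bp;
	uint32_t isr;	/* models the interrupt status register */
};

/* Stand-in for a register write, modelled here as write-one-to-clear. */
static void fake_isr_write(struct fake_queue *queue, uint32_t value)
{
	queue->isr &= ~value;
}

/* Acknowledge bits only on hardware that needs an explicit ISR write. */
static void queue_isr_clear(struct fake_queue *queue, uint32_t value)
{
	if (queue->bp->caps & CAPS_ISR_CLEAR_ON_WRITE)
		fake_isr_write(queue, value);
}
```

Each call site then shrinks from the two-line pattern to a single call.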

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20260402-macb-irq-v2-2-942d98ab1154@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:51:04 -07:00
Kevin Hao
dc3bd465ea net: macb: Replace open-coded implementation with napi_schedule()
The driver currently duplicates the logic of napi_schedule() primarily
to include additional debug information. However, these debug details
are not essential for a specific driver and can be effectively obtained
through existing tracepoints in the networking core, such as
/sys/kernel/tracing/events/napi/napi_poll. Therefore, this patch
replaces the open-coded implementation with napi_schedule() to
simplify the driver's code.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20260402-macb-irq-v2-1-942d98ab1154@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:51:04 -07:00
Qingfang Deng
779fae61a3 ppp: update Kconfig help message
Both links of the PPPoE section are no longer valid, and the CVS version
is no longer relevant.

- Replace the TLDP URL with the pppd project homepage.
- Update pppd version requirement for PPPoE.
- Update RP-PPPoE project homepage, and clarify that it's only needed
  for server mode.

Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev>
Reviewed-by: Julian Braha <julianbraha@gmail.com>
Link: https://patch.msgid.link/20260402050053.144250-1-qingfang.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:35:13 -07:00
Eric Dumazet
1666d945b5 inet: remove leftover EXPORT_SYMBOL()
IPv6 is no longer a module, so we no longer need to export these symbols.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260402174430.2462800-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:07:46 -07:00
Jakub Kicinski
071fe8b5d5 Merge branch 'selftests-drv-net-gro-more-test-cases'
Jakub Kicinski says:

====================
selftests: drv-net: gro: more test cases

Add a few more test cases for GRO.

First 4 patches are unchanged from v1.

Patches 5 and 6 are new. Willem pointed out that the defines are
duplicated and all these imprecise defines have been annoying me
for a while so I decided to clean them up.

With the defines cleaned up and now more precise patch 7 (was 5)
no longer has to play any games with the MTU for ip6ip6.

The last patch now sends 3 segments as requested.
====================

Link: https://patch.msgid.link/20260402210000.1512696-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:48 -07:00
Jakub Kicinski
764d0833e7 selftests: drv-net: gro: add a test for bad IPv4 csum
We have a test for coalescing with bad TCP checksum, let's also
test bad IPv4 header checksum.
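
For reference, the IPv4 header checksum is the RFC 1071 one's-complement
sum, so a deliberately bad header is easy to build by corrupting the
checksum bytes. The sketch below is not the selftest's code; it assumes a
20-byte header without options and uses illustrative function names:

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 checksum over a byte buffer read as big-endian 16-bit words. */
static uint16_t inet_csum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Store the correct checksum in bytes 10-11 of a 20-byte IPv4 header. */
static void ipv4_set_csum(uint8_t hdr[20])
{
	uint16_t csum;

	hdr[10] = 0;
	hdr[11] = 0;
	csum = inet_csum(hdr, 20);
	hdr[10] = csum >> 8;
	hdr[11] = csum & 0xff;
}

/* A header is valid when the sum over it, checksum included, is zero. */
static int ipv4_csum_ok(const uint8_t hdr[20])
{
	return inet_csum(hdr, 20) == 0;
}
```

A receiver that validates the header should refuse to coalesce a segment
for which ipv4_csum_ok() is false.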

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:45 -07:00
Jakub Kicinski
9a84a4047d selftests: drv-net: gro: test ip6ip6
We explicitly test ipip encap. Let's add ip6ip6, too. Having
just ipip seems like favoring IPv4 which we should not do :)
Testing all combinations is left for future work, not sure
it's actually worth it.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:44 -07:00
Jakub Kicinski
024597cc20 selftests: drv-net: gro: make large packet math more precise
When constructing the packets for large_* test cases we use
a static value for packet count and MSS. It works okay for
ipv4 vs ipv6 but the gap between ipv4 and ip6ip6 is going to
be quite significant.

Make the defines calculate the worst-case values; those
are only used for sizing stack arrays. Create helpers for
calculating precise values based on the exact test case.
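
The split between worst-case defines and precise helpers might look
roughly like the sketch below. Every name and constant here (the MTU, the
payload budget, the encap enum) is hypothetical and not taken from the
test source:

```c
#include <stddef.h>

#define IPV4_HDR_LEN	20
#define IPV6_HDR_LEN	40
#define TCP_HDR_LEN	20

/* Hypothetical test parameters. */
#define TEST_MTU	1500
#define TOTAL_PAYLOAD	65535

/* Worst case: the largest headers (ip6ip6) give the smallest MSS and
 * therefore the most packets. Only meant for sizing stack arrays. */
#define WORST_MSS	(TEST_MTU - 2 * IPV6_HDR_LEN - TCP_HDR_LEN)
#define WORST_PKT_COUNT	((TOTAL_PAYLOAD + WORST_MSS - 1) / WORST_MSS)

enum encap { ENCAP_IPV4, ENCAP_IPV6, ENCAP_IP6IP6 };

/* Total L3 header bytes in front of the TCP header for a given case. */
static size_t l3_hdr_len(enum encap e)
{
	switch (e) {
	case ENCAP_IPV4:
		return IPV4_HDR_LEN;
	case ENCAP_IPV6:
		return IPV6_HDR_LEN;
	case ENCAP_IP6IP6:
		return 2 * IPV6_HDR_LEN;
	}
	return 0;
}

/* Precise MSS for the exact test case. */
static size_t precise_mss(size_t mtu, enum encap e)
{
	return mtu - l3_hdr_len(e) - TCP_HDR_LEN;
}

/* Precise number of segments needed to carry total bytes. */
static size_t precise_pkt_count(size_t total, size_t mss)
{
	return (total + mss - 1) / mss;
}
```

Arrays sized with WORST_PKT_COUNT stay large enough for every case, while
the packets themselves are built from the precise per-case values.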

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:44 -07:00
Jakub Kicinski
166b0cc6df selftests: drv-net: gro: remove TOTAL_HDR_LEN
Willem points out TOTAL_HDR_LEN is identical to MAX_HDR_LEN.
This seems to have been the case ever since the test was added.
Replace the uses of TOTAL_HDR_LEN with MAX_HDR_LEN; MAX seems to be
the more common name for this kind of value.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:44 -07:00
Jakub Kicinski
5469b695f2 selftests: drv-net: gro: prepare for ip6ip6 support
Try to use already calculated offsets and not depend on the ipip
flag as much. This patch should not change any functionality,
it's just a cleanup to make ip6ip6 support easier.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:43 -07:00