1351326 Commits

Author SHA1 Message Date
Christian Marangi
6a325aed13 net: phy: mediatek: add Airoha PHY ID to SoC driver
Airoha AN7581 SoCs ship with a switch based on the MT753x switch embedded
in other SoCs such as the MT7581 and the MT7988. Like those, they
require configuring some pins to enable LED PHYs.

Add support for the PHY ID of the Airoha embedded switch and define a
simple probe function to toggle these pins. Also fill in the LED functions
and add a dedicated function to define LED polarity.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://patch.msgid.link/20250410100410.348-2-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 20:18:31 -07:00
Christian Marangi
e5566162af net: phy: mediatek: permit to compile test GE SOC PHY driver
When commit 462a3daad6 ("net: phy: mediatek: fix compile-test
dependencies") fixed the dependency, it should have also added
an OR with COMPILE_TEST to permit this driver to be compile-tested even if
NVMEM_MTK_EFUSE wasn't selected. The driver uses NVMEM APIs that
are always compiled (returning an error when unavailable), so the driver
can actually be compiled even without that config.

Fix and simplify the dependency condition of this kernel config.

Fixes: 462a3daad6 ("net: phy: mediatek: fix compile-test dependencies")
Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250410100410.348-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 20:18:31 -07:00
Jakub Kicinski
da1cd04bf1 Merge branch 'add-l2-hw-acceleration-for-airoha_eth-driver'
Lorenzo Bianconi says:

====================
Add L2 hw acceleration for airoha_eth driver

Introduce the capability to offload L2 traffic defining flower rules in
the PSE/PPE engine available on EN7581 SoC.
Since the hw always reports L2/L3/L4 flower rules, link all L2 rules
sharing the same L2 info (with different L3/L4 info) in the L2 subflows
list of a given L2 PPE entry.

v1: https://lore.kernel.org/20250407-airoha-flowtable-l2b-v1-0-18777778e568@kernel.org
====================

Link: https://patch.msgid.link/20250409-airoha-flowtable-l2b-v2-0-4a1e3935ea92@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 20:15:38 -07:00
Lorenzo Bianconi
cd53f62261 net: airoha: Add L2 hw acceleration support
Similar to mtk driver, introduce the capability to offload L2 traffic
defining flower rules in the PSE/PPE engine available on EN7581 SoC.
Since the hw always reports L2/L3/L4 flower rules, link all L2 rules
sharing the same L2 info (with different L3/L4 info) in the L2 subflows
list of a given L2 PPE entry.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://patch.msgid.link/20250409-airoha-flowtable-l2b-v2-2-4a1e3935ea92@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 20:15:36 -07:00
Lorenzo Bianconi
b4916f6790 net: airoha: Add l2_flows rhashtable
Introduce l2_flows rhashtable in airoha_ppe struct in order to
store L2 flows committed by upper layers of the kernel. This is a
preliminary patch in order to offload L2 traffic rules.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://patch.msgid.link/20250409-airoha-flowtable-l2b-v2-1-4a1e3935ea92@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 20:15:36 -07:00
Jakub Kicinski
8bb3212be4 Merge branch 'net-retire-dccp-socket'
Kuniyuki Iwashima says:

====================
net: Retire DCCP socket.

As announced by commit b144fcaf46 ("dccp: Print deprecation
notice."), it's time to remove DCCP socket.

Patch 2 removes net/dccp, LSM code, docs, etc., leaving the
DCCP netfilter modules.

Patch 3 unexports shared functions for DCCP, and patch 4
renames tcp_or_dccp_get_hashinfo() to tcp_get_hashinfo().

We can do more cleanup; for example, remove IPPROTO_TCP checks in
__inet6?_check_established(), remove __module_get() for twsk,
remove timewait_sock_ops.twsk_destructor(), etc, but it will be
more of TCP stuff, so I'll defer to a later series.

v2: https://lore.kernel.org/20250409003014.19697-1-kuniyu@amazon.com
v1: https://lore.kernel.org/20250407231823.95927-1-kuniyu@amazon.com
====================

Link: https://patch.msgid.link/20250410023921.11307-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:14 -07:00
Kuniyuki Iwashima
235bd9d21f tcp: Rename tcp_or_dccp_get_hashinfo().
DCCP was removed, so tcp_or_dccp_get_hashinfo() should be renamed.

Let's rename it to tcp_get_hashinfo().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250410023921.11307-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:11 -07:00
Kuniyuki Iwashima
22d6c9eebf net: Unexport shared functions for DCCP.
DCCP was removed, so many inet functions no longer need to
be exported.

Let's unexport or use EXPORT_IPV6_MOD() for such functions.

sk_free_unlock_clone() is inlined in sk_clone_lock() as it's
the only caller.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250410023921.11307-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:11 -07:00
Kuniyuki Iwashima
2a63dd0edf net: Retire DCCP socket.
DCCP was orphaned in 2021 by commit 054c4610bd ("MAINTAINERS: dccp:
move Gerrit Renker to CREDITS"), which noted that the last maintainer
had been inactive for five years.

In recent years, it has become a playground for syzbot, and most changes
to DCCP have been odd bug fixes triggered by syzbot.  Apart from that,
the only changes have been driven by treewide or networking API updates
or adjustments related to TCP.

Thus, in 2023, we announced we would remove DCCP in 2025 via commit
b144fcaf46 ("dccp: Print deprecation notice.").

Since then, only one individual has contacted the netdev mailing list. [0]

There is ongoing research for Multipath DCCP.  The repository is hosted
on GitHub [1], and development is not taking place through the upstream
community.  While the repository is published under the GPLv2 license,
the scheduling part remains proprietary, with a LICENSE file [2] stating:

  "This is not Open Source software."

The researcher mentioned a plan to address the licensing issue, upstream
the patches, and step up as a maintainer, but there has been no further
communication since then.

Maintaining DCCP for a decade without any real users has become a burden.

Therefore, it's time to remove it.

Removing DCCP will also provide significant benefits to TCP.  It allows
us to freely reorganize the layout of struct inet_connection_sock, which
is currently shared with DCCP, and optimize it to reduce the number of
cachelines accessed in the TCP fast path.

Note that we keep DCCP netfilter modules as requested.  [3]

Link: https://lore.kernel.org/netdev/20230710182253.81446-1-kuniyu@amazon.com/T/#u #[0]
Link: https://github.com/telekom/mp-dccp #[1]
Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2]
Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com> (LSM and SELinux)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Link: https://patch.msgid.link/20250410023921.11307-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:10 -07:00
Kuniyuki Iwashima
b2bdce7adc selftest: net: Remove DCCP bits.
We will remove DCCP.

Let's remove DCCP bits from selftest.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250410023921.11307-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:10 -07:00
Lucien.Jheng
ba5560e53d net: phy: air_en8811h: Add clk provider for CKO pin
EN8811H outputs a 25MHz or 50MHz clock on CKO, selected by GPIO3.
The CKO clock operates continuously from power-up through md32 loading.
Implement a clk provider driver so we can disable the clock output in case
it isn't needed, which also helps to reduce EMF noise.

Signed-off-by: Lucien.Jheng <lucienx123@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250409150902.3596-1-lucienx123@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:35:04 -07:00
Zijun Hu
faeefc173b sock: Correct error checking condition for (assign|release)_proto_idx()
(assign|release)_proto_idx() wrongly check for find_first_zero_bit() failure
with the condition '(prot->inuse_idx == PROTO_INUSE_NR - 1)'.

Fix by correcting the condition to '(prot->inuse_idx == PROTO_INUSE_NR)'.

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250410-fix_net-v2-1-d69e7c5739a4@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 16:32:40 -07:00
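
The off-by-one described in faeefc173b can be reproduced in a small userspace sketch. It assumes the bitmap search returns the bitmap size when no zero bit exists; toy_find_first_zero_bit() and TOY_INUSE_NR are hypothetical stand-ins, not the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INUSE_NR 64 /* hypothetical stand-in for PROTO_INUSE_NR */

/* Return the index of the first clear bit, or nbits when every bit is set. */
static size_t toy_find_first_zero_bit(uint64_t map, size_t nbits)
{
	for (size_t i = 0; i < nbits; i++)
		if (!(map & (UINT64_C(1) << i)))
			return i;
	return nbits;
}

/* Old check: mistakes the last valid index for a failure. */
static bool toy_alloc_failed_old(size_t idx)
{
	return idx == TOY_INUSE_NR - 1;
}

/* Fixed check: only the "no zero bit found" return value is a failure. */
static bool toy_alloc_failed_new(size_t idx)
{
	return idx == TOY_INUSE_NR;
}
```

With a full bitmap the search returns TOY_INUSE_NR, which only the fixed check flags; with just the last slot free it returns TOY_INUSE_NR - 1, which the old check wrongly rejects.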
Russell King (Oracle)
61499764e5 net: stmmac: stm32: simplify clock handling
Some stm32 implementations need the receive clock running in suspend,
as indicated by dwmac->ops->clk_rx_enable_in_suspend. The existing
code achieved this in a rather complex way, by passing a flag around.

However, the clk API prepare/enables are counted - which means that a
clock won't be stopped as long as there are more prepare and enables
than disables and unprepares, just like a reference count.

Therefore, we can simplify this logic by calling clk_prepare_enable()
an additional time in the probe function if this flag is set, and then
balancing that at remove time.

With this, we can avoid passing a "are we suspending" and "are we
resuming" flag to various functions in the driver.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-04-11 09:38:49 +01:00
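
The counting argument in 61499764e5 can be illustrated with a toy model. This is only a sketch: struct toy_clk and the toy_* helpers are hypothetical, and the real clk API tracks prepare and enable counts separately.

```c
#include <stdbool.h>

/* Hypothetical toy clock with a single usage count. */
struct toy_clk {
	unsigned int enable_count;
};

static void toy_clk_enable(struct toy_clk *clk)
{
	clk->enable_count++;
}

static void toy_clk_disable(struct toy_clk *clk)
{
	if (clk->enable_count > 0)
		clk->enable_count--;
}

/* The clock keeps running while at least one enable is outstanding. */
static bool toy_clk_running(const struct toy_clk *clk)
{
	return clk->enable_count > 0;
}

/* Probe: one enable for normal use, plus one more if the receive clock
 * must keep running in suspend. */
static void toy_probe(struct toy_clk *rx, bool rx_enable_in_suspend)
{
	toy_clk_enable(rx);
	if (rx_enable_in_suspend)
		toy_clk_enable(rx);
}

/* Suspend drops only the normal reference; no "are we suspending"
 * flag has to be passed around. */
static void toy_suspend(struct toy_clk *rx)
{
	toy_clk_disable(rx);
}
```

In this model the extra enable taken at probe time is what keeps the clock alive across toy_suspend(); a matching extra disable at remove time would balance it.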
Heiner Kallweit
0c49baf099 r8169: add helper rtl8125_phy_param
The integrated PHYs of RTL8125/8126 have their own mechanism to access
PHY parameters, similar to what r8168g_phy_param does on earlier PHY
versions. Add helper rtl8125_phy_param to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/847b7356-12d6-441b-ade9-4b6e1539b84a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:18:11 -07:00
Heiner Kallweit
8c40d99e5f r8169: add helper rtl_csi_mod for accessing extended config space
Add a helper for the Realtek-specific mechanism for accessing extended
config space if native access isn't possible.
This avoids code duplication.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/b368fd91-57d7-4cb5-9342-98b4d8fe9aea@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:16:43 -07:00
Zhengchao Shao
3b4f78f9ad ipv4: remove unnecessary judgment in ip_route_output_key_hash_rcu
In the ip_route_output_key_hash_rcu function, the input fl4 member saddr is
first checked to be non-zero before entering the multicast, broadcast and
any-address checks. However, the fact that the IP address is not
0 has already ruled out the any-address case, so remove the
unnecessary check.

Signed-off-by: Zhengchao Shao <shaozhengchao@163.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250409033321.108244-1-shaozhengchao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:15:26 -07:00
Jakub Kicinski
dd4f33b471 Merge branch 'tools-ynl-c-basic-netlink-raw-support'
Jakub Kicinski says:

====================
tools: ynl: c: basic netlink-raw support

Basic support for netlink-raw AKA classic netlink in user space C codegen.
This series is enough to read routes and addresses from the kernel
(see the samples in patches 12 and 13).

Specs need to be slightly adjusted and decorated with the c naming info.

In terms of codegen this series includes just the basic plumbing required
to skip genlmsghdr and handle request types which may technically also
be legal in genetlink-legacy but are very uncommon there.

Subsequent series will add support for:
 - handling CRUD-style notifications
 - code gen for array types classic netlink uses
 - sub-message support

v1: https://lore.kernel.org/20250409000400.492371-1-kuba@kernel.org
====================

Link: https://patch.msgid.link/20250410014658.782120-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:44 -07:00
Jakub Kicinski
54d790856c tools: ynl: generate code for rt-route and add a sample
YNL C can now generate code for simple classic netlink families.
Include rt-route in the Makefile for generation and add a sample.

    $ ./tools/net/ynl/samples/rt-route
    oif: wlp0s20f3        gateway: 192.168.1.1
    oif: wlp0s20f3        dst: 192.168.1.0/24
    oif: vpn0             dst: fe80::/64
    oif: wlp0s20f3        dst: fe80::/64
    oif: wlp0s20f3        gateway: fe80::200:5eff:fe00:201

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-14-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:42 -07:00
Jakub Kicinski
29d34a4d78 tools: ynl: generate code for rt-addr and add a sample
YNL C can now generate code for simple classic netlink families.
Include rt-addr in the Makefile for generation and add a sample.

  $ ./tools/net/ynl/samples/rt-addr
              lo: 127.0.0.1
       wlp0s20f3: 192.168.1.101
              lo: ::
       wlp0s20f3: fe80::6385:be6:746e:8116
            vpn0: fe80::3597:d353:b5a7:66dd

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-13-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
Jakub Kicinski
882e7b1365 tools: ynl-gen: use family c-name in notifications
Family names may include dashes. Fix notification handling
code gen to use the C-compatible name.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-12-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
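
A dash is not valid in a C identifier, which is why 882e7b1365 needs a C-compatible form of the family name. The sketch below assumes the mapping is a plain dash-to-underscore substitution; toy_c_name() is a hypothetical helper and not part of the generator.

```c
#include <stddef.h>

/* Copy name into out, replacing '-' with '_'.
 * Truncates if needed and always NUL-terminates when outlen > 0. */
static void toy_c_name(const char *name, char *out, size_t outlen)
{
	size_t i = 0;

	if (outlen == 0)
		return;
	for (; name[i] != '\0' && i + 1 < outlen; i++)
		out[i] = (name[i] == '-') ? '_' : name[i];
	out[i] = '\0';
}
```

Under that assumption a family called rt-route would yield identifiers built from rt_route.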
Jakub Kicinski
e8025e72aa tools: ynl-gen: consider dump ops without a do "type-consistent"
If the type for the response to do and dump are the same we don't
generate it twice. This is called "type_consistent" in the generator.
Consider operations which only have dump to also be consistent.
This removes unnecessary "_dump" from the names. There's a number
of GET ops in classic Netlink which only have dump handlers.

Make sure we output the "onesided" types. Normally, if the type
is consistent, we only output it when we render the do structures.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250410014658.782120-11-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
Jakub Kicinski
7e8ba0c7de tools: ynl: don't use genlmsghdr in classic netlink
Make sure the codegen calls the right YNL lib helper to start
the request based on family type. Classic netlink request must
not include the genl header.

Conversely don't expect genl headers in the responses.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
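
The header difference behind 7e8ba0c7de can be sketched as an offset calculation. The constants below mirror the usual uAPI sizes (16 bytes for struct nlmsghdr, 4 bytes for struct genlmsghdr) but are defined locally as assumptions; toy_attrs_offset() is a hypothetical helper, not a YNL library function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sizes, mirroring struct nlmsghdr and struct genlmsghdr. */
#define TOY_NLMSG_HDRLEN ((size_t)16)
#define TOY_GENL_HDRLEN  ((size_t)4)

/* Netlink pads headers and attributes to 4-byte boundaries. */
#define TOY_ALIGN4(len) (((len) + (size_t)3) & ~(size_t)3)

/* Offset of the first attribute: netlink header, then the genl header
 * (generic netlink only), then the family's fixed header, if any. */
static size_t toy_attrs_offset(bool is_classic, size_t fixed_hdr_len)
{
	size_t off = TOY_NLMSG_HDRLEN;

	if (!is_classic)
		off += TOY_GENL_HDRLEN;
	return off + TOY_ALIGN4(fixed_hdr_len);
}
```

For a classic family with a 12-byte fixed header the attributes would start at offset 28; a generic netlink family without a fixed header starts them at 20.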
Jakub Kicinski
e0a7903c32 tools: ynl-gen: don't consider requests with fixed hdr empty
C codegen skips generating the structs if request/reply has no attrs.
In such cases the request op takes no argument and returns int
(rather than response struct). In case of classic netlink a lot of
information gets passed using the fixed struct, however, so adjust
the logic to consider a request empty only if it has no attrs _and_
no fixed struct.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250410014658.782120-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
Jakub Kicinski
17b3ce292d tools: ynl: support creating non-genl sockets
Classic netlink has static family IDs specified in YAML,
there is no family name -> ID lookup. Support providing
the ID info to the library via the generated struct and
make the library use it. Since NETLINK_ROUTE is ID 0 we need
an extra boolean to indicate classic_id is to be used.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
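
Why 17b3ce292d needs a separate boolean can be shown with a toy descriptor. The struct and field names here are illustrative assumptions rather than the library's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical family descriptor. */
struct toy_family {
	const char *name;
	bool is_classic;     /* true: use classic_id, skip the name lookup */
	uint16_t classic_id; /* static protocol number; 0 is NETLINK_ROUTE */
};

/* classic_id alone cannot say whether a lookup is needed, because 0 is
 * a valid protocol number. The flag carries that information. */
static bool toy_needs_lookup(const struct toy_family *fam)
{
	return !fam->is_classic;
}
```

A zero classic_id therefore means "NETLINK_ROUTE" for a classic family and "unused" for a generic netlink one; only the flag tells them apart.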
Jakub Kicinski
1652e1f35d netlink: specs: rt-route: add C naming info
Add properties needed for C codegen to match names with uAPI headers.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
Jakub Kicinski
52d062362c netlink: specs: rt-addr: add C naming info
Add properties needed for C codegen to match names with uAPI headers.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:41 -07:00
Jakub Kicinski
295ff1e952 netlink: specs: rt-route: remove the fixed members from attrs
The purpose of the attribute list is to list the attributes
which will be included in a given message to shrink the objects
for families with huge attr spaces. Fixed headers are always
present in their entirety (between netlink header and the attrs)
so there's no point in listing their members. Current C codegen
doesn't expect them and tries to look them up in the attribute space.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:40 -07:00
Jakub Kicinski
d460016e7b netlink: specs: rt-addr: remove the fixed members from attrs
The purpose of the attribute list is to list the attributes
which will be included in a given message to shrink the objects
for families with huge attr spaces. Fixed headers are always
present in their entirety (between netlink header and the attrs)
so there's no point in listing their members. Current C codegen
doesn't expect them and tries to look them up in the attribute space.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:40 -07:00
Jakub Kicinski
97a33caa90 netlink: specs: rt-route: specify fixed-header at operations level
The C codegen currently stores the fixed-header as part of family
info, so it only supports one fixed-header type per spec. Luckily
all rtm route messages have the same fixed header, so just move it up
to the higher level.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:40 -07:00
Jakub Kicinski
cd5e64fb95 netlink: specs: rename rtnetlink specs in accordance with family name
The rtnetlink family names are set to rt-$name within the YAML
but the files are called rt_$name. C codegen assumes that the
generated file name will match the family. The use of dashes
is in line with our general expectation that name properties
in the spec use dashes not underscores (even though, as Donald
points out most genl families use underscores in the name).

We have 3 un-ideal options to choose from:

 - accept the slight inconsistency with old families using _, or
 - accept the slight annoyance with all languages having to do s/-/_/
   when looking up family ID, or
 - accept the inconsistency with all name properties in new YAML spec
   being separated with - and just the family name always using _.

Pick option 1 and rename the rtnl spec files.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250410014658.782120-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 20:14:40 -07:00
Krzysztof Hałasa
4145f00227 usbnet: asix AX88772: leave the carrier control to phylink
An ASIX AX88772B based USB 10/100 Ethernet adapter doesn't come
up ("carrier off"), despite the built-in 100BASE-FX PHY's positive link
indication. The internal PHY is configured (using EEPROM) in fixed
100 Mbps full duplex mode.

The primary problem appears to be using netif_carrier_{on,off}() while,
at the same time, delegating carrier management to phylink. Use only the
latter and remove "manual control" in the asix driver.

I don't have any other AX88772 board here, but the problem doesn't seem
specific to a particular board or settings - it's probably
timing-dependent.

Remove unused asix_adjust_link() as well.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/m3plhmdfte.fsf_-_@t19.piap.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:42:50 -07:00
Jakub Kicinski
8127837aae Merge branch 'trace-add-tracepoint-for-tcp_sendmsg_locked'
Breno Leitao says:

====================
trace: add tracepoint for tcp_sendmsg_locked()

Meta has been using BPF programs to monitor tcp_sendmsg() for years,
indicating significant interest in observing this important
functionality. Adding a proper tracepoint provides a stable API for all
users who need visibility into TCP message transmission.

David Ahern is using similar functionality with a custom patch[1]. So,
this means we have more than a single use case for this request, and it
might be a good idea to have such a feature upstream.

Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1]

v2: https://lore.kernel.org/20250407-tcpsendmsg-v2-0-9f0ea843ef99@debian.org
v1: https://lore.kernel.org/20250224-tcpsendmsg-v1-1-bac043c59cc8@debian.org
====================

Link: https://patch.msgid.link/20250408-tcpsendmsg-v3-0-208b87064c28@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:34:09 -07:00
Breno Leitao
0f08335ade trace: tcp: Add tracepoint for tcp_sendmsg_locked()
Add a tracepoint to monitor TCP send operations, enabling detailed
visibility into TCP message transmission.

Create a new tracepoint within the tcp_sendmsg_locked function,
capturing traditional fields along with size_goal, which indicates the
optimal data size for a single TCP segment. Additionally, a reference to
the struct sock sk is passed, allowing direct access for BPF programs.
The implementation is largely based on David's patch[1] and suggestions.

Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1]
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250408-tcpsendmsg-v3-2-208b87064c28@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:34:05 -07:00
Breno Leitao
b1e9049995 net: pass const to msg_data_left()
The msg_data_left() function doesn't modify the struct msghdr parameter,
so mark it as const. This allows the function to be used with const
references, improving type safety and making the API more flexible.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250408-tcpsendmsg-v3-1-208b87064c28@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:34:05 -07:00
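
The effect of the const qualifier in b1e9049995 can be seen in a reduced example. struct toy_msghdr and both functions are hypothetical stand-ins; the real struct msghdr tracks its remaining data differently.

```c
#include <stddef.h>

/* Hypothetical stand-in holding only the remaining byte count. */
struct toy_msghdr {
	size_t bytes_left;
};

/* Read-only accessor: the const pointer documents that the message is
 * not modified and lets const callers use it without a cast. */
static size_t toy_msg_data_left(const struct toy_msghdr *msg)
{
	return msg->bytes_left;
}

/* A caller that itself only receives a const pointer. */
static int toy_has_data(const struct toy_msghdr *msg)
{
	return toy_msg_data_left(msg) > 0;
}
```

If toy_msg_data_left() took a non-const pointer, toy_has_data() would need a cast or would have to drop const from its own parameter.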
Jakub Kicinski
82e401319b Merge branch 'net-stmmac-stmmac_pltfr_find_clk'
Russell King says:

====================
net: stmmac: stmmac_pltfr_find_clk()

The GBETH glue driver that is being proposed duplicates the clock
finding from the bulk clock data in the stmmac platform data structure.
Let's provide a generic implementation that glue drivers can use, and
convert dwc-qos-eth to use it.
====================

Link: https://patch.msgid.link/Z_Yn3dJjzcOi32uU@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:31:57 -07:00
Russell King (Oracle)
34e816acdb net: stmmac: dwc-qos: use stmmac_pltfr_find_clk()
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1u2QO9-001Rp8-Ii@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:31:54 -07:00
Russell King (Oracle)
de64872019 net: stmmac: provide stmmac_pltfr_find_clk()
Provide a generic way to find a clock in the bulk data.

Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1u2QO4-001Rp2-Dy@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:31:53 -07:00
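
A lookup of this kind can be sketched as a linear search by name. The message for de64872019 does not show the implementation, so the entry layout and toy_find_clk() below are assumptions for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bulk-clock entry: a name plus an opaque handle. */
struct toy_clk_entry {
	const char *id;
	void *clk;
};

/* Return the handle of the first entry whose id matches name,
 * or NULL when no entry matches. */
static void *toy_find_clk(const struct toy_clk_entry *clks, size_t num_clks,
			  const char *name)
{
	for (size_t i = 0; i < num_clks; i++)
		if (clks[i].id && strcmp(clks[i].id, name) == 0)
			return clks[i].clk;
	return NULL;
}
```

Sharing one helper like this is what lets each glue driver drop its own copy of the search loop.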
Jakub Kicinski
c1e0100c6a Merge branch 'tcp-add-a-new-tw_paws-drop-reason'
Jiayuan Chen says:

====================
tcp: add a new TW_PAWS drop reason

Devices in the networking path, such as firewalls, NATs, or routers, which
can perform SNAT or DNAT, use addresses from their own limited address
pools to masquerade the source address during forwarding, causing PAWS
verification to fail more easily under TW status.

Currently, packet loss statistics for PAWS can only be viewed through MIB,
which is a global metric and cannot be precisely obtained through tracing
to get the specific 4-tuple of the dropped packet. In the past, we had to
use kprobe ret to retrieve relevant skb information from
tcp_timewait_state_process().

We add a drop_reason pointer and a new counter.

I didn't provide a packetdrill script.
I struggled for a long time to get packetdrill to fix the client port, but
ultimately failed to do so...

Instead, I wrote my own program to trigger PAWS, which can be found at
https://github.com/mrpre/nettrigger/tree/main
'''
//assume nginx running on 172.31.75.114:9999, current host is 172.31.75.115
iptables -t filter -I OUTPUT -p tcp --sport 12345 --tcp-flags RST RST -j DROP
./nettrigger -i eth0 -s 172.31.75.115:12345 -d 172.31.75.114:9999 -action paws
'''

v2: https://lore.kernel.org/5cdc1bdd9caee92a6ae932638a862fd5c67630e8@linux.dev
v3: https://lore.kernel.org/20250407140001.13886-1-jiayuan.chen@linux.dev
====================

Link: https://patch.msgid.link/20250409112614.16153-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:29:28 -07:00
Jiayuan Chen
c449d5f3a3 tcp: add LINUX_MIB_PAWS_TW_REJECTED counter
When TCP is in TIME_WAIT state, PAWS verification uses
LINUX_PAWSESTABREJECTED, which is ambiguous and cannot be distinguished
from other PAWS verification processes.

We added a new counter, like the existing PAWS_OLD_ACK one.

Also we update the doc with previously missing PAWS_OLD_ACK.

usage:
'''
nstat -az | grep PAWSTimewait
TcpExtPAWSTimewait              1                  0.0
'''

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250409112614.16153-3-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:29:26 -07:00
Jiayuan Chen
0427141112 tcp: add TCP_RFC7323_TW_PAWS drop reason
Devices in the networking path, such as firewalls, NATs, or routers, which
can perform SNAT or DNAT, use addresses from their own limited address
pools to masquerade the source address during forwarding, causing PAWS
verification to fail more easily.

Currently, packet loss statistics for PAWS can only be viewed through MIB,
which is a global metric and cannot be precisely obtained through tracing
to get the specific 4-tuple of the dropped packet. In the past, we had to
use kprobe ret to retrieve relevant skb information from
tcp_timewait_state_process().

We add a drop_reason pointer, similar to what previous commit does:
commit e34100c2ec ("tcp: add a drop_reason pointer to tcp_check_req()")

This commit addresses the PAWSESTABREJECTED case and also sets the
corresponding drop reason.

We use 'pwru' to test.

Before this commit:
'''
./pwru 'port 9999'
2025/04/07 13:40:19 Listening for events..
TUPLE                                        FUNC
172.31.75.115:12345->172.31.75.114:9999(tcp) sk_skb_reason_drop(SKB_DROP_REASON_NOT_SPECIFIED)
'''

After this commit:
'''
./pwru 'port 9999'
2025/04/07 13:51:34 Listening for events..
TUPLE                                        FUNC
172.31.75.115:12345->172.31.75.114:9999(tcp) sk_skb_reason_drop(SKB_DROP_REASON_TCP_RFC7323_TW_PAWS)
'''

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250409112614.16153-2-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:29:26 -07:00
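
For readers unfamiliar with PAWS, the core comparison from RFC 7323 can be sketched as below. This is a simplified illustration with a hypothetical function name; the kernel's check also involves conditions not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified PAWS test: reject a segment whose timestamp value is older
 * than the most recent one recorded for the connection. The subtraction
 * is done modulo 2^32 and interpreted as signed so that timestamp
 * wrap-around is tolerated (assumes two's complement conversion). */
static bool toy_paws_reject(uint32_t ts_recent, uint32_t seg_tsval)
{
	return (int32_t)(ts_recent - seg_tsval) > 0;
}
```

When several hosts behind a NAT share one source address, a new connection may reuse a 4-tuple that is still in TIME_WAIT while carrying a timestamp from a different host's clock, so a check like this can fail even though the segment is legitimate.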
Michal Luczaj
709894c52c af_unix: Remove unix_unhash()
Dummy unix_unhash() was introduced for sockmap in commit 94531cfcbe
("af_unix: Add unix_stream_proto for sockmap"), but there's no need to
implement it anymore.

->unhash() is only called conditionally: in unix_shutdown() since commit
d359902d5c ("af_unix: Fix NULL pointer bug in unix_shutdown"), and in BPF
proto's sock_map_unhash() since commit 5b4a79ba65 ("bpf, sockmap: Don't
let sock_map_{close,destroy,unhash} call itself").

Remove it.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250409-cleanup-drop-unix-unhash-v1-1-1659e5b8ee84@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 17:32:57 -07:00
Jakub Kicinski
cb7103298d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.15-rc2).

Conflict:

Documentation/networking/netdevices.rst
net/core/lock_debug.c
  04efcee6ef ("net: hold instance lock during NETDEV_CHANGE")
  03df156dd3 ("xdp: double protect netdev->xdp_flags with netdev->lock")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 16:51:07 -07:00
Linus Torvalds
ab59a86056 Merge tag 'net-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

    - core: hold instance lock during NETDEV_CHANGE

    - rtnetlink: fix bad unlock balance in do_setlink()

    - ipv6:
       - fix null-ptr-deref in addrconf_add_ifaddr()
       - align behavior across nexthops during path selection

  Previous releases - regressions:

    - sctp: prevent transport UaF in sendmsg

    - mptcp: only inc MPJoinAckHMacFailure for HMAC failures

  Previous releases - always broken:

    - sched:
       - make ->qlen_notify() idempotent
       - ensure sufficient space when sending filter netlink notifications
       - sch_sfq: really don't allow 1 packet limit

    - netfilter: fix incorrect avx2 match of 5th field octet

    - tls: explicitly disallow disconnect

    - eth: octeontx2-pf: fix VF root node parent queue priority"

* tag 'net-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
  ethtool: cmis_cdb: Fix incorrect read / write length extension
  selftests: netfilter: add test case for recent mismatch bug
  nft_set_pipapo: fix incorrect avx2 match of 5th field octet
  net: ppp: Add bound checking for skb data on ppp_sync_txmung
  net: Fix null-ptr-deref by sock_lock_init_class_and_name() and rmmod.
  ipv6: Align behavior across nexthops during path selection
  net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
  net: phy: move phy_link_change() prior to mdio_bus_phy_may_suspend()
  selftests/tc-testing: sfq: check that a derived limit of 1 is rejected
  net_sched: sch_sfq: move the limit validation
  net_sched: sch_sfq: use a temporary work area for validating configuration
  net: libwx: handle page_pool_dev_alloc_pages error
  selftests: mptcp: validate MPJoin HMacFailure counters
  mptcp: only inc MPJoinAckHMacFailure for HMAC failures
  rtnetlink: Fix bad unlock balance in do_setlink().
  net: ethtool: Don't call .cleanup_data when prepare_data fails
  tc: Ensure we have enough buffer space when sending filter netlink notifications
  net: libwx: Fix the wrong Rx descriptor field
  octeontx2-pf: qos: fix VF root node parent queue index
  selftests: tls: check that disconnect does nothing
  ...
2025-04-10 08:52:18 -07:00
Linus Torvalds
2eb959eeec Merge tag 'for-linus-6.15a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:

 - A simple fix adding the module description of the Xenbus frontend
   module

 - A fix correcting the xen-acpi-processor Kconfig dependency for PVH
   Dom0 support

 - A fix for the Xen balloon driver when running as Xen Dom0 in PVH mode

 - A fix for PVH Dom0 in order to avoid problems with CPU idle and
   frequency drivers conflicting with Xen

* tag 'for-linus-6.15a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: disable CPU idle and frequency drivers for PVH dom0
  x86/xen: fix balloon target initialization for PVH dom0
  xen: Change xen-acpi-processor dom0 dependency
  xenbus: add module description
2025-04-10 07:04:23 -07:00
Linus Torvalds
e4742a89cf Merge tag 'block-6.15-20250410' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - Add a missing ublk selftest script, from test additions added last
   week

 - Two fixes for ublk error recovery and reissue

 - Cleanup of ublk argument passing

* tag 'block-6.15-20250410' of git://git.kernel.dk/linux:
  ublk: pass ublksrv_ctrl_cmd * instead of io_uring_cmd *
  ublk: don't fail request for recovery & reissue in case of ubq->canceling
  ublk: fix handling recovery & reissue in ublk_abort_queue()
  selftests: ublk: fix test_stripe_04
2025-04-10 07:02:22 -07:00
Linus Torvalds
a61ec0dd18 Merge tag 'io_uring-6.15-20250410' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - Reject zero sized legacy provided buffers upfront. No ill side
   effects from this one, only really done to shut up a silly syzbot
   test case.

 - Fix for a regression in tag posting for registered files or buffers,
   where the tag would be posted even when the registration failed.

 - Two minor zcrx cleanups for code added this merge window.

* tag 'io_uring-6.15-20250410' of git://git.kernel.dk/linux:
  io_uring/kbuf: reject zero sized provided buffers
  io_uring/zcrx: separate niov number from pages
  io_uring/zcrx: put refill data into separate cache line
  io_uring: don't post tag CQEs on file/buffer registration failure
2025-04-10 07:00:21 -07:00
Linus Torvalds
8f43640c91 Merge tag 'gpio-fixes-for-v6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix resource handling in gpio-tegra186

 - fix wakeup source leaks in gpio-mpc8xxx and gpio-zynq

 - fix minor issues with some GPIO OF quirks

 - deprecate GPIOD_FLAGS_BIT_NONEXCLUSIVE and devm_gpiod_unhinge()
   symbols and add a TODO task to track replacing them with a better
   solution

* tag 'gpio-fixes-for-v6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: of: Move Atmel HSMCI quirk up out of the regulator comment
  gpiolib: of: Fix the choice for Ingenic NAND quirk
  gpio: zynq: Fix wakeup source leaks on device unbind
  gpio: mpc8xxx: Fix wakeup source leaks on device unbind
  gpio: TODO: track the removal of regulator-related workarounds
  MAINTAINERS: add more keywords for the GPIO subsystem entry
  gpio: deprecate devm_gpiod_unhinge()
  gpio: deprecate the GPIOD_FLAGS_BIT_NONEXCLUSIVE flag
  gpio: tegra186: fix resource handling in ACPI probe path
2025-04-10 06:58:06 -07:00
Linus Torvalds
b4991c01ad Merge tag 'mtd/fixes-for-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
 "Two important fixes: the build of the SPI NAND layer with old GCC
  versions as well as the fix of the Qpic Makefile which was wrong in
  the first place.

  There are also two smaller fixes that add a missing error check and a
  missing status check"

* tag 'mtd/fixes-for-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: spinand: Fix build with gcc < 7.5
  mtd: rawnand: Add status chack in r852_ready()
  mtd: inftlcore: Add error check for inftl_read_oob()
  mtd: nand: Drop explicit test for built-in CONFIG_SPI_QPIC_SNAND
2025-04-10 06:56:25 -07:00
Ido Schimmel
eaa517b77e ethtool: cmis_cdb: Fix incorrect read / write length extension
The 'read_write_len_ext' field in 'struct ethtool_cmis_cdb_cmd_args'
stores the maximum number of bytes that can be read from or written to
the Local Payload (LPL) page in a single multi-byte access.

Cited commit started overwriting this field with the maximum number of
bytes that can be read from or written to the Extended Payload (EPL)
pages in a single multi-byte access. Transceiver modules that support
auto paging can advertise a number larger than 255 which is problematic
as 'read_write_len_ext' is a 'u8', resulting in the number getting
truncated and firmware flashing failing [1].

Fix by ignoring the maximum EPL access size as the kernel does not
currently support auto paging (even if the transceiver module does) and
will not try to read / write more than 128 bytes at once.
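
The truncation described above can be illustrated with a small sketch. The
struct and helper are hypothetical and only mirror the 'u8' width of the
field; they are not the kernel's code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding a field with the same 8-bit width. */
struct cdb_args_sketch {
	uint8_t read_write_len_ext;
};

/*
 * Store an advertised length in the 8-bit field and return what was
 * actually kept. Conversion to uint8_t keeps only the low 8 bits, so
 * any value above 255 is silently reduced modulo 256.
 */
static inline uint8_t store_len_ext(struct cdb_args_sketch *args,
				    unsigned int advertised)
{
	args->read_write_len_ext = (uint8_t)advertised;
	return args->read_write_len_ext;
}
```

For instance, store_len_ext(&args, 300) keeps 44 and store_len_ext(&args, 256)
keeps 0, while values up to 255 are stored unchanged.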

[1]
Transceiver module firmware flashing started for device enp177s0np0
Transceiver module firmware flashing in progress for device enp177s0np0
Progress: 0%
Transceiver module firmware flashing encountered an error for device enp177s0np0
Status message: Write FW block EPL command failed, LPL length is longer
	than CDB read write length extension allows.

Fixes: 9a3b0d078b ("net: ethtool: Add support for writing firmware blocks using EPL payload")
Reported-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Closes: https://lore.kernel.org/netdev/20250402183123.321036-3-michael.chan@broadcom.com/
Tested-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250409112440.365672-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-10 14:32:43 +02:00
Paolo Abeni
69ddc6522e Merge tag 'nf-25-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains a Netfilter fix and improved test coverage:

1) Fix AVX2 matching in nft_pipapo, from Florian Westphal.

2) Extend existing test to improve coverage for the aforementioned bug,
   also from Florian.

netfilter pull request 25-04-10

* tag 'nf-25-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: netfilter: add test case for recent mismatch bug
  nft_set_pipapo: fix incorrect avx2 match of 5th field octet
====================

Link: https://patch.msgid.link/20250410103647.1030244-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-10 13:13:35 +02:00