Commit Graph

1337517 Commits

Author SHA1 Message Date
Dan Carpenter
f252f23ab6 net: Prevent use after free in netif_napi_set_irq_locked()
The cpu_rmap_put() will call kfree() when the last reference is dropped
so it could result in a use after free when we dereference the same
pointer the next line.  Move the cpu_rmap_put() after the dereference.

Fixes: bd7c00605e ("net: move aRFS rmap management and CPU affinity to core")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/5a9c53a4-5487-4b8c-9ffa-d8e5343aaaaf@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 18:33:24 -08:00
Sean Anderson
b9564ca3a2 net: cadence: macb: Synchronize standard stats
The new stats calculations add several additional calls to
macb/gem_update_stats() and accesses to bp->hw_stats. These are
protected by a spinlock since commit fa52f15c74 ("net: cadence: macb:
Synchronize stats calculations"), which was applied in parallel. Add
some locking now that the net has been merged into net-next.

Fixes: f6af690a29 ("net: cadence: macb: Report standard stats")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20250303231832.1648274-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:50:57 -08:00
Jakub Kicinski
85f66df39b Merge branch 'tcp-scale-connect-under-pressure'
Eric Dumazet says:

====================
tcp: scale connect() under pressure

Adoption of bhash2 in linux-6.1 made some operations almost twice
more expensive, because of additional locks.

This series adds RCU in __inet_hash_connect() to help the
case where many attempts need to be made before finding
an available 4-tuple.

This brings a ~200 % improvement in this experiment:

Server:
ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog

Client:
ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server

Before series:

  utime_start=0.288582
  utime_end=1.548707
  stime_start=20.637138
  stime_end=2002.489845
  num_transactions=484453
  latency_min=0.156279245
  latency_max=20.922042756
  latency_mean=1.546521274
  latency_stddev=3.936005194
  num_samples=312537
  throughput=47426.00

perf top on the client:

 49.54%  [kernel]       [k] _raw_spin_lock
 25.87%  [kernel]       [k] _raw_spin_lock_bh
  5.97%  [kernel]       [k] queued_spin_lock_slowpath
  5.67%  [kernel]       [k] __inet_hash_connect
  3.53%  [kernel]       [k] __inet6_check_established
  3.48%  [kernel]       [k] inet6_ehashfn
  0.64%  [kernel]       [k] rcu_all_qs

After this series:

  utime_start=0.271607
  utime_end=3.847111
  stime_start=18.407684
  stime_end=1997.485557
  num_transactions=1350742
  latency_min=0.014131929
  latency_max=17.895073144
  latency_mean=0.505675853   # Nice reduction of latency metrics
  latency_stddev=2.125164772
  num_samples=307884
  throughput=139866.80       # 194 % increase

perf top on client:

 56.86%  [kernel]       [k] __inet6_check_established
 17.96%  [kernel]       [k] __inet_hash_connect
 13.88%  [kernel]       [k] inet6_ehashfn
  2.52%  [kernel]       [k] rcu_all_qs
  2.01%  [kernel]       [k] __cond_resched
  0.41%  [kernel]       [k] _raw_spin_lock
====================

Link: https://patch.msgid.link/20250302124237.3913746-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:46:30 -08:00
Eric Dumazet
86c2bc293b tcp: use RCU lookup in __inet_hash_connect()
When __inet_hash_connect() has to try many 4-tuples before
finding an available one, we see a high spinlock cost from
the many spin_lock_bh(&head->lock) performed in its loop.

This patch adds an RCU lookup to avoid the spinlock cost.

check_established() gets a new @rcu_lookup argument.
First reason is to not make any changes while head->lock
is not held.
Second reason is to not make this RCU lookup a second time
after the spinlock has been acquired.

Tested:

Server:

ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog

Client:

ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server

Before series:

  utime_start=0.288582
  utime_end=1.548707
  stime_start=20.637138
  stime_end=2002.489845
  num_transactions=484453
  latency_min=0.156279245
  latency_max=20.922042756
  latency_mean=1.546521274
  latency_stddev=3.936005194
  num_samples=312537
  throughput=47426.00

perf top on the client:

 49.54%  [kernel]       [k] _raw_spin_lock
 25.87%  [kernel]       [k] _raw_spin_lock_bh
  5.97%  [kernel]       [k] queued_spin_lock_slowpath
  5.67%  [kernel]       [k] __inet_hash_connect
  3.53%  [kernel]       [k] __inet6_check_established
  3.48%  [kernel]       [k] inet6_ehashfn
  0.64%  [kernel]       [k] rcu_all_qs

After this series:

  utime_start=0.271607
  utime_end=3.847111
  stime_start=18.407684
  stime_end=1997.485557
  num_transactions=1350742
  latency_min=0.014131929
  latency_max=17.895073144
  latency_mean=0.505675853  # Nice reduction of latency metrics
  latency_stddev=2.125164772
  num_samples=307884
  throughput=139866.80      # 190 % increase

perf top on client:

 56.86%  [kernel]       [k] __inet6_check_established
 17.96%  [kernel]       [k] __inet_hash_connect
 13.88%  [kernel]       [k] inet6_ehashfn
  2.52%  [kernel]       [k] rcu_all_qs
  2.01%  [kernel]       [k] __cond_resched
  0.41%  [kernel]       [k] _raw_spin_lock

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Tested-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250302124237.3913746-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:46:27 -08:00
Eric Dumazet
d186f405fd tcp: add RCU management to inet_bind_bucket
Add RCU protection to inet_bind_bucket structure.

- Add rcu_head field to the structure definition.

- Use kfree_rcu() at destroy time, and remove inet_bind_bucket_destroy()
  first argument.

- Use hlist_del_rcu() and hlist_add_head_rcu() methods.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250302124237.3913746-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:46:26 -08:00
Eric Dumazet
ca79d80b0b tcp: optimize inet_use_bhash2_on_bind()
There is no reason to call ipv6_addr_type().

Instead, use highly optimized ipv6_addr_any() and ipv6_addr_v4mapped().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250302124237.3913746-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:46:26 -08:00
Eric Dumazet
ae9d5b19b3 tcp: use RCU in __inet{6}_check_established()
When __inet_hash_connect() has to try many 4-tuples before
finding an available one, we see a high spinlock cost from
__inet_check_established() and/or __inet6_check_established().

This patch adds an RCU lookup to avoid the spinlock
acquisition when the 4-tuple is found in the hash table.

Note that there are still spin_lock_bh() calls in
__inet_hash_connect() to protect inet_bind_hashbucket,
this will be fixed later in this series.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Tested-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250302124237.3913746-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:46:26 -08:00
Alexander Sverdlin
7ff1c88fc8 net: ethernet: ti: cpsw_new: populate netdev of_node
So that of_find_net_device_by_node() can find CPSW ports and other DSA
switches can be stacked downstream. Tested in conjunction with KSZ8873.

Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://patch.msgid.link/20250303074703.1758297-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:20:42 -08:00
Markus Elfring
859abe3f92 tipc: Reduce scope for the variable “fdefq” in tipc_link_tnl_prepare()
The address of a data structure member was determined before
a corresponding null pointer check in the implementation of
the function “tipc_link_tnl_prepare”.

Thus avoid the risk for undefined behaviour by moving the definition
for the local variable “fdefq” into an if branch at the end.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: https://patch.msgid.link/08fe8fc3-19c3-4324-8719-0ee74b0f32c9@web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:19:49 -08:00
Jakub Kicinski
ea4342739d selftests: drv-net: use env.rpath in the HDS test
Commit 29b036be1b ("selftests: drv-net: test XDP, HDS auto and
the ioctl path") added a new test case in the net tree, now that
this code has made its way to net-next convert it to use the env.rpath()
helper instead of manually computing the relative path.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250228212956.25399-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:14:23 -08:00
Daniel Golle
254f6b272e dsa: mt7530: Utilize REGMAP_IRQ for interrupt handling
Replace the custom IRQ chip handler and mask/unmask functions with
REGMAP_IRQ. This significantly simplifies the code and allows for the
removal of almost all interrupt-related functions from mt7530.c.

Tested on MT7988A built-in switch (MMIO) as well as MT7531AE IC (MDIO).

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Chester A. Unal <chester.a.unal@arinc9.com>
Link: https://patch.msgid.link/221013c3530b61504599e285c341a993f6188f00.1740792674.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:14:23 -08:00
Qingfang Deng
95d0d094ba ppp: use IFF_NO_QUEUE in virtual interfaces
For PPPoE, PPTP, and PPPoL2TP, the start_xmit() function directly
forwards packets to the underlying network stack and never returns
anything other than 1. So these interfaces do not require a qdisc,
and the IFF_NO_QUEUE flag should be set.

Introduces a direct_xmit flag in struct ppp_channel to indicate when
IFF_NO_QUEUE should be applied. The flag is set in ppp_connect_channel()
for relevant protocols.

While at it, remove the usused latency member from struct ppp_channel.

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20250301135517.695809-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:11:17 -08:00
Jakub Kicinski
00d66b5fcd Merge branch 'eth-fbnic-cleanup-macros-and-string-function'
Lee Trager says:

====================
eth: fbnic: Cleanup macros and string function

We have received some feedback that the macros we use for reading FW mailbox
attributes are too large in scope and confusing to understanding. Additionally
the string function did not provide errors allowing it to silently succeed.
This patch set fixes theses issues.
====================

Link: https://patch.msgid.link/20250228191935.3953712-1-lee@trager.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:03:29 -08:00
Lee Trager
8cb3e49b23 eth: fbnic: Replace firmware field macros
Replace the firmware field macros with new macros which follow typical
kernel standards. No variables are required to be predefined for use and
results are now returned. These macros are prefixed with fta or fbnic
TLV attribute.

Signed-off-by: Lee Trager <lee@trager.us>
Link: https://patch.msgid.link/20250228191935.3953712-4-lee@trager.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:03:26 -08:00
Lee Trager
e5cf5107c9 eth: fbnic: Update fbnic_tlv_attr_get_string() to work like nla_strscpy()
Allow fbnic_tlv_attr_get_string() to return an error code. In the event the
source mailbox attribute is missing return -EINVAL. Like nla_strscpy() return
-E2BIG when the source string is larger than the destination string. In this
case the amount of data copied is equal to dstsize.

Signed-off-by: Lee Trager <lee@trager.us>
Link: https://patch.msgid.link/20250228191935.3953712-3-lee@trager.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:03:26 -08:00
Lee Trager
56bcc6ecff eth: fbnic: Prepend TSENE FW fields with FBNIC_FW
All other firmware fields are prepended with FBNIC_FW. Update TSENE fields
to follow the same format.

Signed-off-by: Lee Trager <lee@trager.us>
Link: https://patch.msgid.link/20250228191935.3953712-2-lee@trager.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:03:26 -08:00
Jakub Kicinski
35df5eb9bf Merge branch 'net-convert-gianfar-triple-speed-ethernet-controller-bindings-to-yaml'
J. Neuschäfer says:

====================
net: Convert Gianfar (Triple Speed Ethernet Controller) bindings to YAML

The aim of this series is to modernize the device tree bindings for the
Freescale "Gianfar" ethernet controller (a.k.a. TSEC, Triple Speed
Ethernet Controller) by converting them to YAML.

v1: https://lore.kernel.org/20250220-gianfar-yaml-v1-0-0ba97fd1ef92@posteo.net
====================

Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-0-6beeefbd4818@posteo.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:02:44 -08:00
J. Neuschäfer
a70fdd9368 dt-bindings: net: Convert fsl,gianfar to YAML
Add a binding for the "Gianfar" ethernet controller, also known as
TSEC/eTSEC.

Signed-off-by: J. Neuschäfer <j.ne@posteo.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-3-6beeefbd4818@posteo.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:02:16 -08:00
J. Neuschäfer
0386e29e60 dt-bindings: net: fsl,gianfar-mdio: Update information about TBI
When this binding was originally written, all known TSEC Ethernet
controllers had a Ten-Bit Interface (TBI). However, some datasheets such
as for the MPC8315E suggest that this is not universally true:

  The eTSECs do not support TBI, GMII, and FIFO operating modes, so all
  references to these interfaces and features should be ignored for this
  device.

Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: J. Neuschäfer <j.ne@posteo.net>
Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-2-6beeefbd4818@posteo.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:02:16 -08:00
J. Neuschäfer
e4c4522390 dt-bindings: net: Convert fsl,gianfar-{mdio,tbi} to YAML
Move the information related to the Freescale Gianfar (TSEC) MDIO bus
and the Ten-Bit Interface (TBI) from fsl-tsec-phy.txt to a new binding
file in YAML format, fsl,gianfar-mdio.yaml.

Signed-off-by: J. Neuschäfer <j.ne@posteo.net>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-1-6beeefbd4818@posteo.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:02:16 -08:00
Jakub Kicinski
24412be81d Merge branch 'net-phy-nxp-c45-tja11xx-add-support-for-tja1121'
Andrei Botila says:

====================
net: phy: nxp-c45-tja11xx: add support for TJA1121

This patch series adds .match_phy_device for the existing TJAs
to differentiate between TJA1103/TJA1104 and TJA1120/TJA1121.
TJA1103 and TJA1104 share the same PHY_ID but TJA1104 has MACsec
capabilities while TJA1103 doesn't.
Also add support for TJA1121 which is based on TJA1120 hardware
with additional MACsec IP.
====================

Link: https://patch.msgid.link/20250228154320.2979000-1-andrei.botila@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:00:05 -08:00
Andrei Botila
7215e93756 net: phy: nxp-c45-tja11xx: add support for TJA1121
Add support for TJA1121 which is based on TJA1120 but with
additional MACsec IP.

Signed-off-by: Andrei Botila <andrei.botila@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250228154320.2979000-3-andrei.botila@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:00:03 -08:00
Andrei Botila
a06a868a0c net: phy: nxp-c45-tja11xx: add match_phy_device to TJA1103/TJA1104
Add .match_phy_device for the existing TJAs to differentiate between
TJA1103 and TJA1104.
TJA1103 and TJA1104 share the same PHY_ID but TJA1104 has MACsec
capabilities while TJA1103 doesn't.

Signed-off-by: Andrei Botila <andrei.botila@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250228154320.2979000-2-andrei.botila@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:00:03 -08:00
Jiasheng Jiang
39e912a959 dpll: Add an assertion to check freq_supported_num
Since the driver is broken in the case that src->freq_supported is not
NULL but src->freq_supported_num is 0, add an assertion for it.

Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://patch.msgid.link/20250228150210.34404-1-jiashengjiangcool@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 16:59:16 -08:00
Jakub Kicinski
a144da586c Merge branch 'mptcp-improve-code-coverage-and-small-optimisations'
Matthieu Baerts says:

====================
mptcp: improve code coverage and small optimisations

This small series have various unrelated patches:

- Patch 1 and 2: improve code coverage by validating mptcp_diag_dump_one
  thanks to a new tool displaying MPTCP info for a specific token.

- Patch 3: a fix for a commit which is only in net-next.

- Patch 4: reduce parameters for one in-kernel PM helper.

- Patch 5: exit early when processing an ADD_ADDR echo to avoid unneeded
  operations.
====================

Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-0-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 16:57:40 -08:00
Matthieu Baerts (NGI0)
f0de92479a mptcp: pm: exit early with ADD_ADDR echo if possible
When the userspace PM is used, or when the in-kernel limits are reached,
there will be no need to schedule the PM worker to signal new addresses.
That corresponds to pm->work_pending set to 0.

In this case, an early exit can be done in mptcp_pm_add_addr_echoed()
not to hold the PM lock, and iterate over the announced addresses list,
not to schedule the worker anyway in this case. This is similar to what
is done when a connection or a subflow has been established.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-5-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 16:57:38 -08:00
Geliang Tang
70c575d5a9 mptcp: pm: in-kernel: reduce parameters of set_flags
The number of parameters in mptcp_nl_set_flags() can be reduced.
Only need to pass a "local" parameter to it instead of "local->addr"
and "local->flags".

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-4-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 16:57:38 -08:00
Geliang Tang
e85d33b355 mptcp: pm: in-kernel: avoid access entry without lock
In mptcp_pm_nl_set_flags(), "entry" is copied to "local" when pernet->lock
is held to avoid direct access to entry without pernet->lock.

Therefore, "local->flags" should be passed to mptcp_nl_set_flags instead
of "entry->flags" when pernet->lock is not held, so as to avoid access to
entry.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Fixes: 145dc6cc4a ("mptcp: pm: change to fullmesh only for 'subflow'")
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-3-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 16:57:38 -08:00
Gang Yan
ba24001665 selftests: mptcp: add a test for mptcp_diag_dump_one
This patch introduces a new 'chk_diag' test in diag.sh. It retrieves
the token for a specified MPTCP socket (msk) using the 'ss' command and
then accesses the 'mptcp_diag_dump_one' in kernel via ./mptcp_diag
to verify if the correct token is returned.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/524
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-2-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 16:57:38 -08:00
Gang Yan
00f5e338cf selftests: mptcp: Add a tool to get specific msk_info
This patch enables the retrieval of the mptcp_info structure corresponding
to a specified MPTCP socket (msk). When multiple MPTCP connections are
present, specific information can be obtained for a given connection
through the 'mptcp_diag_dump_one' by using the 'token' associated with
the msk.

Signed-off-by: Gang Yan <yangang@kylinos.cn>
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-1-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 16:57:37 -08:00
Jakub Kicinski
71f8992e34 Merge tag 'wireless-next-2025-03-04-v2' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:

====================
First 6.15 material:
 * cfg80211/mac80211
   - remove cooked monitor support
   - strict mode for better AP testing
   - basic EPCS support
   - OMI RX bandwidth reduction support
 * rtw88
   - preparation for RTL8814AU support
 * rtw89
   - use wiphy_lock/wiphy_work
   - preparations for MLO
   - BT-Coex improvements
   - regulatory support in firmware files
 * iwlwifi
   - preparations for the new iwlmld sub-driver

* tag 'wireless-next-2025-03-04-v2' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (128 commits)
  wifi: iwlwifi: remove mld/roc.c
  wifi: mac80211: refactor populating mesh related fields in sinfo
  wifi: cfg80211: reorg sinfo structure elements for mesh
  wifi: iwlwifi: Fix spelling mistake "Increate" -> "Increase"
  wifi: iwlwifi: add Debug Host Command APIs
  wifi: iwlwifi: add IWL_MAX_NUM_IGTKS macro
  wifi: iwlwifi: add OMI bandwidth reduction APIs
  wifi: iwlwifi: remove mvm prefix from iwl_mvm_d3_end_notif
  wifi: iwlwifi: remember if the UATS table was read successfully
  wifi: iwlwifi: export iwl_get_lari_config_bitmap
  wifi: iwlwifi: add support for external 32 KHz clock
  wifi: iwlwifi: mld: add a debug level for EHT prints
  wifi: iwlwifi: mld: add a debug level for PTP prints
  wifi: iwlwifi: remove mvm prefix from iwl_mvm_esr_mode_notif
  wifi: iwlwifi: use 0xff instead of 0xffffffff for invalid
  wifi: iwlwifi: location api cleanup
  wifi: cfg80211: expose update timestamp to drivers
  wifi: mac80211: add ieee80211_iter_chan_contexts_mtx
  wifi: mac80211: fix integer overflow in hwmp_route_info_get()
  wifi: mac80211: Fix possible integer promotion issue
  ...
====================

Link: https://patch.msgid.link/20250304125605.127914-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 08:50:42 -08:00
Paolo Abeni
5b62996184 Merge branch 'netconsole-add-taskname-sysdata-support'
Breno Leitao says:

====================
netconsole: Add taskname sysdata support

This patchset introduces a new feature to the netconsole extradata
subsystem that enables the inclusion of the current task's name in the
sysdata output of netconsole messages.

This enhancement is particularly valuable for large-scale deployments,
such as Meta's, where netconsole collects messages from millions of
servers and stores them in a data warehouse for analysis. Engineers
often rely on these messages to investigate issues and assess kernel
health.

One common challenge we face is determining the context in which
a particular message was generated. By including the task name
(task->comm) with each message, this feature provides a direct answer to
the frequently asked question: "What was running when this message was
generated?"

This added context will significantly improve our ability to diagnose
and troubleshoot issues, making it easier to interpret output of
netconsole.

The patchset consists of seven patches that implement the following changes:

 * Refactor CPU number formatting into a separate function
 * Prefix CPU_NR sysdata feature with SYSDATA_
 * Patch to covert a bitwise operation into boolean
 * Add configfs controls for taskname sysdata feature
 * Add taskname to extradata entry count
 * Add support for including task name in netconsole's extra data output
 * Document the task name feature in Documentation/networking/netconsole.rst
 * Add test coverage for the task name feature to the existing sysdata selftest script

These changes allow users to enable or disable the task name feature via
configfs and provide additional context for kernel messages by showing
which task generated each console message.

I have tested these patches on some servers and they seem to work as
expected.

v1: https://lore.kernel.org/r/20250221-netcons_current-v1-0-21c86ae8fc0d@debian.org

Signed-off-by: Breno Leitao <leitao@debian.org>
====================

Link: https://patch.msgid.link/20250228-netcons_current-v2-0-f53ff79a0db2@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:31 +01:00
Breno Leitao
d7a2522426 netconsole: selftest: add task name append testing
Add test coverage for the netconsole task name feature to the existing
sysdata selftest script. This extends the test infrastructure to verify
that task names are correctly appended when enabled and absent when
disabled.

The test validates that:
  - Task names appear in the expected format "taskname=<name>"
  - Task names are included when the feature is enabled
  - Task names are excluded when the feature is disabled
  - The feature works correctly alongside other sysdata fields like CPU

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:28 +01:00
Breno Leitao
7010b61983 netconsole: docs: document the task name feature
Add documentation for the netconsole task name feature in
Documentation/networking/netconsole.rst. This explains how to enable
task name via configfs and demonstrates the output format.

The documentation includes:
- How to enable/disable the feature via taskname_enabled
- The format of the task name in the output
- An example showing the task name appearing in messages

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:28 +01:00
Breno Leitao
dd30ae5332 netconsole: add task name to extra data fields
This is the core patch for this whole patchset. Add support for
including the current task's name in netconsole's extra data output.
This adds a new append_taskname() function that writes the task name
(from current->comm) into the target's extradata buffer, similar to how
CPU numbers are handled.

The task name is included when the SYSDATA_TASKNAME field is set,
appearing in the format "taskname=<name>" in the output. This additional
context can help with debugging by showing which task generated each
console message.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:28 +01:00
Breno Leitao
09e877590b netconsole: add configfs controls for taskname sysdata feature
Add configfs interface to enable/disable the taskname sysdata feature.
This adds the following functionality:

The implementation follows the same pattern as the existing CPU number
feature, ensuring consistent behavior and error handling across sysdata
features.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:28 +01:00
Breno Leitao
33e4b29f2b netconsole: add taskname to extradata entry count
New SYSDATA_TASKNAME feature flag to track when taskname append is enabled.

Additional check in count_extradata_entries() to include taskname in
total, counting it as an entry in extradata. This function is used to
check if we are not overflowing the number of extradata items.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:28 +01:00
Breno Leitao
4d989521a9 netconsole: refactor CPU number formatting into separate function
Extract CPU number formatting logic from prepare_extradata() into a new
append_cpu_nr() function.

This refactoring improves code organization by isolating CPU number
formatting into its own function while reducing the complexity of
prepare_extradata().

The change prepares the codebase for the upcoming taskname feature by
establishing a consistent pattern for handling sysdata features.

The CPU number formatting logic itself remains unchanged; only its
location has moved to improve maintainability.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:27 +01:00
Breno Leitao
efb878fbe8 netconsole: Make boolean comparison consistent
Convert the current state assignment to use explicit boolean conversion,
making the code more robust and easier to read. This change adds a
double-negation operator to ensure consistent boolean conversion as
suggested by Paolo[1].

This approach aligns with the existing pattern used in
sysdata_cpu_nr_enabled_show().

Link: https://lore.kernel.org/all/7309e760-63b0-4b58-ad33-2fb8db361141@redhat.com/ [1]
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:27 +01:00
Breno Leitao
8a683295c2 netconsole: prefix CPU_NR sysdata feature with SYSDATA_
Rename the CPU_NR enum value to SYSDATA_CPU_NR to establish a consistent
naming convention for sysdata features. This change prepares for
upcoming additions to the sysdata feature set by clearly grouping
related features under the SYSDATA prefix.

This change is purely cosmetic and does not modify any functionality.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:27 +01:00
Johannes Berg
799b7f93c0 wifi: iwlwifi: remove mld/roc.c
This file should never have been part of the commit, remove it.

Fixes: af3be90884 ("wifi: iwlwifi: support ROC version 6")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-04 13:48:30 +01:00
Paolo Abeni
188fa9b9e2 Merge branch 'support-some-enhances-features-for-the-hibmcge-driver'
Jijie Shao says:

====================
Support some enhances features for the HIBMCGE driver

In this patch set, we mainly implement some enhanced features.
It mainly includes the statistics, diagnosis, and ioctl to
improve fault locating efficiency,
abnormal irq and MAC link exception handling feature
to enhance driver robustness,
and rx checksum offload feature to improve performance
(tx checksum feature has been implemented).

v3: https://lore.kernel.org/all/20250221115526.1082660-2-shaojijie@huawei.com/
v2: https://lore.kernel.org/all/20250218085829.3172126-1-shaojijie@huawei.com/
v1: https://lore.kernel.org/all/20250213035529.2402283-1-shaojijie@huawei.com/
====================

Link: https://patch.msgid.link/20250228115411.1750803-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:45:36 +01:00
Jijie Shao
615552c601 net: hibmcge: Add support for ioctl
This patch implements the .ndo_eth_ioctl() to
read and write the PHY register.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:45:33 +01:00
Jijie Shao
7a5d60dcf9 net: hibmcge: Add support for BMC diagnose feature
The MAC hardware is on the BMC side, and the driver is on the host side.
When the driver is abnormal, the BMC cannot directly detect the
exception cause.
Therefore, this patch implements the BMC diagnosis feature.

When users query driver diagnosis information on the BMC, the driver
detects the query request in the scheduled task and reports
driver statistics and link status to the BMC through the bar space.
The BMC collects logs to analyze exception causes.

Currently, the scheduled task is executed every 30 seconds
To quickly respond to user query requests,
this patch changes the scheduled task to once every second.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:45:33 +01:00
Jijie Shao
e0306637e8 net: hibmcge: Add support for mac link exception handling feature
If the rate changed frequently, the PHY link ok,
but the MAC link maybe fails.
As a result, the network port is unavailable.

According to the documents of the chip,
core_reset needs to do to fix the fault.

In hw_adjus_link(), the core_reset is added to try to
ensure that MAC link status is normal.
In addition, MAC link failure detection is added.
If the MAC link fails after core_reset, driver invokes
the phy_stop() and phy_start() to re-link.

Due to phydev->lock, re-link cannot be triggered
in adjust_link(). Therefore, this operation
is invoked in a scheduled task.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:45:33 +01:00
Jijie Shao
fd394a334b net: hibmcge: Add support for abnormal irq handling feature
the hardware error was reported by interrupt,
and need be fixed by doing function reset,
but the whole reset flow takes a long time,
should not do it in irq handler,
so do it in scheduled task.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:45:33 +01:00
Jijie Shao
833b65a3b5 net: hibmcge: Add support for checksum offload
This patch implements the rx checksum offload feature.

The tx checksum offload processing in .ndo_start_xmit()
has been accepted. This patch also adds the tx checksum
feature, including NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:45:33 +01:00
Jijie Shao
c0bf9bf31e net: hibmcge: Add support for dump statistics
The driver supports many hw statistics. This patch supports
dump statistics through ethtool_ops and ndo.get_stats64().

The type of hw statistics register is u32,
To prevent the statistics register from overflowing,
the driver dump the statistics every 30 seconds.
in a scheduled task.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:45:33 +01:00
Paolo Abeni
d1352f76ae Merge branch 'introduce-flowtable-hw-offloading-in-airoha_eth-driver'
Lorenzo Bianconi says:

====================
Introduce flowtable hw offloading in airoha_eth driver

Introduce netfilter flowtable integration in airoha_eth driver to
offload 5-tuple flower rules learned by the PPE module if the user
accelerates them using a nft configuration similar to the one reported
below:

table inet filter {
	flowtable ft {
		hook ingress priority filter
		devices = { lan1, lan2, lan3, lan4, eth1 }
		flags offload;
	}
	chain forward {
		type filter hook forward priority filter; policy accept;
		meta l4proto { tcp, udp } flow add @ft
	}
}

Packet Processor Engine (PPE) module available on EN7581 SoC populates
the PPE table with 5-tuples flower rules learned from traffic forwarded
between the GDM ports connected to the Packet Switch Engine (PSE) module.
airoha_eth driver configures and collects data from the PPE module via a
Network Processor Unit (NPU) RISC-V module available on the EN7581 SoC.
Move airoha_eth driver in a dedicated folder
(drivers/net/ethernet/airoha).

v7: https://lore.kernel.org/r/20250224-airoha-en7581-flowtable-offload-v7-0-b4a22ad8364e@kernel.org
v6: https://lore.kernel.org/r/20250221-airoha-en7581-flowtable-offload-v6-0-d593af0e9487@kernel.org
v5: https://lore.kernel.org/r/20250217-airoha-en7581-flowtable-offload-v5-0-28be901cb735@kernel.org
v4: https://lore.kernel.org/r/20250213-airoha-en7581-flowtable-offload-v4-0-b69ca16d74db@kernel.org
v3: https://lore.kernel.org/r/20250209-airoha-en7581-flowtable-offload-v3-0-dba60e755563@kernel.org
v2: https://lore.kernel.org/r/20250207-airoha-en7581-flowtable-offload-v2-0-3a2239692a67@kernel.org
v1: https://lore.kernel.org/r/20250205-airoha-en7581-flowtable-offload-v1-0-d362cfa97b01@kernel.org
====================

Link: https://patch.msgid.link/20250228-airoha-en7581-flowtable-offload-v8-0-01dc1653f46e@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:22:14 +01:00
Lorenzo Bianconi
3fe15c640f net: airoha: Introduce PPE debugfs support
Similar to PPE support for Mediatek devices, introduce PPE debugfs
in order to dump binded and unbinded flows.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 13:22:10 +01:00