Commit Graph

1324333 Commits

Author SHA1 Message Date
Ido Schimmel
c72004aac6 netlink: specs: Add FIB rule flow label attributes
Add the new flow label attributes to the spec. Example:

 # ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_rule.yaml \
	--do newrule \
	--json '{"family": 10, "flowlabel": 1, "flowlabel-mask": 1, "action": 1, "table": 1}'
 None
 $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_rule.yaml \
	--dump getrule --json '{"family": 10}' --output-json \
	| jq '.[] | select(.flowlabel == "0x1")'
 {
   "table": 1,
   "suppress-prefixlen": "0xffffffff",
   "protocol": 0,
   "priority": 32765,
   "flowlabel": "0x1",
   "flowlabel-mask": "0x1",
   "family": 10,
   "dst-len": 0,
   "src-len": 0,
   "tos": 0,
   "action": "to-tbl",
   "flags": 0
 }
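
The same selection can be done without jq. The following is a minimal sketch, assuming the dump has been saved as JSON text in the shape printed above; the helper name `rules_matching_flowlabel` is hypothetical and the first sample rule is made up for illustration.

```python
import json

# Sample dump in the shape printed by cli.py --output-json above.
# The first entry is an invented rule without a flow label, added
# only so that the filter has something to reject.
SAMPLE_DUMP = """
[
  {"table": 255, "priority": 0, "family": 10, "action": "to-tbl"},
  {"table": 1, "priority": 32765, "flowlabel": "0x1",
   "flowlabel-mask": "0x1", "family": 10, "action": "to-tbl"}
]
"""


def rules_matching_flowlabel(dump_text, flowlabel):
    """Return the rules whose "flowlabel" equals the given hex string."""
    rules = json.loads(dump_text)
    return [rule for rule in rules if rule.get("flowlabel") == flowlabel]


if __name__ == "__main__":
    for rule in rules_matching_flowlabel(SAMPLE_DUMP, "0x1"):
        print(json.dumps(rule, indent=2))
```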

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-19 16:02:22 +01:00
Ido Schimmel
4c25f3f051 net: fib_rules: Enable flow label selector usage
Now that both IPv4 and IPv6 correctly handle the new flow label
attributes, enable user space to configure FIB rules that make use of
the flow label by changing the policy to stop rejecting them and to
accept 32-bit values in big-endian byte order.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-19 16:02:22 +01:00
Ido Schimmel
9aa77531a1 ipv6: fib_rules: Add flow label support
Implement support for the new flow label selector which allows IPv6 FIB
rules to match on the flow label with a mask. Ensure that both flow
label attributes are specified (or none) and that the mask is valid.
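
The checks described above can be sketched roughly as follows. This is an illustration only, not the kernel code: the function names are hypothetical, and the rule that the label may not set bits outside the mask is an assumption about what a "valid" pair means.

```python
IPV6_FLOWLABEL_MASK = 0x000FFFFF  # the IPv6 flow label is a 20-bit field


def validate_flowlabel_attrs(flowlabel=None, mask=None):
    """Return None if the attribute pair is acceptable, else an error string."""
    if flowlabel is None and mask is None:
        return None  # the rule does not use the selector at all
    if flowlabel is None or mask is None:
        return "flow label and mask must be specified together"
    if mask & ~IPV6_FLOWLABEL_MASK:
        return "mask has bits outside the 20-bit flow label"
    if flowlabel & ~mask:
        # Assumed check: a label bit outside the mask could never match.
        return "flow label has bits not covered by the mask"
    return None


def rule_matches(pkt_flowlabel, flowlabel, mask):
    """Masked comparison between a packet's flow label and the rule."""
    return (pkt_flowlabel & mask) == flowlabel
```

With a label and mask of 1, as in the spec example, any packet whose lowest flow label bit is set would match.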

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-19 16:02:22 +01:00
Ido Schimmel
f0c898d8c2 ipv4: fib_rules: Reject flow label attributes
IPv4 FIB rules cannot match on flow label so reject requests that try to
add such rules. Do that in the IPv4 configure callback as the netlink
policy resides in the core and is used by both IPv4 and IPv6.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-19 16:02:21 +01:00
Ido Schimmel
d1d761b301 net: fib_rules: Add flow label selector attributes
Add new FIB rule attributes which will allow user space to match on the
IPv6 flow label with a mask. Temporarily set the type of the attributes
to 'NLA_REJECT' while support is being added in the IPv6 code.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-19 16:02:21 +01:00
Jakub Kicinski
4fefbc66df Merge branch 'mdio-support-updates'
Nikita Yushchenko says:

====================
rswitch: mdio support updates

This series cleans up rswitch mdio support, and adds C22 operations.
====================

Link: https://patch.msgid.link/20241216071957.2587354-1-nikita.yoush@cogentembedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 19:32:24 -08:00
Nikita Yushchenko
db48fe905d net: renesas: rswitch: add mdio C22 support
The generic MPSM operation added by the previous patch can be used both
for C45 and C22.

Add handlers for C22 operations.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://patch.msgid.link/20241216071957.2587354-6-nikita.yoush@cogentembedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 19:32:06 -08:00
Nikita Yushchenko
2aa722b6d8 net: renesas: rswitch: use generic MPSM operation for mdio C45
Introduce rswitch_etha_mpsm_op() that accepts values for MPSM register
fields and executes the transaction.

This avoids some code duplication, and can be used both for C45 and C22.

Convert C45 read and write operations to use that.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://patch.msgid.link/20241216071957.2587354-5-nikita.yoush@cogentembedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 19:32:05 -08:00
Nikita Yushchenko
1ced1b8cac net: renesas: rswitch: align mdio C45 operations with datasheet
Per the rswitch datasheet, software can detect that an mdio operation has
completed either by polling the MPSM.PSME bit or via interrupt.

Instead, the driver currently polls the interrupt status bit. Although
this still gives the correct result, it requires additional register
operations to clear the interrupt status bits, and generally looks wrong.

Fix it to poll MPSM.PSME bit, as the datasheet suggests.
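
Polling a busy bit until it clears, with a timeout, can be sketched as below. This is a generic illustration and not the driver code: the bit position, the `FakeMpsm` stand-in register and the assumption that the bit clears itself on completion are all hypothetical.

```python
import time

MPSM_PSME = 1 << 0  # hypothetical bit position, for illustration only


def wait_for_bit_clear(read_reg, mask, timeout_s=0.1, poll_interval_s=0.001):
    """Poll read_reg() until (value & mask) is zero; False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if (read_reg() & mask) == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


class FakeMpsm:
    """Stand-in register that reports busy for a fixed number of reads."""

    def __init__(self, busy_reads):
        self.busy_reads = busy_reads

    def read(self):
        if self.busy_reads > 0:
            self.busy_reads -= 1
            return MPSM_PSME
        return 0
```

Because completion is read from the same register that started the transaction, no separate status register has to be cleared afterwards.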

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://patch.msgid.link/20241216071957.2587354-4-nikita.yoush@cogentembedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 19:32:05 -08:00
Nikita Yushchenko
da75ba93e3 net: renesas: rswitch: use FIELD_PREP for remaining MPIC register fields
Commit fb9e6039c3 ("net: renesas: rswitch: fix initial MPIC register
setting") converted setting some MPIC fields to FIELD_PREP.

To keep common style, do the same with mii bus related fields of the
same register.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://patch.msgid.link/20241216071957.2587354-3-nikita.yoush@cogentembedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 19:32:05 -08:00
Nikita Yushchenko
206112fa65 net: renesas: rswitch: do not write to MPSM register at init time
MPSM register is used to execute mdio bus transactions.
There is no need to initialize it early.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://patch.msgid.link/20241216071957.2587354-2-nikita.yoush@cogentembedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 19:32:05 -08:00
Jakub Kicinski
44d49629bf Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
ice: add support for devlink health events

Przemek Kitszel says:

Reports for two kinds of events are implemented, Malicious Driver
Detection (MDD) and Tx hang.

Patches 1, 2, 3: core improvements (checkpatch.pl, devlink extension)
Patch 4: rename current ice devlink/ files
Patches 5, 6, 7: ice devlink health infra + reporters

Mateusz did a good job caring for this series, and hardening the code.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: Add MDD logging via devlink health
  ice: add Tx hang devlink health reporter
  ice: rename devlink_port.[ch] to port.[ch]
  devlink: add devlink_fmsg_dump_skb() function
  devlink: add devlink_fmsg_put() macro
  checkpatch: don't complain on _Generic() use
====================

Link: https://patch.msgid.link/20241217210835.3702003-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 19:17:08 -08:00
Eric Dumazet
a126061c80 ptr_ring: do not block hard interrupts in ptr_ring_resize_multiple()
Jakub added a lockdep_assert_no_hardirq() check in __page_pool_put_page()
to increase test coverage.

syzbot found a splat caused by hard irq blocking in
ptr_ring_resize_multiple() [1]

As current users of ptr_ring_resize_multiple() do not require
hard irqs being masked, replace it to only block BH.

Rename helpers to better reflect they are safe against BH only.

- ptr_ring_resize_multiple() to ptr_ring_resize_multiple_bh()
- skb_array_resize_multiple() to skb_array_resize_multiple_bh()

[1]

WARNING: CPU: 1 PID: 9150 at net/core/page_pool.c:709 __page_pool_put_page net/core/page_pool.c:709 [inline]
WARNING: CPU: 1 PID: 9150 at net/core/page_pool.c:709 page_pool_put_unrefed_netmem+0x157/0xa40 net/core/page_pool.c:780
Modules linked in:
CPU: 1 UID: 0 PID: 9150 Comm: syz.1.1052 Not tainted 6.11.0-rc3-syzkaller-00202-gf8669d7b5f5d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
RIP: 0010:__page_pool_put_page net/core/page_pool.c:709 [inline]
RIP: 0010:page_pool_put_unrefed_netmem+0x157/0xa40 net/core/page_pool.c:780
Code: 74 0e e8 7c aa fb f7 eb 43 e8 75 aa fb f7 eb 3c 65 8b 1d 38 a8 6a 76 31 ff 89 de e8 a3 ae fb f7 85 db 74 0b e8 5a aa fb f7 90 <0f> 0b 90 eb 1d 65 8b 1d 15 a8 6a 76 31 ff 89 de e8 84 ae fb f7 85
RSP: 0018:ffffc9000bda6b58 EFLAGS: 00010083
RAX: ffffffff8997e523 RBX: 0000000000000000 RCX: 0000000000040000
RDX: ffffc9000fbd0000 RSI: 0000000000001842 RDI: 0000000000001843
RBP: 0000000000000000 R08: ffffffff8997df2c R09: 1ffffd40003a000d
R10: dffffc0000000000 R11: fffff940003a000e R12: ffffea0001d00040
R13: ffff88802e8a4000 R14: dffffc0000000000 R15: 00000000ffffffff
FS:  00007fb7aaf716c0(0000) GS:ffff8880b9300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa15a0d4b72 CR3: 00000000561b0000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tun_ptr_free drivers/net/tun.c:617 [inline]
 __ptr_ring_swap_queue include/linux/ptr_ring.h:571 [inline]
 ptr_ring_resize_multiple_noprof include/linux/ptr_ring.h:643 [inline]
 tun_queue_resize drivers/net/tun.c:3694 [inline]
 tun_device_event+0xaaf/0x1080 drivers/net/tun.c:3714
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
 call_netdevice_notifiers net/core/dev.c:2046 [inline]
 dev_change_tx_queue_len+0x158/0x2a0 net/core/dev.c:9024
 do_setlink+0xff6/0x41f0 net/core/rtnetlink.c:2923
 rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3201
 rtnetlink_rcv_msg+0x73f/0xcf0 net/core/rtnetlink.c:6647
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2550

Fixes: ff4e538c8c ("page_pool: add a lockdep check for recycling in hardirq")
Reported-by: syzbot+f56a5c5eac2b28439810@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/671e10df.050a0220.2b8c0f.01cf.GAE@google.com/T/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20241217135121.326370-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 17:55:30 -08:00
shunlizhou
65c233d8e3 docs: net: bonding: fix typos
The bonding documentation used "insure" in several places where
"ensure" is meant. Change these to "ensure" to improve
readability.

Signed-off-by: shunlizhou <shunlizhou@aliyun.com>
Link: https://patch.msgid.link/20241216135447.57681-1-shunlizhou@aliyun.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 17:27:35 -08:00
Yafang Shao
c9cfced173 net/mlx5e: Report rx_discards_phy via rx_dropped
We noticed a high number of rx_discards_phy events on certain servers while
running `ethtool -S`. However, this critical counter is not currently
included in the standard /proc/net/dev statistics file, making it difficult
to monitor effectively—especially given the diversity of vendors across a
large fleet of servers.

Let's report it via the standard rx_dropped metric.
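
For monitoring, the standard counter can be read from /proc/net/dev regardless of vendor. The following is a minimal sketch that parses text in that file's format; the sample values and the interface names are made up.

```python
# Sample text in the /proc/net/dev layout; the numbers are invented.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 1000 10 0 0 0 0 0 0 1000 10 0 0 0 0 0 0
  eth0: 5000 50 0 7 0 0 0 0 4000 40 0 0 0 0 0 0
"""


def rx_dropped(proc_net_dev_text):
    """Map interface name to its receive "drop" counter."""
    result = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines carry no "name:" prefix
        name, data = line.split(":", 1)
        fields = data.split()
        result[name.strip()] = int(fields[3])  # 4th receive column is "drop"
    return result


if __name__ == "__main__":
    print(rx_dropped(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/net/dev.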

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Gal Pressman <gal@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241210022706.6665-1-laoar.shao@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 15:18:25 -08:00
Jakub Kicinski
4b252f2dab Merge branch 'selftests-net-packetdrill-import-multiple-tests'
Soham Chakradeo says:

====================
selftests/net: packetdrill: import multiple tests

Import tests for the following features (folder names in brackets):
ECN (ecn) : RFC 3168
Close (close) : RFC 9293
TCP_INFO (tcp_info) : RFC 9293
Fast recovery (fast_recovery) : RFC 5681
Timestamping (timestamping) : RFC 1323
Nagle (nagle) : RFC 896
Selective Acknowledgments (sack) : RFC 2018
Recent Timestamp (ts_recent) : RFC 1323
Send file (sendfile)
Syscall bad arg (syscall_bad_arg)
Validate (validate)
Blocking (blocking)
Splice (splice)
End of record (eor)
Limited transmit (limited_transmit)

The procedure to import and test the packetdrill tests in upstream Linux
is explained in the first patch of this series.

These tests have many authors. We only import them here from
github.com/google/packetdrill. Thanks to the following authors for their
contributions over the years to these tests: Neal Cardwell, Shuo Chen,
Yuchung Cheng, Jerry Chu, Eric Dumazet, Luke Hsiao, Priyaranjan Jha,
Chonggang Li, Tanner Love, John Sperbeck, Wei Wang and Maciej
Żenczykowski. For more info see the original github commits, such as
https://github.com/google/packetdrill/commit/8229c94928ac.
====================

Link: https://patch.msgid.link/20241217185203.297935-1-sohamch.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 10:05:52 -08:00
Soham Chakradeo
5d4cadef52 selftests/net: packetdrill: import tcp/user_timeout, tcp/validate, tcp/sendfile, tcp/limited-transmit, tcp/syscall_bad_arg
Use the standard import and testing method, as described in the
import of tcp/ecn, tcp/close, tcp/sack, tcp/tcp_info.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Soham Chakradeo <sohamch@google.com>
Link: https://patch.msgid.link/20241217185203.297935-5-sohamch.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 10:05:48 -08:00
Soham Chakradeo
6f66920539 selftests/net: packetdrill: import tcp/eor, tcp/splice, tcp/ts_recent, tcp/blocking
Use the standard import and testing method, as described in the
import of tcp/ecn, tcp/close, tcp/sack, tcp/tcp_info.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Soham Chakradeo <sohamch@google.com>
Link: https://patch.msgid.link/20241217185203.297935-4-sohamch.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 10:05:33 -08:00
Soham Chakradeo
eab35989cc selftests/net: packetdrill: import tcp/fast_recovery, tcp/nagle, tcp/timestamping
Use the standard import and testing method, as described in the
import of tcp/ecn, tcp/close, tcp/sack, tcp/tcp_info.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Soham Chakradeo <sohamch@google.com>
Link: https://patch.msgid.link/20241217185203.297935-3-sohamch.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 10:05:33 -08:00
Soham Chakradeo
88395c071f selftests/net: packetdrill: import tcp/ecn, tcp/close, tcp/sack, tcp/tcp_info
Same as initial tests, import verbatim from
github.com/google/packetdrill, aside from:

- update `source ./defaults.sh` path to adjust for flat dir
- add SPDX headers
- remove author statements if any
- drop blank lines at EOF

Same test process as previous tests. Both with and without debug mode.
Recording the steps once:

make mrproper
vng --build \
--config tools/testing/selftests/net/packetdrill/config \
--config kernel/configs/debug.config
vng -v --run . --user root --cpus 4 -- \
make -C tools/testing/selftests TARGETS=net/packetdrill run_tests

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Soham Chakradeo <sohamch@google.com>
Link: https://patch.msgid.link/20241217185203.297935-2-sohamch.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 10:05:28 -08:00
Dr. David Alan Gilbert
c1bad69f8b net: Remove bouncing hippi list
linux-hippi is bouncing with:

 <linux-hippi@sunsite.dk>:
 Sorry, no mailbox here by that name. (#5.1.1)

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-18 12:51:19 +00:00
Andrew Lunn
5a49edec44 net: dsa: qca8k: Fix inconsistent use of jiffies vs milliseconds
wait_for_completion_timeout() expects a timeout in jiffies. Within the
driver, some call sites converted QCA8K_ETHERNET_TIMEOUT to jiffies,
others did not. Make the code consistent by changing the #define to
include a call to msecs_to_jiffies, and remove all other calls to
msecs_to_jiffies.
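
Why the mix-up matters can be shown with a simplified model of the conversion. This is a sketch only: the rounding-up formula is a simplification of the kernel helper, and the 5 ms timeout is a hypothetical value chosen for illustration.

```python
def msecs_to_jiffies(ms, hz):
    """Round a millisecond interval up to whole ticks at hz ticks per second."""
    return (ms * hz + 999) // 1000


def jiffies_to_msecs(jiffies, hz):
    """Convert a tick count back to milliseconds."""
    return jiffies * 1000 // hz


TIMEOUT_MS = 5  # hypothetical timeout, for illustration only

if __name__ == "__main__":
    for hz in (100, 250, 1000):
        converted = msecs_to_jiffies(TIMEOUT_MS, hz)
        # Passing the raw millisecond number where jiffies are expected
        # waits for TIMEOUT_MS ticks instead of TIMEOUT_MS milliseconds.
        unconverted_wait_ms = jiffies_to_msecs(TIMEOUT_MS, hz)
        print(f"HZ={hz}: converted={converted} jiffies, "
              f"unconverted waits {unconverted_wait_ms} ms")
```

Only at a tick rate of 1000 do the two interpretations coincide, which is why unconverted call sites can go unnoticed.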

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: from Christian would be very welcome.
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-18 12:46:45 +00:00
Jakub Kicinski
2b9da35f48 Merge branch 'support-some-features-for-the-hibmcge-driver'
Jijie Shao says:

====================
Support some features for the HIBMCGE driver

In this patch series, the HIBMCGE driver implements some functions
such as register dump, unicast MAC address filtering, debugfs and reset.
====================

Link: https://patch.msgid.link/20241216040532.1566229-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:41 -08:00
Jijie Shao
adb42b1e0e net: hibmcge: Add nway_reset supported in this module
Add nway_reset support to this module.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241216040532.1566229-8-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:39 -08:00
Jijie Shao
3f5a61f6d5 net: hibmcge: Add reset supported in this module
Sometimes, if the port doesn't work, we can try to fix it by resetting it.

This patch supports reset triggered by ethtool or by a PCIe FLR. For example:
 ethtool --reset eth0 dedicated
 echo 1 > /sys/bus/pci/devices/0000\:83\:00.1/reset

We hope that the reset can be performed only when the port is down,
and the port cannot be up during the reset.
Therefore, the entire reset process is protected by the rtnl lock.

After the reset is complete, the hardware registers are restored
to their default values. Therefore, some rebuild operations are
required to rewrite the user configuration to the registers.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241216040532.1566229-7-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:39 -08:00
Jijie Shao
3a03763f38 net: hibmcge: Add pauseparam supported in this module
The MAC can automatically send or respond to pause frames.
This patch supports the function of enabling pause frames
by using ethtool.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241216040532.1566229-6-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:39 -08:00
Jijie Shao
51574da8dc net: hibmcge: Add register dump supported in this module
Dumping registers is an effective way to analyze problems.

To ensure code flexibility, each register contains the type,
offset, and value information. ethtool pretty-prints the dump
based on this information.

The driver can dynamically add or delete registers that need to be dumped
in the future because information such as type and offset is contained.
ethtool can always pretty-print them.

With the ethtool of a specific version,
the following effects are achieved:
[root@localhost sjj]# ./ethtool -d enp131s0f1
[SPEC] VALID                    [0x0000]: 0x00000001
[SPEC] EVENT_REQ                [0x0004]: 0x00000000
[SPEC] MAC_ID                   [0x0008]: 0x00000002
[SPEC] PHY_ADDR                 [0x000c]: 0x00000002
[SPEC] MAC_ADDR_L               [0x0010]: 0x00000808
[SPEC] MAC_ADDR_H               [0x0014]: 0x08080802
[SPEC] UC_MAX_NUM               [0x0018]: 0x00000004
[SPEC] MAX_MTU                  [0x0028]: 0x00000fc2
[SPEC] MIN_MTU                  [0x002c]: 0x00000100
[SPEC] TX_FIFO_NUM              [0x0030]: 0x00000040
[SPEC] RX_FIFO_NUM              [0x0034]: 0x0000007f
[SPEC] VLAN_LAYERS              [0x0038]: 0x00000002
[MDIO] COMMAND_REG              [0x0000]: 0x0000185f
[MDIO] ADDR_REG                 [0x0004]: 0x00000000
[MDIO] WDATA_REG                [0x0008]: 0x0000a000
[MDIO] RDATA_REG                [0x000c]: 0x00000000
[MDIO] STA_REG                  [0x0010]: 0x00000000
[GMAC] DUPLEX_TYPE              [0x0008]: 0x00000001
[GMAC] FD_FC_TYPE               [0x000c]: 0x00008808
[GMAC] FC_TX_TIMER              [0x001c]: 0x000000ff
[GMAC] FD_FC_ADDR_LOW           [0x0020]: 0xc2000001
[GMAC] FD_FC_ADDR_HIGH          [0x0024]: 0x00000180
[GMAC] MAX_FRM_SIZE             [0x003c]: 0x000005f6
[GMAC] PORT_MODE                [0x0040]: 0x00000002
[GMAC] PORT_EN                  [0x0044]: 0x00000006
...
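
The self-describing record idea can be sketched as follows. The binary layout (three little-endian 32-bit words), the type codes and the name table are all assumptions made for this illustration; the actual driver and ethtool formats may differ.

```python
import struct

REG_TYPES = {0: "SPEC", 1: "MDIO", 2: "GMAC"}  # hypothetical type codes
RECORD = struct.Struct("<III")  # assumed layout: type, offset, value
NAMES = {  # hypothetical name table keyed by (type, offset)
    (0, 0x0000): "VALID",
    (0, 0x0004): "EVENT_REQ",
}


def pack_regs(regs):
    """Serialize (type, offset, value) tuples into one dump blob."""
    return b"".join(RECORD.pack(*reg) for reg in regs)


def format_dump(blob):
    """Pretty-print every record using only what the record itself carries."""
    lines = []
    for reg_type, offset, value in RECORD.iter_unpack(blob):
        type_name = REG_TYPES.get(reg_type, "UNKNOWN")
        reg_name = NAMES.get((reg_type, offset), "UNKNOWN")
        lines.append(
            f"[{type_name}] {reg_name:<24} [0x{offset:04x}]: 0x{value:08x}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(format_dump(pack_regs([(0, 0x0, 1), (0, 0x4, 0)]))))
```

Because each record carries its own type and offset, the printer does not need to know in advance how many registers the driver chose to dump.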

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241216040532.1566229-5-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:38 -08:00
Jijie Shao
37b367d60d net: hibmcge: Add unicast frame filter supported in this module
MAC supports filtering unmatched unicast packets according to
the MAC address table. This patch adds the support for
unicast frame filtering.

To support automatic restoration of MAC entries
after reset, the driver saves a copy of MAC entries in the driver.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20241216040532.1566229-4-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:38 -08:00
Jijie Shao
df491c419b net: hibmcge: Add irq_info file to debugfs
The driver requests three interrupts: "tx", "rx", "err".
The err interrupt is a summary interrupt. We distinguish
different errors based on the status register and mask.

With "cat /proc/interrupts | grep hibmcge",
we can't distinguish the detailed cause of the error,
so we added this file to debugfs.

the following effects are achieved:
[root@localhost sjj]# cat /sys/kernel/debug/hibmcge/0000\:83\:00.1/irq_info
RX                  : enabled: true , logged: false, count: 0
TX                  : enabled: true , logged: false, count: 0
MAC_MII_FIFO_ERR    : enabled: false, logged: true , count: 0
MAC_PCS_RX_FIFO_ERR : enabled: false, logged: true , count: 0
MAC_PCS_TX_FIFO_ERR : enabled: false, logged: true , count: 0
MAC_APP_RX_FIFO_ERR : enabled: false, logged: true , count: 0
MAC_APP_TX_FIFO_ERR : enabled: false, logged: true , count: 0
SRAM_PARITY_ERR     : enabled: true , logged: true , count: 0
TX_AHB_ERR          : enabled: true , logged: true , count: 0
RX_BUF_AVL          : enabled: true , logged: false, count: 0
REL_BUF_ERR         : enabled: true , logged: true , count: 0
TXCFG_AVL           : enabled: true , logged: false, count: 0
TX_DROP             : enabled: true , logged: false, count: 0
RX_DROP             : enabled: true , logged: false, count: 0
RX_AHB_ERR          : enabled: true , logged: true , count: 0
MAC_FIFO_ERR        : enabled: true , logged: false, count: 0
RBREQ_ERR           : enabled: true , logged: false, count: 0
WE_ERR              : enabled: true , logged: false, count: 0

The irq framework of hibmcge driver also includes tx/rx interrupts.
Therefore, TX and RX are not moved separately from this file.
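
Decoding a summary interrupt from a status register and a mask can be sketched as below. The bit assignments are hypothetical; only the names are taken from the irq_info output above.

```python
# Hypothetical bit assignments; the real layout is defined by the hardware.
ERR_BITS = {
    0: "SRAM_PARITY_ERR",
    1: "TX_AHB_ERR",
    2: "RX_AHB_ERR",
    3: "REL_BUF_ERR",
}


def decode_err_irq(status, mask):
    """Return the names of enabled error sources that are currently pending."""
    pending = status & mask  # ignore sources that are masked off
    return [
        name
        for bit, name in sorted(ERR_BITS.items())
        if pending & (1 << bit)
    ]


def count_errors(counts, status, mask):
    """Bump a per-source counter for every pending, enabled error."""
    for name in decode_err_irq(status, mask):
        counts[name] = counts.get(name, 0) + 1
    return counts
```

Per-source counters of this kind are what a file such as irq_info can expose, which /proc/interrupts cannot do for a single shared "err" line.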

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241216040532.1566229-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:38 -08:00
Jijie Shao
86331b5102 net: hibmcge: Add debugfs supported in this module
This patch initializes debugfs and creates root directory
for each device. The tx_ring and rx_ring debugfs files
are implemented together.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241216040532.1566229-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 20:01:38 -08:00
Jakub Kicinski
95dcfdff8b Merge branch 'lan78xx-preparations-for-phylink'
Oleksij Rempel says:

====================
lan78xx: Preparations for PHYlink

This patch set is the third part of the preparatory work for migrating
the lan78xx USB Ethernet driver to the PHYlink framework. During
extensive testing, I observed that resetting the USB adapter can lead to
various read/write errors. While the errors themselves are acceptable,
they generate excessive log messages, resulting in significant log spam.
This set improves error handling to reduce logging noise by addressing
errors directly and returning early when necessary.
====================

Link: https://patch.msgid.link/20241216120941.1690908-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:51:58 -08:00
Oleksij Rempel
01e2f4d55b net: usb: lan78xx: Improve error handling in WoL operations
Enhance error handling in Wake-on-LAN (WoL) operations:
- Log a warning in `lan78xx_get_wol` if `lan78xx_read_reg` fails.
- Check and handle errors from `device_set_wakeup_enable` and
  `phy_ethtool_set_wol` in `lan78xx_set_wol`.
- Ensure proper cleanup with a unified error handling path.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241216120941.1690908-7-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:51:55 -08:00
Oleksij Rempel
d09de7ebd4 net: usb: lan78xx: remove PHY register access from ethtool get_regs
Remove PHY register handling from `lan78xx_get_regs` and
`lan78xx_get_regs_len`. Since the controller can have different PHYs
attached, the first 32 registers are not universally relevant or the
most interesting. Simplify the implementation to focus on MAC and device
registers.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20241216120941.1690908-6-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:51:55 -08:00
Oleksij Rempel
3a59437ed9 net: usb: lan78xx: rename phy_mutex to mdiobus_mutex
Rename `phy_mutex` to `mdiobus_mutex` for clarity, as the mutex protects
MDIO bus access rather than PHY-specific operations. Update all
references to ensure consistency.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20241216120941.1690908-5-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:51:55 -08:00
Oleksij Rempel
7433d022b9 net: usb: lan78xx: Use action-specific label in lan78xx_mac_reset
Rename the generic `done` label to the action-specific `exit_unlock`
label in `lan78xx_mac_reset`. This improves clarity by indicating the
specific cleanup action (mutex unlock) and aligns with best practices
for error handling and cleanup labels.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20241216120941.1690908-4-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:51:54 -08:00
Oleksij Rempel
18bdefe624 net: usb: lan78xx: Use ETIMEDOUT instead of ETIME in lan78xx_stop_hw
Update lan78xx_stop_hw to return -ETIMEDOUT instead of -ETIME when
a timeout occurs. While -ETIME indicates a general timer expiration,
-ETIMEDOUT is more commonly used for signaling operation timeouts and
provides better consistency with standard error handling in the driver.

The -ETIME checks in tx_complete() and rx_complete() are unrelated to
this error handling change. In these functions, the error values are derived
from urb->status, which reflects USB transfer errors. The error value from
lan78xx_stop_hw will be exposed in the following cases:
- usb_driver::suspend
- net_device_ops::ndo_stop (potentially, though currently the return value
  is not used).
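
The two error codes are distinct values with distinct messages, which can be checked from user space. This is a small sketch using Python's standard errno module; it assumes a platform, such as Linux, where both constants are defined.

```python
import errno
import os


def describe(code):
    """Return (symbolic name, number, message) for an errno value."""
    return errno.errorcode[code], code, os.strerror(code)


def maps_to_timeout_error(code):
    """True if Python turns an OSError with this errno into TimeoutError."""
    return isinstance(OSError(code, os.strerror(code)), TimeoutError)


if __name__ == "__main__":
    for code in (errno.ETIME, errno.ETIMEDOUT):
        print(describe(code), maps_to_timeout_error(code))
```

Python itself treats only ETIMEDOUT as a timeout exception, which illustrates why it is the more conventional choice for signaling an operation timeout.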

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20241216120941.1690908-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:51:54 -08:00
Oleksij Rempel
30c63abaee net: usb: lan78xx: Add error handling to lan78xx_get_regs
Update `lan78xx_get_regs` to handle errors during register and PHY
reads. Log warnings for failed reads and exit the function early if an
error occurs. Drop all previously logged registers to signal
inconsistent readings to the user space. This ensures that invalid data
is not returned to users.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20241216120941.1690908-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:51:54 -08:00
Matthew Wilcox (Oracle)
33d06d1d28 niu: Use page->private instead of page->index
We are close to removing page->index.  Use page->private instead, which
is least likely to be removed.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://patch.msgid.link/20241216155124.3114-1-willy@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:46:09 -08:00
Ido Schimmel
1ba06ca96c mlxsw: Switch to napi_gro_receive()
Benefit from the recent conversion of the driver to NAPI and enable GRO
support through the use of napi_gro_receive(). Pass the NAPI pointer
from the bus driver (mlxsw_pci) to the switch driver (mlxsw_spectrum)
through the skb control block where various packet metadata is already
encoded.

The main motivation is to improve forwarding performance through the use
of GRO fraglist [1]. In my testing, when the forwarding data path is
simple (routing between two ports) there is not much difference in
forwarding performance between GRO disabled and GRO enabled with
fraglist.

The improvement becomes more noticeable as the data path becomes more
complex since it is traversed fewer times with GRO enabled. For example,
with 10 ingress and 10 egress flower filters with different priorities
on the two ports between which routing is performed, there is an
improvement of about 140% in forwarded bandwidth.

[1] https://lore.kernel.org/netdev/20200125102645.4782-1-steffen.klassert@secunet.com/

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/21258fe55f608ccf1ee2783a5a4534220af28903.1734354812.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:40:11 -08:00
Jakub Kicinski
3a41305509 Merge branch 'inetpeer-reduce-false-sharing-and-atomic-operations'
Eric Dumazet says:

====================
inetpeer: reduce false sharing and atomic operations

After commit 8c2bd38b95 ("icmp: change the order of rate limits"),
there is a risk that a host receiving packets from a unique
source targeting closed ports is using a common inet_peer structure
from many cpus.

All these cpus have to acquire/release a refcount and update
the inet_peer timestamp (p->dtime)

Switch to pure RCU to avoid changing the refcount, and update
p->dtime only once per jiffy.
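
The "store only when the value changes" part can be sketched as below. Python cannot show cache-line effects, so this only illustrates the logic; the `Peer` class and its `writes` counter are hypothetical and exist just to count stores.

```python
class Peer:
    """Stand-in for a shared structure with a last-used timestamp."""

    def __init__(self):
        self.dtime = 0
        self.writes = 0  # counts stores, to make the saving visible


def touch(peer, now_jiffies):
    """Record that the peer was used at now_jiffies.

    Read first and store only when the value differs, so repeated
    lookups within the same jiffy leave the shared field untouched.
    """
    if peer.dtime != now_jiffies:
        peer.dtime = now_jiffies
        peer.writes += 1


if __name__ == "__main__":
    peer = Peer()
    for _ in range(1000):
        touch(peer, 42)
    print(peer.writes)
```

A thousand lookups in one jiffy cause a single store, so readers on other CPUs are far less likely to find the field modified under them.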

Tested:
  DUT : 128 cores, 32 hw rx queues.
  receiving 8,400,000 UDP packets per second, targeting closed ports.

Before the series:
- napi poll cannot keep up, NIC drops 1,200,000 packets
  per second.
- We use 20 % of cpu cycles.

After this series:
- All packets are received (no more hw drops)
- We use 12 % of cpu cycles.

v1: https://lore.kernel.org/20241213130212.1783302-1-edumazet@google.com
====================

Link: https://patch.msgid.link/20241215175629.1248773-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:37:57 -08:00
Eric Dumazet
a853c60950 inetpeer: do not get a refcount in inet_getpeer()
All inet_getpeer() callers except ip4_frag_init() don't need
to acquire a permanent refcount on the inetpeer.

They can switch to full RCU protection.

Move the refcount_inc_not_zero() into ip4_frag_init(),
so that all the other callers no longer have to
perform a pair of expensive atomic operations on
a possibly contended cache line.
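
For readers unfamiliar with the "increment unless zero" pattern, here is a
minimal user-space sketch. It is not the kernel's refcount_t
implementation: struct obj and obj_get_not_zero() are hypothetical names,
and C11 atomics are assumed. A caller that found the object under
RCU-style protection takes a long-lived reference only if the count has
not already dropped to zero, which is the one case (ip4_frag_init() in
the message above) that still needs the atomic operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted object. */
struct obj {
	atomic_int refcnt;
};

/*
 * Take a reference unless the count is already zero.
 * A zero count means the object is on its way to being freed, so the
 * caller must not use it. Returns true if a reference was taken.
 */
static bool obj_get_not_zero(struct obj *o)
{
	int old;

	assert(o != NULL);
	old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```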

inet_putpeer() no longer needs to be exported.

After this patch, my DUT can receive 8,400,000 UDP packets
per second targeting closed ports, using 50% fewer cpu cycles
than before.

Also replace two calls to l3mdev_master_ifindex() with
l3mdev_master_ifindex_rcu() (Ido's idea).

Fixes: 8c2bd38b95 ("icmp: change the order of rate limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241215175629.1248773-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:37:48 -08:00
Eric Dumazet
50b362f21d inetpeer: update inetpeer timestamp in inet_getpeer()
inet_putpeer() will be removed in the following patch,
because we will no longer use refcounts.

Update inetpeer timestamp (p->dtime) at lookup time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241215175629.1248773-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:37:00 -08:00
Eric Dumazet
7a596a50c4 inetpeer: remove create argument of inet_getpeer()
All callers of inet_getpeer() want to create an inetpeer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241215175629.1248773-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:37:00 -08:00
Eric Dumazet
661cd8fc8e inetpeer: remove create argument of inet_getpeer_v[46]()
All callers of inet_getpeer_v4() and inet_getpeer_v6()
want to create an inetpeer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241215175629.1248773-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:37:00 -08:00
Jakub Kicinski
bf8469fc4d Merge branch 'net-constify-struct-bin_attribute'
Thomas Weißschuh says:

====================
net: constify 'struct bin_attribute'

The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
====================

Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-0-ec460b91f274@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:02:40 -08:00
Thomas Weißschuh
ae026eae08 netxen_nic: constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-4-ec460b91f274@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:00:51 -08:00
Thomas Weißschuh
2d7b422fa7 net: phy: ks8995: constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-2-ec460b91f274@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:00:44 -08:00
Thomas Weißschuh
a2558b410d net: bridge: constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-1-ec460b91f274@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 19:00:43 -08:00
Jakub Kicinski
d3c9510dc9 net: page_pool: rename page_pool_is_last_ref()
page_pool_is_last_ref() releases a reference while the name,
to me at least, suggests it just checks if the refcount is 1.
The semantics of the function are the same as those of
atomic_dec_and_test() and refcount_dec_and_test(), so just
use the _and_test() suffix.
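
The _and_test() semantics referred to above can be sketched in user space
like this. The names struct ref and ref_dec_and_test() are hypothetical
and C11 atomics are assumed; this is not the page_pool code. The function
both drops a reference and reports whether that was the last one, which is
why a name that reads like a pure check is misleading.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a reference counter. */
struct ref {
	atomic_int count;
};

/*
 * Drop one reference.
 * Returns true only for the caller that dropped the last reference,
 * i.e. when the count goes from 1 to 0.
 */
static bool ref_dec_and_test(struct ref *r)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0); /* dropping a reference that was never held is a bug */
	return old == 1;
}
```

With a counter initialized to 2, the first call returns false and the
second returns true; each call changes the counter.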

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://patch.msgid.link/20241215212938.99210-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17 17:45:17 -08:00
Ben Shelton
bc10274739 ice: Add MDD logging via devlink health
Add a devlink health reporter for MDD events. The 'dump' handler will
return the information captured in each call to ice_handle_mdd_event().
A device reset (CORER/PFR) will put the reporter back in healthy state.

Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Co-developed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-17 11:32:51 -08:00