Commit Graph

1279096 Commits

Author SHA1 Message Date
Donald Hunter
cb7351ac17 doc: netlink: Fix formatting of op flags in generated .rst
Generate op flags as an inline list instead of a stringified python
value.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20240528140652.9445-4-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 18:10:26 -07:00
Donald Hunter
ebf9004136 doc: netlink: Don't 'sanitize' op docstrings in generated .rst
The doc strings for do/dump ops are emitted as toplevel .rst constructs
so they can be multi-line. Pass multi-line text straight through to the
.rst to retain any simple formatting from the .yaml

This fixes e.g. list formatting for the pin-get docs in dpll.yaml:

https://docs.kernel.org/6.9/networking/netlink_spec/dpll.html#pin-get

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20240528140652.9445-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 18:10:25 -07:00
Donald Hunter
c697f515b6 doc: netlink: Fix generated .rst for multi-line docs
Fix the newline replacement in ynl-gen-rst.py to put spaces between
concatenated lines. This fixes the broken doc string formatting.

See the dpll docs for an example of broken concatenation:

https://docs.kernel.org/6.9/networking/netlink_spec/dpll.html#lock-status

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20240528140652.9445-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 18:10:25 -07:00
Jakub Kicinski
1e37449fe3 Merge branch 'net-ethernet-ti-am65-cpsw-nuss-support-stacked-switches'
Alexander Sverdlin says:

====================
net: ethernet: ti: am65-cpsw-nuss: support stacked switches

Currently an external Ethernet switch connected to a am65-cpsw-nuss CPU
port will not be probed successfully because of_find_net_device_by_node()
will not be able to find the netdev of the CPU port.

It's necessary to populate of_node of the struct device for the
am65-cpsw-nuss ports. DT nodes of the ports are already stored in per-port
private data, but because of some legacy reasons the naming ("phy_node")
was misleading.
====================

Link: https://lore.kernel.org/r/20240528075954.3608118-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:32:51 -07:00
Alexander Sverdlin
29c71bf2a0 net: ethernet: ti: am65-cpsw-nuss: populate netdev of_node
So that of_find_net_device_by_node() can find cpsw-nuss ports and other DSA
switches can be stacked downstream.

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://lore.kernel.org/r/20240528075954.3608118-3-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:32:49 -07:00
Alexander Sverdlin
78269025e1 net: ethernet: ti: am65-cpsw-nuss: rename phy_node -> port_np
Rename phy_node to port_np to better reflect what it actually is,
because the new phylink API takes netdev node (or DSA port node),
and resolves the phandle internally.

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://lore.kernel.org/r/20240528075954.3608118-2-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:32:49 -07:00
Jakub Kicinski
0f4b437b5f Merge branch 'tcp-fix-tcp_poll-races'
Eric Dumazet says:

====================
tcp: fix tcp_poll() races

Flakes in packetdrill tests stressing epoll_wait()
were root caused to bad ordering in tcp_write_err()

Precisely, we have to call sk_error_report() after
tcp_done().

When fixing this issue, we discovered tcp_abort(),
tcp_v4_err() and tcp_v6_err() had similar issues.

Since tcp_reset() has the correct ordering,
first patch takes part of it and creates
tcp_done_with_error() helper.
====================

Link: https://lore.kernel.org/r/20240528125253.1966136-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:21:38 -07:00
Eric Dumazet
fde6f897f2 tcp: fix races in tcp_v[46]_err()
These functions have races when they:

1) Write sk->sk_err
2) call sk_error_report(sk)
3) call tcp_done(sk)

As described in prior patches in this series:

An smp_wmb() is missing.
We should call tcp_done() before sk_error_report(sk)
to have consistent tcp_poll() results on SMP hosts.

Use tcp_done_with_error() where we centralized the
correct sequence.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:21:36 -07:00
Eric Dumazet
5ce4645c23 tcp: fix races in tcp_abort()
tcp_abort() has the same issue than the one fixed in the prior patch
in tcp_write_err().

In order to get consistent results from tcp_poll(), we must call
sk_error_report() after tcp_done().

We can use tcp_done_with_error() to centralize this logic.

Fixes: c1e64e298b ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:21:35 -07:00
Eric Dumazet
853c3bd7b7 tcp: fix race in tcp_write_err()
I noticed flakes in a packetdrill test, expecting an epoll_wait()
to return EPOLLERR | EPOLLHUP on a failed connect() attempt,
after multiple SYN retransmits. It sometimes return EPOLLERR only.

The issue is that tcp_write_err():
 1) writes an error in sk->sk_err,
 2) calls sk_error_report(),
 3) then calls tcp_done().

tcp_done() is writing SHUTDOWN_MASK into sk->sk_shutdown,
among other things.

Problem is that the awaken user thread (from 2) sk_error_report())
might call tcp_poll() before tcp_done() has written sk->sk_shutdown.

tcp_poll() only sees a non zero sk->sk_err and returns EPOLLERR.

This patch fixes the issue by making sure to call sk_error_report()
after tcp_done().

tcp_write_err() also lacks an smp_wmb().

We can reuse tcp_done_with_error() to factor out the details,
as Neal suggested.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:21:35 -07:00
Eric Dumazet
5e514f1cba tcp: add tcp_done_with_error() helper
tcp_reset() ends with a sequence that is carefuly ordered.

We need to fix [e]poll bugs in the following patches,
it makes sense to use a common helper.

Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:21:35 -07:00
Breno Leitao
c3390677f6 netconsole: Do not shutdown dynamic configuration if cmdline is invalid
If a user provides an invalid netconsole configuration during boot time
(e.g., specifying an invalid ethX interface), netconsole will be
entirely disabled. Consequently, the user won't be able to create new
entries in /sys/kernel/config/netconsole/ as that directory does not
exist.

Apart from misconfiguration, another issue arises when ethX is loaded as
a module and the netconsole= line in the command line points to ethX,
resulting in an obvious failure. This renders netconsole unusable, as
/sys/kernel/config/netconsole/ will never appear. This is more annoying
since users reconfigure (or just toggle) the configuratin later (see
commit 5fbd6cdbe3 ("netconsole: Attach cmdline target to dynamic
target"))

Create /sys/kernel/config/netconsole/ even if the command line arguments
are invalid, so, users can create dynamic entries in netconsole.

Reported-by: Aijay Adams <aijay@meta.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240528084225.3215853-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:19:50 -07:00
Jonas Karlman
544a74c32b dt-bindings: net: rockchip-dwmac: Fix rockchip,rk3308-gmac compatible
Schema validation using rockchip,rk3308-gmac compatible fails with:

  ethernet@ff4e0000: compatible: ['rockchip,rk3308-gmac'] does not contain items matching the given schema
        from schema $id: http://devicetree.org/schemas/net/rockchip-dwmac.yaml#
  ethernet@ff4e0000: Unevaluated properties are not allowed ('interrupt-names', 'interrupts', 'phy-mode',
                     'reg', 'reset-names', 'resets', 'snps,reset-active-low', 'snps,reset-delays-us',
                     'snps,reset-gpio' were unexpected)
        from schema $id: http://devicetree.org/schemas/net/rockchip-dwmac.yaml#

Add rockchip,rk3308-gmac to snps,dwmac.yaml to fix DT schema validation.

Fixes: 2cc8c910f5 ("dt-bindings: net: rockchip-dwmac: add rk3308 gmac compatible")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240528093751.3690231-1-jonas@kwiboo.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:12:39 -07:00
Alexander Sverdlin
126913479e net: dsa: lan9303: imply SMSC_PHY
Both LAN9303 and LAN9354 have internal PHYs on both external ports.
Therefore a configuration without SMSC PHY support is non-practical at
least and leads to:

LAN9303_MDIO 8000f00.mdio:00: Found LAN9303 rev. 1
mdio_bus 8000f00.mdio:00: deferred probe pending: (reason unknown)

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240528073147.3604083-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-29 17:11:45 -07:00
David S. Miller
782471db6c Merge branch 'xilinx-clock-support'
Vineeth Karumanchi says:

====================
net: xilinx_gmii2rgmii: Add clock support

Add input clock support to gmii_to_rgmii IP.
Add "clocks" bindings for the input clock.

Changes in v3:
- Added items constraints.

Changes in v2:
- removed "clkin" clock name property.
v2 link : https://lore.kernel.org/netdev/20240517054745.4111922-1-vineeth.karumanchi@amd.com/

v1 link : https://lore.kernel.org/netdev/20240515094645.3691877-1-vineeth.karumanchi@amd.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-29 13:10:57 +01:00
Vineeth Karumanchi
daab0ac53e net: phy: xilinx-gmii2rgmii: Adopt clock support
Add clock support to the gmii_to_rgmii IP.
Make clk optional to keep DTB backward compatibility.

Signed-off-by: Vineeth Karumanchi <vineeth.karumanchi@amd.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-29 13:10:57 +01:00
Vineeth Karumanchi
c1d9667108 dt-bindings: net: xilinx_gmii2rgmii: Add clock support
Add "clocks" bindings for the input clock.

Signed-off-by: Vineeth Karumanchi <vineeth.karumanchi@amd.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-29 13:10:57 +01:00
Linus Walleij
2942dfab63 net: ethernet: cortina: Restore TSO support
An earlier commit deleted the TSO support in the Cortina Gemini
driver because the driver was confusing gso_size and MTU,
probably because what the Linux kernel calls "gso_size" was
called "MTU" in the datasheet.

Restore the functionality properly reading the gso_size from
the skbuff.

Tested with iperf3, running a server on a different machine
and client on the device with the cortina gemini ethernet:

Connecting to host 192.168.1.2, port 5201
60008000.ethernet-port eth0: segment offloading mss = 05ea len=1c8a
60008000.ethernet-port eth0: segment offloading mss = 05ea len=1c8a
60008000.ethernet-port eth0: segment offloading mss = 05ea len=27da
60008000.ethernet-port eth0: segment offloading mss = 05ea len=0b92
60008000.ethernet-port eth0: segment offloading mss = 05ea len=2bda
(...)

(The hardware MSS 0x05ea here includes the ethernet headers.)

If I disable all segment offloading on the receiving host and
dump packets using tcpdump -xx like this:

ethtool -K enp2s0 gro off gso off tso off
tcpdump -xx -i enp2s0 host 192.168.1.136

I get segmented packages such as this when running iperf3:

23:16:54.024139 IP OpenWrt.lan.59168 > Fecusia.targus-getdata1:
Flags [.], seq 1486:2934, ack 1, win 4198,
options [nop,nop,TS val 3886192908 ecr 3601341877], length 1448
0x0000:  fc34 9701 a0c6 14d6 4da8 3c4f 0800 4500
0x0010:  05dc 16a0 4000 4006 9aa1 c0a8 0188 c0a8
0x0020:  0102 e720 1451 ff25 9822 4c52 29cf 8010
0x0030:  1066 ac8c 0000 0101 080a e7a2 990c d6a8
(...)
0x05c0:  5e49 e109 fe8c 4617 5e18 7a82 7eae d647
0x05d0:  e8ee ae64 dc88 c897 3f8a 07a4 3a33 6b1b
0x05e0:  3501 a30f 2758 cc44 4b4a

Several such packets often follow after each other verifying
the segmentation into 0x05a8 (1448) byte packages also on the
reveiving end. As can be seen, the ethernet frames are
0x05ea (1514) in size.

Performance with iperf3 before this patch: ~15.5 Mbit/s
Performance with iperf3 after this patch: ~175 Mbit/s

This was running a 60 second test (twice) the best measurement
was 179 Mbit/s.

For comparison if I run iperf3 with UDP I get around 1.05 Mbit/s
both before and after this patch.

While this is a gigabit ethernet interface, the CPU is a cheap
D-Link DIR-685 router (based on the ARMv5 Faraday FA526 at
~50 MHz), and the software is not supposed to drive traffic,
as the device has a DSA chip, so this kind of numbers can be
expected.

Fixes: ac631873c9 ("net: ethernet: cortina: Drop TSO support")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-29 10:41:55 +01:00
Jakub Kicinski
93bda33046 Merge branch 'net-constify-ctl_table-arguments-of-utility-functions'
Thomas Weißschuh says:

====================
net: constify ctl_table arguments of utility functions

The sysctl core is preparing to only expose instances of
struct ctl_table as "const".
This will also affect the ctl_table argument of sysctl handlers.

As the function prototype of all sysctl handlers throughout the tree
needs to stay consistent that change will be done in one commit.

To reduce the size of that final commit, switch utility functions which
are not bound by "typedef proc_handler" to "const struct ctl_table".

No functional change.

This patch(set) is meant to be applied through your subsystem tree.
Or at your preference through the sysctl tree.

Motivation
==========

Moving structures containing function pointers into unmodifiable .rodata
prevents attackers or bugs from corrupting and diverting those pointers.

Also the "struct ctl_table" exposed by the sysctl core were never meant
to be mutated by users.

For this goal changes to both the sysctl core and "const" qualifiers for
various sysctl APIs are necessary.
====================

Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-0-16523767d0b2@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:49:51 -07:00
Thomas Weißschuh
0a9f788fdd ipvs: constify ctl_table arguments of utility functions
The sysctl core is preparing to only expose instances of
struct ctl_table as "const".
This will also affect the ctl_table argument of sysctl handlers.

As the function prototype of all sysctl handlers throughout the tree
needs to stay consistent that change will be done in one commit.

To reduce the size of that final commit, switch utility functions which
are not bound by "typedef proc_handler" to "const struct ctl_table".

No functional change.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-5-16523767d0b2@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:49:47 -07:00
Thomas Weißschuh
7a20cd1e71 net/ipv6/ndisc: constify ctl_table arguments of utility function
The sysctl core is preparing to only expose instances of
struct ctl_table as "const".
This will also affect the ctl_table argument of sysctl handlers.

As the function prototype of all sysctl handlers throughout the tree
needs to stay consistent that change will be done in one commit.

To reduce the size of that final commit, switch utility functions which
are not bound by "typedef proc_handler" to "const struct ctl_table".

No functional change.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-4-16523767d0b2@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:49:47 -07:00
Thomas Weißschuh
c55eb03765 net/ipv6/addrconf: constify ctl_table arguments of utility functions
The sysctl core is preparing to only expose instances of
struct ctl_table as "const".
This will also affect the ctl_table argument of sysctl handlers.

As the function prototype of all sysctl handlers throughout the tree
needs to stay consistent that change will be done in one commit.

To reduce the size of that final commit, switch utility functions which
are not bound by "typedef proc_handler" to "const struct ctl_table".

No functional change.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-3-16523767d0b2@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:49:47 -07:00
Thomas Weißschuh
551814313f net/ipv4/sysctl: constify ctl_table arguments of utility functions
The sysctl core is preparing to only expose instances of
struct ctl_table as "const".
This will also affect the ctl_table argument of sysctl handlers.

As the function prototype of all sysctl handlers throughout the tree
needs to stay consistent that change will be done in one commit.

To reduce the size of that final commit, switch utility functions which
are not bound by "typedef proc_handler" to "const struct ctl_table".

No functional change.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-2-16523767d0b2@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:49:47 -07:00
Thomas Weißschuh
874aa96d78 net/neighbour: constify ctl_table arguments of utility function
The sysctl core is preparing to only expose instances of
struct ctl_table as "const".
This will also affect the ctl_table argument of sysctl handlers.

As the function prototype of all sysctl handlers throughout the tree
needs to stay consistent that change will be done in one commit.

To reduce the size of that final commit, switch utility functions which
are not bound by "typedef proc_handler" to "const struct ctl_table".

No functional change.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20240527-sysctl-const-handler-net-v1-1-16523767d0b2@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:49:47 -07:00
Heiner Kallweit
982300c115 r8169: remove detection of chip version 11 (early RTL8168b)
This early RTL8168b version was the first PCIe chip version, and it's
quite quirky. Last sign of life is from more than 15 yrs ago.
Let's remove detection of this chip version, we'll see whether anybody
complains. If not, support for this chip version can be removed a few
kernel versions later.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/875cdcf4-843c-420a-ad5d-417447b68572@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:47:42 -07:00
Heiner Kallweit
6994520a33 r8169: disable interrupt source RxOverflow
Vendor driver calls this bit RxDescUnavail. All we do in the interrupt
handler in this case is scheduling NAPI. If we should be out of
RX descriptors, then NAPI is scheduled anyway. Therefore remove this
interrupt source. Tested on RTL8168h.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Sunil Goutham <sgoutham@marvell.com>
Link: https://lore.kernel.org/r/9b2054b2-0548-4f48-bf91-b646572093b4@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 19:47:24 -07:00
Jakub Kicinski
4b3529edbb Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2024-05-28

We've added 23 non-merge commits during the last 11 day(s) which contain
a total of 45 files changed, 696 insertions(+), 277 deletions(-).

The main changes are:

1) Rename skb's mono_delivery_time to tstamp_type for extensibility
   and add SKB_CLOCK_TAI type support to bpf_skb_set_tstamp(),
   from Abhishek Chauhan.

2) Add netfilter CT zone ID and direction to bpf_ct_opts so that arbitrary
   CT zones can be used from XDP/tc BPF netfilter CT helper functions,
   from Brad Cowie.

3) Several tweaks to the instruction-set.rst IETF doc to address
   the Last Call review comments, from Dave Thaler.

4) Small batch of riscv64 BPF JIT optimizations in order to emit more
   compressed instructions to the JITed image for better icache efficiency,
   from Xiao Wang.

5) Sort bpftool C dump output from BTF, aiming to simplify vmlinux.h
   diffing and forcing more natural type definitions ordering,
   from Mykyta Yatsenko.

6) Use DEV_STATS_INC() macro in BPF redirect helpers to silence
   a syzbot/KCSAN race report for the tx_errors counter,
   from Jiang Yunshui.

7) Un-constify bpf_func_info in bpftool to fix compilation with LLVM 17+
   which started treating const structs as constants and thus breaking
   full BTF program name resolution, from Ivan Babrou.

8) Fix up BPF program numbers in test_sockmap selftest in order to reduce
   some of the test-internal array sizes, from Geliang Tang.

9) Small cleanup in Makefile.btf script to use test-ge check for v1.25-only
   pahole, from Alan Maguire.

10) Fix bpftool's make dependencies for vmlinux.h in order to avoid needless
    rebuilds in some corner cases, from Artem Savkov.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
  bpf, net: Use DEV_STAT_INC()
  bpf, docs: Fix instruction.rst indentation
  bpf, docs: Clarify call local offset
  bpf, docs: Add table captions
  bpf, docs: clarify sign extension of 64-bit use of 32-bit imm
  bpf, docs: Use RFC 2119 language for ISA requirements
  bpf, docs: Move sentence about returning R0 to abi.rst
  bpf: constify member bpf_sysctl_kern:: Table
  riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT
  riscv, bpf: Use STACK_ALIGN macro for size rounding up
  riscv, bpf: Optimize zextw insn with Zba extension
  selftests/bpf: Handle forwarding of UDP CLOCK_TAI packets
  net: Add additional bit to support clockid_t timestamp type
  net: Rename mono_delivery_time to tstamp_type for scalabilty
  selftests/bpf: Update tests for new ct zone opts for nf_conntrack kfuncs
  net: netfilter: Make ct zone opts configurable for bpf ct helpers
  selftests/bpf: Fix prog numbers in test_sockmap
  bpf: Remove unused variable "prev_state"
  bpftool: Un-const bpf_func_info to fix it for llvm 17 and newer
  bpf: Fix order of args in call to bpf_map_kvcalloc
  ...
====================

Link: https://lore.kernel.org/r/20240528105924.30905-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 07:27:29 -07:00
Dr. David Alan Gilbert
c30ff5f3ae net: usb: remove unused structs 'usb_context'
Both lan78xx and smsc75xx have a 'usb_context'
struct which is unused, since their original commits.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20240526205922.176578-1-linux@treblig.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 15:24:34 +02:00
Dr. David Alan Gilbert
18ae4c093c net: ethernet: 8390: ne2k-pci: remove unused struct 'ne2k_pci_card'
'ne2k_pci_card' is unused since 2.3.99-pre3 in March 2000.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 15:21:12 +02:00
Dr. David Alan Gilbert
ef7f9febb3 net: ethernet: mlx4: remove unused struct 'mlx4_port_config'
'mlx4_port_config was added by
commit ab9c17a009 ("mlx4_core: Modify driver initialization flow to
accommodate SRIOV for Ethernet")
but remained unused.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 15:21:12 +02:00
Dr. David Alan Gilbert
a09892f6e2 net: ethernet: liquidio: remove unused structs
'niclist' and 'oct_link_status_resp' are unused since the original
commit f21fb3ed36 ("Add support of Cavium Liquidio ethernet
adapters").

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 15:21:11 +02:00
Dr. David Alan Gilbert
b2ff269850 net: ethernet: starfire: remove unused structs
'short_rx_done_desc' and 'basic_rx_done_desc' are unused since
commit fdecea6668 ("  [netdrvr starfire] Add GPL'd firmware, remove
compat code").

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 15:21:04 +02:00
Gou Hao
de31e96cf4 net/core: move the lockdep-init of sk_callback_lock to sk_init_common()
In commit cdfbabfb2f ("net: Work around lockdep limitation in
sockets that use sockets"), it introduces 'af_kern_callback_keys'
to lockdep-init of sk_callback_lock according to 'sk_kern_sock',
it modifies sock_init_data() only, and sk_clone_lock() calls
sk_init_common() to initialize sk_callback_lock too, so the
lockdep-init of sk_callback_lock should be moved to sk_init_common().

Signed-off-by: Gou Hao <gouhao@uniontech.com>
Link: https://lore.kernel.org/r/20240526145718.9542-2-gouhao@uniontech.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 13:29:36 +02:00
Gou Hao
c65b652111 net/core: remove redundant sk_callback_lock initialization
sk_callback_lock has already been initialized in sk_init_common().

Signed-off-by: Gou Hao <gouhao@uniontech.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240526145718.9542-1-gouhao@uniontech.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 13:29:36 +02:00
yunshui
d9cbd8343b bpf, net: Use DEV_STAT_INC()
syzbot/KCSAN reported that races happen when multiple CPUs updating
dev->stats.tx_error concurrently. Adopt SMP safe DEV_STATS_INC() to
update the dev->stats fields.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: yunshui <jiangyunshui@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240523033520.4029314-1-jiangyunshui@kylinos.cn
2024-05-28 12:04:11 +02:00
Dr. David Alan Gilbert
5233a55a52 mISDN: remove unused struct 'bf_ctx'
'bf_ctx' appears unused since the original
commit 960366cf8d ("Add mISDN DSP").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20240523155922.67329-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-27 16:48:00 -07:00
Dave Thaler
e245ef8a0b bpf, docs: Fix instruction.rst indentation
The table captions patch corrected indented most tables to work with
the table directive for adding a caption but missed two of them.

Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240526061815.22497-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-26 09:58:42 -07:00
Dave Thaler
f980f13e4e bpf, docs: Clarify call local offset
In the Jump instructions section it explains that the offset is
"relative to the instruction following the jump instruction".
But the program-local section confusingly said "referenced by
offset from the call instruction, similar to JA".

This patch updates that sentence with consistent wording, saying
it's relative to the instruction following the call instruction.

Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240525153332.21355-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25 10:41:57 -07:00
Dave Thaler
6a6d8b6f00 bpf, docs: Add table captions
As suggested by Ines Robles in his IETF GENART review at
https://datatracker.ietf.org/doc/review-ietf-bpf-isa-02-genart-lc-robles-2024-05-16/

Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
Link: https://lore.kernel.org/r/20240524164618.18894-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25 10:39:31 -07:00
Dave Thaler
4e1215d9a1 bpf, docs: clarify sign extension of 64-bit use of 32-bit imm
imm is defined as a 32-bit signed integer.

{MOV, K, ALU64} says it does "dst = src" (where src is 'imm') and it
does do dst = (s64)imm, which in that sense does sign extend imm. The MOVSX
instruction is explained as sign extending, so added the example of
{MOV, K, ALU64} to make this more clear.

{JLE, K, JMP} says it does "PC += offset if dst <= src" (where src is 'imm',
and the comparison is unsigned). This was apparently ambiguous to some
readers as to whether the comparison was "dst <= (u64)(u32)imm" or
"dst <= (u64)(s64)imm" so added an example to make this more clear.

v1 -> v2: Address comments from Yonghong

Signed-off-by: Dave Thaler <dthaler1968@googlemail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20240520215255.10595-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25 10:39:03 -07:00
Dave Thaler
a985fdca5e bpf, docs: Use RFC 2119 language for ISA requirements
Per IETF convention and discussion at LSF/MM/BPF, use MUST etc.
keywords as requested by IETF Area Director review.  Also as
requested, indicate that documenting BTF is out of scope of this
document and will be covered by a separate IETF specification.

Added paragraph about the terminology that is required IETF boilerplate
and must be worded exactly as such.

Signed-off-by: Dave Thaler <dthaler1968@googlemail.com>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20240517165855.4688-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25 10:38:35 -07:00
Dave Thaler
4652072e7b bpf, docs: Move sentence about returning R0 to abi.rst
As discussed at LSF/MM/BPF, the sentence about using R0 for returning
values from calls is part of the calling convention and belongs in
abi.rst.  Any further additions or clarifications to this text are left
for future patches on abi.rst.  The current patch is simply to unblock
progression of instruction-set.rst to a standard.

In contrast, the restriction of register numbers to the range 0-10
is untouched, left in the instruction-set.rst definition of the
src_reg and dst_reg fields.

Signed-off-by: Dave Thaler <dthaler1968@googlemail.com>
Link: https://lore.kernel.org/r/20240517153445.3914-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25 10:37:49 -07:00
Thomas Weißschuh
2c1713a8f1 bpf: constify member bpf_sysctl_kern:: Table
The sysctl core is preparing to only expose instances of struct ctl_table
as "const". This will also affect the ctl_table argument of sysctl handlers,
for which bpf_sysctl_kern::table is also used.

As the function prototype of all sysctl handlers throughout the tree
needs to stay consistent that change will be done in one commit.

To reduce the size of that final commit, switch this utility type which
is not bound by "typedef proc_handler" to "const struct ctl_table".

No functional change.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Joel Granados <j.granados@samsung.com>
Link: https://lore.kernel.org/bpf/20240518-sysctl-const-handler-bpf-v1-1-f0d7186743c1@weissschuh.net
2024-05-24 17:44:32 +02:00
Xiao Wang
99fa63d9ca riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT
We could try to emit compressed insn for reg move operation during CMPXCHG
JIT, the instruction compression has no impact on the jump offsets of
following forward and backward jump instructions.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240519050507.2217791-1-xiao.w.wang@intel.com
2024-05-24 17:40:33 +02:00
Xiao Wang
e944fc8152 riscv, bpf: Use STACK_ALIGN macro for size rounding up
Use the macro STACK_ALIGN that is defined in asm/processor.h for stack size
rounding up, just like bpf_jit_comp32.c does.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240523031835.3977713-1-xiao.w.wang@intel.com
2024-05-24 17:16:56 +02:00
Xiao Wang
c12603e76e riscv, bpf: Optimize zextw insn with Zba extension
The Zba extension provides add.uw insn which can be used to implement
zext.w with rs2 set as ZERO.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240516090430.493122-1-xiao.w.wang@intel.com
2024-05-24 16:53:12 +02:00
Martin KaFai Lau
ecec1887e2 Merge branch 'Replace mono_delivery_time with tstamp_type'
Abhishek Chauhan says:

====================
Patch 1 :- This patch takes care of only renaming the mono delivery
timestamp to tstamp_type with no change in functionality of
existing available code in kernel also
Starts assigning tstamp_type with either mono or real and
introduces a new enum in the skbuff.h, again no change in functionality
of the existing available code in kernel , just making the code scalable.

Patch 2 :- Additional bit was added to support tai timestamp type to
avoid tstamp drops in the forwarding path when testing TC-ETF.
Patch is also updating bpf filter.c
Some updates to bpf header files with introduction to BPF_SKB_CLOCK_TAI
and documentation updates stating deprecation of BPF_SKB_TSTAMP_UNSPEC
and BPF_SKB_TSTAMP_DELIVERY_MONO

Patch 3:- Handles forwarding of UDP packets with TAI clock id tstamp_type
type with supported changes for tc_redirect/tc_redirect_dtime
to handle forwarding of UDP packets with TAI tstamp_type
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-23 14:14:50 -07:00
Abhishek Chauhan
c34e3ab2a7 selftests/bpf: Handle forwarding of UDP CLOCK_TAI packets
With changes in the design to forward CLOCK_TAI in the skbuff
framework,  existing selftest framework needs modification
to handle forwarding of UDP packets with CLOCK_TAI as clockid.

Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509211834.3235191-4-quic_abchauha@quicinc.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-23 14:14:43 -07:00
Abhishek Chauhan
1693c5db6a net: Add additional bit to support clockid_t timestamp type
tstamp_type is now set based on actual clockid_t compressed
into 2 bits.

To make the design scalable for future needs this commit bring in
the change to extend the tstamp_type:1 to tstamp_type:2 to support
other clockid_t timestamp.

We now support CLOCK_TAI as part of tstamp_type as part of this
commit with existing support CLOCK_MONOTONIC and CLOCK_REALTIME.

Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509211834.3235191-3-quic_abchauha@quicinc.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-23 14:14:36 -07:00
Abhishek Chauhan
4d25ca2d68 net: Rename mono_delivery_time to tstamp_type for scalabilty
mono_delivery_time was added to check if skb->tstamp has delivery
time in mono clock base (i.e. EDT) otherwise skb->tstamp has
timestamp in ingress and delivery_time at egress.

Renaming the bitfield from mono_delivery_time to tstamp_type is for
extensibilty for other timestamps such as userspace timestamp
(i.e. SO_TXTIME) set via sock opts.

As we are renaming the mono_delivery_time to tstamp_type, it makes
sense to start assigning tstamp_type based on enum defined
in this commit.

Earlier we used bool arg flag to check if the tstamp is mono in
function skb_set_delivery_time, Now the signature of the functions
accepts tstamp_type to distinguish between mono and real time.

Also skb_set_delivery_type_by_clockid is a new function which accepts
clockid to determine the tstamp_type.

In future tstamp_type:1 can be extended to support userspace timestamp
by increasing the bitfield.

Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509211834.3235191-2-quic_abchauha@quicinc.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-23 14:14:23 -07:00