Mohan Prasad says:
====================
selftests: Add selftest for link layer and performance testing
The series of patches are for doing basic tests
of NIC driver. Test comprises checks for auto-negotiation,
speed, duplex state and throughput between local NIC and
partner. Tools such as ethtool, iperf3 are used.
Signed-off-by: Mohan Prasad J <mohan.prasad@microchip.com>
====================
Link: https://patch.msgid.link/20241114192545.1742514-1-mohan.prasad@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add selftest case to check the send and receive throughput.
Supported link modes between local NIC driver and partner
are varied. Then send and receive throughput is captured
and verified. Test uses iperf3 tool.
Add iperf3 server/client function in GenerateTraffic class.
Signed-off-by: Mohan Prasad J <mohan.prasad@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add selftest case for testing the speed and duplex state of
local NIC driver and the partner based on the supported
link modes obtained from the ethtool. Speed and duplex states
are varied and verified using ethtool.
Signed-off-by: Mohan Prasad J <mohan.prasad@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add selftest file for the link layer tests of a NIC driver.
Test for auto-negotiation is added.
Add LinkConfig class for changing link layer configs.
Selftest makes use of ksft modules and ethtool.
Include selftest file in the Makefile.
Signed-off-by: Mohan Prasad J <mohan.prasad@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Michael Chan says:
====================
bnxt_en: Add context memory dump to coredump
The driver currently supports Live FW dump and crashed FW dump. On
the newer chips, the driver allocates host backing store context
memory for the chip and FW to store various states. The content of
this context memory will be useful for debugging. This patchset
adds the context memory contents to the ethtool -w coredump using
a new dump flag.
The first patch is the FW interface update, followed by some
refactoring of the context memory logic. Next, we add support for
some new context memory types that contain various FW debug logs.
After that, we add a new dump flag and the code to dump all the
available context memory during ethtool -w coredump.
v1: https://lore.kernel.org/20241113053649.405407-1-michael.chan@broadcom.com
====================
Link: https://patch.msgid.link/20241115151438.550106-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The FW trace memory pages will be added to the ethtool -w coredump
in later patches. In addition to the raw data, the driver has to
add a header to provide the head and tail information on each FW
trace log segment when creating the coredump. The FW sends an async
message to the driver after DMAing a chunk of logs to the context
memory to indicate the last offset containing the tail of the logs.
The driver needs to keep track of that.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If 'force' is false, it will keep the memory pages and all data
structures for the context memory type if the memory is valid.
This patch always passes true for the 'force' parameter so there is
no change in behavior. Later patches will adjust the 'force' parameter
for the FW log context memory types so that the logs will not be reset
after FW reset.
Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new bit to struct bnxt_ctx_mem_type to indicate that host
memory has been successfully allocated for this context memory type.
In the next patches, we'll be adding some additional context memory
types for FW debugging/logging. If memory cannot be allocated for
any of these new types, we will not abort and the cleared mem_valid
bit will indicate to skip configuring the memory type.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-of-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20241115151438.550106-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jason A. Donenfeld says:
====================
wireguard updates and fixes for 6.13
This tiny series (+3/-2) fixes one bug and has three small improvements.
1) Fix running the netns.sh test suite on systems that haven't yet
inserted the nf_conntrack module.
2) Remove a stray useless function call in a selftest.
3) There's no need to zero out the netdev private data in recent
kernels.
4) Set the TSO max size to be GSO_MAX_SIZE, so that we aggregate larger
packets. Daniel reports seeing a 15% improvement in a simple load and
suggested the speedups would be even better in more complex loads.
====================
Link: https://patch.msgid.link/20241117212030.629159-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Advertise GSO_MAX_SIZE as TSO max size in order support BIG TCP for wireguard.
This helps to improve wireguard performance a bit when enabled as it allows
wireguard to aggregate larger skbs in wg_packet_consume_data_done() via
napi_gro_receive(), but also allows the stack to build larger skbs on xmit
where the driver then segments them before encryption inside wg_xmit().
We've seen a 15% improvement in TCP stream performance.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20241117212030.629159-5-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
fun_create_queue was added in 2022 by
commit e1ffcc6681 ("net/fungible: Add service module for Fungible
drivers")
but hasn't been used.
Remove it.
Also remove the static helper functions it was the only user of.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kees Cook says:
====================
UAPI: ethtool: Avoid flex-array in struct ethtool_link_settings
This reverts the tagged struct group in struct ethtool_link_settings and
instead just removes the flexible array member from Linux's view as it
is entirely unused.
====================
Link: https://patch.msgid.link/20241115204115.work.686-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
struct ethtool_link_settings tends to be used as a header for other
structures that have trailing bytes[1], but has a trailing flexible array
itself. Using this overlapped with other structures leads to ambiguous
object sizing in the compiler, so we want to avoid such situations (which
have caused real bugs in the past). Detecting this can be done with
-Wflex-array-member-not-at-end, which will need to be enabled globally.
Using a tagged struct_group() to create a new ethtool_link_settings_hdr
structure isn't possible as it seems we cannot use the tagged variant of
struct_group() due to syntax issues from C++'s perspective (even within
"extern C")[2]. Instead, we can just leave the offending member defined
in UAPI and remove it from the kernel's view of the structure, as Linux
doesn't actually use this member at all. There is also no change in
size since it was already a flexible array that didn't contribute to
size returned by any use of sizeof().
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/lkml/20241109100213.262a2fa0@kernel.org/ [2]
Link: https://lore.kernel.org/lkml/0bc2809fe2a6c11dd4c8a9a10d9bd65cccdb559b.1730238285.git.gustavoars@kernel.org/ [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20241115204308.3821419-3-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bpf_offload caught a spurious warning in TC recently, but the error
message did not provide enough information to know what the problem
is:
FAIL: Found 'netdevsim' in command output, leaky extack?
Add the extack to the output:
FAIL: Unexpected command output, leaky extack? ('netdevsim', 'Warning: Filter with specified priority/protocol not found.')
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commits for the SMC protocol usually get carried through the netdev
mailing list. Some portions use InfiniBand verbs that are discussed on
the RDMA mailing list. So run patches by that list too to increase the
likelihood that all interested parties can see them.
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts says:
====================
mptcp: pm: lockless list traversal and cleanup
Here are two patches improving the MPTCP in-kernel path-manager.
- Patch 1: the get and dump endpoints operations are iterating over the
endpoints list in a lockless way.
- Patch 2: reduce the code duplication to lookup an endpoint.
====================
Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-0-f4a1bcb4ca2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To return an endpoint to the userspace via Netlink, and to dump all of
them, the endpoint list was iterated while holding the pernet->lock, but
only to read the content of the list.
In these cases, the spin locks can be replaced by RCU read ones, and use
the _rcu variants to iterate over the entries list in a lockless way.
Note that the __lookup_addr_by_id() helper has been modified to use the
_rcu variants of list_for_each_entry(), but with an extra conditions, so
it can be called either while the RCU read lock is held, or when the
associated pernet->lock is held.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-1-f4a1bcb4ca2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver’s compatibility with devices is confirmed earlier in
platform_match(). Since reaching probe means the device is valid,
the extra check can be removed to simplify the code.
Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski says:
====================
eth: fbnic: cleanup and add a few stats
Cleanup trival problems with fbnic and add the PCIe and RPC (Rx parser)
stats.
All stats are read under rtnl_lock for now, so the code is pretty
trivial. We'll need to add more locking when we start gathering
drops used by .ndo_get_stats64.
====================
Link: https://patch.msgid.link/20241115015344.757567-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add PCIe hardware statistics support to the fbnic driver. These stats
provide insight into PCIe transaction performance and error conditions.
Which includes, read/write and completion TLP counts and DWORD counts and
debug counters for tag, completion credit and NP credit exhaustion
The stats are exposed via debugfs and can be used to monitor PCIe
performance and debug PCIe issues.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241115015344.757567-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 51183d233b ("net/neighbor: Update neigh_dump_info for strict
data checking") added strict checking. The err variable is not cleared,
so if we find no table to dump we will return the validation error even
if user did not want strict checking.
I think the only way to hit this is to send an buggy request, and ask
for a table which doesn't exist, so there's no point treating this
as a real fix. I only noticed it because a syzbot repro depended on it
to trigger another bug.
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241115003221.733593-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since '1 << rocker_port->pport' may be undefined for port >= 32,
cast the left operand to 'unsigned long long' like it's done in
'rocker_port_set_enable()' above. Compile tested only.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://patch.msgid.link/20241114151946.519047-1-dmantipov@yandex.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Because optimizing the power consumption of t7XX,
change auto suspend time to 5000.
The Tests uses a script to loop through the power_state
of t7XX.
(for example: /sys/bus/pci/devices/0000\:72\:00.0/power_state)
* If Auto suspend is 20 seconds,
test script show power_state have 0~5% of the time was in D3 state
when host don't have data packet transmission.
* Changed auto suspend time to 5 seconds,
test script show power_state have 50%~80% of the time was in D3 state
when host don't have data packet transmission.
We tested Fibocom FM350 and our products using the t7xx and they all
benefited from this.
Signed-off-by: Jack Wu <wojackbb@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Link: https://patch.msgid.link/20241114102002.481081-1-wojackbb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Binder places its headers under include/uapi/linux/android/
Make sure replace / with _ in the uAPI header guard, the c_upper()
is more strict and only converts - to _. This is likely a good
constraint to have, to enforce sane naming in enums etc.
But paths may include /.
Signed-off-by: Li Li <dualli@google.com>
Link: https://patch.msgid.link/20241113193239.2113577-2-dualli@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
IEEE802.1Q-2014 supersedes IEEE802.1D-2004. Now Priority Code Point (PCP)
2 is no longer at a lower priority than PCP 0. PCP 1 (Background) is still
at a lower priority than PCP 0 (Best Effort).
Reference:
IEEE802.1Q-2014, Standard for Local and metropolitan area networks
Table I-2 - Traffic type acronyms
Table I-3 - Defining traffic types
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philo Lu says:
====================
udp: Add 4-tuple hash for connected sockets
This patchset introduces 4-tuple hash for connected udp sockets, to make
connected udp lookup faster.
Stress test results (with 1 cpu fully used) are shown below, in pps:
(1) _un-connected_ socket as server
[a] w/o hash4: 1,825176
[b] w/ hash4: 1,831750 (+0.36%)
(2) 500 _connected_ sockets as server
[c] w/o hash4: 290860 (only 16% of [a])
[d] w/ hash4: 1,889658 (+3.1% compared with [b])
With hash4, compute_score is skipped when lookup, so [d] is slightly
better than [b].
Patch1: Add a new counter for hslot2 named hash4_cnt, to avoid cache line
miss when lookup.
Patch2: Add hslot/hlist_nulls for 4-tuple hash.
Patch3 and 4: Implement 4-tuple hash for ipv4 and ipv6.
The detailed motivation is described in Patch 3.
The 4-tuple hash increases the size of udp_sock and udp_hslot. Thus add it
with CONFIG_BASE_SMALL, i.e., it's a no op with CONFIG_BASE_SMALL.
Intentionally, the feature is not available for udplite. Though udplite
shares some structs and functions with udp, its connect() keeps unchanged.
So all udplite sockets perform the same as un-connected udp sockets.
Besides, udplite also shares the additional memory consumption in udp_sock
and udptable.
changelogs:
v8 -> v9 (Paolo Abeni):
- Add explanation about udplite in cover letter
- Update tags for co-developers
- Add acked-by tags of Paolo and Willem
v7 -> v8:
- add EXPORT_SYMBOL for ipv6.ko build
v6 -> v7 (Kuniyuki Iwashima):
- export udp_ehashfn to be used by udpv6 rehash
v5 -> v6 (Paolo Abeni):
- move udp_table_hash4_init from patch2 to patch1
- use hlist_nulls for lookup-rehash race
- add test results in commit log
- add more comment, e.g., for rehash4 used in hash4
- add ipv6 support (Patch4), and refactor some functions for better
sharing, without functionality change
v4 -> v5 (Paolo Abeni):
- add CONFIG_BASE_SMALL with which udp hash4 does nothing
v3 -> v4 (Willem de Bruijn):
- fix mistakes in udp_pernet_table_alloc()
RFCv2 -> v3 (Gur Stavi):
- minor fix in udp_hashslot2() and udp_table_init()
- add rcu sync in rehash4()
RFCv1 -> RFCv2:
- add a new struct for hslot2
- remove the sockopt UDP_HASH4 because it has little side effect for
unconnected sockets
- add rehash in connect()
- re-organize the patch into 3 smaller ones
- other minor fix
v8:
https://lore.kernel.org/all/20241108054836.123484-1-lulie@linux.alibaba.com/
v7:
https://lore.kernel.org/all/20241105121225.12513-1-lulie@linux.alibaba.com/
v6:
https://lore.kernel.org/all/20241031124550.20227-1-lulie@linux.alibaba.com/
v5:
https://lore.kernel.org/all/20241018114535.35712-1-lulie@linux.alibaba.com/
v4:
https://lore.kernel.org/all/20241012012918.70888-1-lulie@linux.alibaba.com/
v3:
https://lore.kernel.org/all/20241010090351.79698-1-lulie@linux.alibaba.com/
RFCv2:
https://lore.kernel.org/all/20240924110414.52618-1-lulie@linux.alibaba.com/
RFCv1:
https://lore.kernel.org/all/20240913100941.8565-1-lulie@linux.alibaba.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>