This script sets net.ipv4.tcp_rto_max_ms to 1000 and checks
if SYN+ACK RTO is capped at 1s for TFO and non-TFO.
Without the previous patch, the max RTO is applied to TFO
SYN+ACK only, and non-TFO SYN+ACK RTO increases exponentially.
# selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt
# TAP version 13
# 1..2
# tcp_rto_synack_rto_max.pkt:46: error handling packet: timing error:
expected outbound packet at 5.091936 sec but happened at 6.107826 sec; tolerance 0.127974 sec
# script packet: 5.091936 S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
# actual packet: 6.107826 S. 0:0(0) ack 1 win 65535 <mss 1460,nop,nop,sackOK>
# not ok 1 ipv4
# tcp_rto_synack_rto_max.pkt:46: error handling packet: timing error:
expected outbound packet at 5.075901 sec but happened at 6.091841 sec; tolerance 0.127976 sec
# script packet: 5.075901 S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
# actual packet: 6.091841 S. 0:0(0) ack 1 win 65535 <mss 1460,nop,nop,sackOK>
# not ok 2 ipv6
# # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
not ok 49 selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt # exit=1
With the previous patch, all SYN+ACKs are retransmitted
after 1s.
# selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt
# TAP version 13
# 1..2
# ok 1 ipv4
# ok 2 ipv6
# # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
ok 49 selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251106003357.273403-7-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull iommufd fixes from Jason Gunthorpe:
- Syzkaller found a case where maths overflows can cause divide by 0
- Typo in a compiler bug warning fix in the selftests broke the
selftests
- type1 compatability had a mismatch when unmapping an already unmapped
range, it should succeed
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd: Make vfio_compat's unmap succeed if the range is already empty
iommufd/selftest: Fix ioctl return value in _test_cmd_trigger_vevents()
iommufd: Don't overflow during division for dirty tracking
Cross-merge networking fixes after downstream PR (net-6.18-rc5).
Conflicts:
drivers/net/wireless/ath/ath12k/mac.c
9222582ec5 ("Revert "wifi: ath12k: Fix missing station power save configuration"")
6917e268c4 ("wifi: ath12k: Defer vdev bring-up until CSA finalize to avoid stale beacon")
https://lore.kernel.org/11cece9f7e36c12efd732baa5718239b1bf8c950.camel@sipsolutions.net
Adjacent changes:
drivers/net/ethernet/intel/Kconfig
b1d16f7c00 ("libie: depend on DEBUG_FS when building LIBIE_FWLOG")
93f53db9f9 ("ice: switch to Page Pool")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Jakub Kicinski:
Including fixes from bluetooth and wireless.
Current release - new code bugs:
- ptp: expose raw cycles only for clocks with free-running counter
- bonding: fix null-deref in actor_port_prio setting
- mdio: ERR_PTR-check regmap pointer returned by
device_node_to_regmap()
- eth: libie: depend on DEBUG_FS when building LIBIE_FWLOG
Previous releases - regressions:
- virtio_net: fix perf regression due to bad alignment of
virtio_net_hdr_v1_hash
- Revert "wifi: ath10k: avoid unnecessary wait for service ready
message" caused regressions for QCA988x and QCA9984
- Revert "wifi: ath12k: Fix missing station power save configuration"
caused regressions for WCN7850
- eth: bnxt_en: shutdown FW DMA in bnxt_shutdown(), fix memory
corruptions after kexec
Previous releases - always broken:
- virtio-net: fix received packet length check for big packets
- sctp: fix races in socket diag handling
- wifi: add an hrtimer-based delayed work item to avoid low
granularity of timers set relatively far in the future, and use it
where it matters (e.g. when performing AP-scheduled channel switch)
- eth: mlx5e:
- correctly propagate error in case of module EEPROM read failure
- fix HW-GRO on systems with PAGE_SIZE == 64kB
- dsa: b53: fixes for tagging, link configuration / RMII, FDB,
multicast
- phy: lan8842: implement latest errata"
* tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
selftests/vsock: avoid false-positives when checking dmesg
net: bridge: fix MST static key usage
net: bridge: fix use-after-free due to MST port state bypass
lan966x: Fix sleeping in atomic context
bonding: fix NULL pointer dereference in actor_port_prio setting
net: dsa: microchip: Fix reserved multicast address table programming
net: wan: framer: pef2256: Switch to devm_mfd_add_devices()
net: libwx: fix device bus LAN ID
net/mlx5e: SHAMPO, Fix header formulas for higher MTUs and 64K pages
net/mlx5e: SHAMPO, Fix skb size check for 64K pages
net/mlx5e: SHAMPO, Fix header mapping for 64K pages
net: ti: icssg-prueth: Fix fdb hash size configuration
net/mlx5e: Fix return value in case of module EEPROM read error
net: gro_cells: Reduce lock scope in gro_cell_poll
libie: depend on DEBUG_FS when building LIBIE_FWLOG
wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup
netpoll: Fix deadlock in memory allocation under spinlock
net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error
virtio-net: fix received length check in big packets
bnxt_en: Fix warning in bnxt_dl_reload_down()
...
Sometimes VMs will have some intermittent dmesg warnings that are
unrelated to vsock. Change the dmesg parsing to filter on strings
containing 'vsock' to avoid false positive failures that are unrelated
to vsock. The downside is that it is possible for some vsock related
warnings to not contain the substring 'vsock', so those will be missed.
Fixes: a4a65c6fe0 ("selftests/vsock: add initial vmtest.sh for vsock")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251105-vsock-vmtest-dmesg-fix-v2-1-1a042a14892c@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update all struct proto_ops connect() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update all struct proto_ops bind() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
By design, an MPTCP connection will not accept extra subflows where no
MPTCP listening sockets can accept such requests.
In other words, it means that if the 'server' listens on a specific
address / device, it cannot accept MP_JOIN sent to a different address /
device. Except if there is another MPTCP listening socket accepting
them.
This is what the new tests are validating:
- Forcing a bind on the main v4/v6 address, and checking that MP_JOIN
to announced addresses are not accepted.
- Also forcing a bind on the main v4/v6 address, but before, another
listening socket is created to accept additional subflows. Note that
'mptcpize run nc -l' -- or something else only doing: socket(MPTCP),
bind(<IP>), listen(0) -- would be enough, but here mptcp_connect is
reused not to depend on another tool just for that.
- Same as the previous one, but using v6 link-local addresses: this is
a bit particular because it is required to specify the outgoing
network interface when connecting to a link-local address announced
by the other peer. When using the routing rules, this doesn't work
(the outgoing interface is not known) ; but it does work with a
'laminar' endpoint having a specified interface.
Note that extra small modifications are needed for these tests to work:
- mptcp_connect's check_getpeername_connect() check should strip the
specified interface when comparing addresses.
- With IPv6 link-local addresses, it is required to wait for them to
be ready (no longer in 'tentative' mode) before using them, otherwise
the bind() will not be allowed.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/591
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251101-net-next-mptcp-fm-endp-nb-bind-v1-4-b4166772d6bb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Our documentation is saying that the in-kernel PM is only using fullmesh
endpoints to establish subflows to announced addresses when at least one
endpoint has a fullmesh flag. But this was not totally correct: only
fullmesh endpoints were used if at least one endpoint *from the same
address family as the received ADD_ADDR* has the fullmesh flag.
This is confusing, and it seems clearer not to have differences
depending on the address family.
So, now, when at least one MPTCP endpoint has a fullmesh flag, the local
addresses are picked from all fullmesh endpoints, which might be 0 if
there are no endpoints for the correct address family.
One selftest needs to be adapted for this behaviour change.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251101-net-next-mptcp-fm-endp-nb-bind-v1-2-b4166772d6bb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The script "ethtool-common.sh" is not installed in INSTALL_PATH, and
triggers some errors when I try to run the test
'drivers/net/netdevsim/ethtool-coalesce.sh':
TAP version 13
1..1
# timeout set to 600
# selftests: drivers/net/netdevsim: ethtool-coalesce.sh
# ./ethtool-coalesce.sh: line 4: ethtool-common.sh: No such file or directory
# ./ethtool-coalesce.sh: line 25: make_netdev: command not found
# ethtool: bad command line argument(s)
# ./ethtool-coalesce.sh: line 124: check: command not found
# ./ethtool-coalesce.sh: line 126: [: -eq: unary operator expected
# FAILED /0 checks
not ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh # exit=1
Install this file to avoid this error. After this patch:
TAP version 13
1..1
# timeout set to 600
# selftests: drivers/net/netdevsim: ethtool-coalesce.sh
# PASSED all 22 checks
ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh
Fixes: fbb8531e58 ("selftests: extract common functions in ethtool-common.sh")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20251030040340.3258110-1-wangliang74@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The GRO self-test, gro.c, currently constructs IPv6 packets containing a
Hop-by-Hop Options header (IPPROTO_HOPOPTS) to ensure the GRO path
correctly handles IPv6 extension headers.
However, network elements may be configured to drop packets with the
Hop-by-Hop Options header (HBH). This causes the self-test to fail
in environments where such network elements are present.
To improve the robustness and reliability of this test in diverse
network environments, switch from using IPPROTO_HOPOPTS to
IPPROTO_DSTOPTS (Destination Options).
The Destination Options header is less likely to be dropped by
intermediate routers and still serves the core purpose of the test:
validating GRO's handling of an IPv6 extension header. This change
ensures the test can execute successfully without being incorrectly
failed by network policies outside the kernel's control.
Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anubhav Singh <anubhavsinggh@google.com>
Link: https://patch.msgid.link/20251030060436.1556664-1-anubhavsinggh@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Due to the gro_sender sending data packets and FIN packets
in very quick succession, these are received almost simultaneously
by the gro_receiver. FIN packets are sometimes processed before the
data packets leading to intermittent (~1/100) test failures.
This change adds a delay of 100ms before sending FIN packets
in gro:tcp test to avoid the out-of-order delivery. The same
mitigation already exists for the gro:ip test.
Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anubhav Singh <anubhavsinggh@google.com>
Link: https://patch.msgid.link/20251030062818.1562228-1-anubhavsinggh@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull VFIO fixes from Alex Williamson:
- Fix overflows in vfio type1 backend for mappings at the end of the
64-bit address space, resulting in leaked pinned memory.
New selftest support included to avoid such issues in the future
(Alex Mastro)
* tag 'vfio-v6.18-rc4' of https://github.com/awilliam/linux-vfio:
vfio: selftests: add end of address space DMA map/unmap tests
vfio: selftests: update DMA map/unmap helpers to support more test kinds
vfio/type1: handle DMA map/unmap up to the addressable limit
vfio/type1: move iova increment to unmap_unpin_*() caller
vfio/type1: sanitize for overflow using check_*_overflow()
Pull kselftest fixes from Shuah Khan:
"Fix build warning in cachestat found during clang build and add
tmpshmcstat to .gitignore"
* tag 'linux_kselftest-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: cachestat: Fix warning on declaration under label
selftests/cachestat: add tmpshmcstat file to .gitignore
We are trying to move away from netdevsim-only tests and towards
tests which can be run both against netdevsim and real drivers.
Replace the simple bash script we have for checking ethtool -g/-G
on netdevsim with a Python test tweaking those params as well
as channel count.
The new test is not exactly equivalent to the netdevsim one,
but real drivers don't often support random ring sizes,
let alone modifying max values via debugfs.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251029164930.2923448-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
get-reg-list includes ZCR_EL2 in the list of EL2 registers that it looks
for when NV is enabled but does not have any feature gate for this register,
meaning that testing any combination of features that includes EL2 but does
not include SVE will result in a test failure due to a missing register
being reported:
| The following lines are missing registers:
|
| ARM64_SYS_REG(3, 4, 1, 2, 0),
Add ZCR_EL2 to feat_id_regs so that the test knows not to expect to see it
without SVE being enabled.
Fixes: 3a90b6f279 ("KVM: arm64: selftests: get-reg-list: Add base EL2 registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20251024-kvm-arm64-get-reg-list-zcr-el2-v1-1-0cd0ff75e22f@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Since GITS_TYPER.PTA == 0, the ITS MAPC command demands a CPU ID,
rather than a physical redistributor address, for its RDbase
command argument.
As such, when MAPC-ing guest ITS collections, vgic_lpi_stress iterates
over CPU IDs in the range [0, nr_cpus), passing them as the RDbase
vcpu_id argument to its_send_mapc_cmd().
However, its_encode_target() in the its_send_mapc_cmd() selftest
handler expects RDbase arguments to be formatted with a 16 bit
offset, as shown by the 16-bit target_addr right shift its implementation:
its_mask_encode(&cmd->raw_cmd[2], target_addr >> 16, 51, 16)
At the moment, all CPU IDs passed into its_send_mapc_cmd() have no
offset, therefore becoming 0x0 after the bit shift. Thus, when
vgic_its_cmd_handle_mapc() receives the ITS command in vgic-its.c,
it always interprets the RDbase target CPU as CPU 0. All interrupts
sent to collections will be processed by vCPU 0, which defeats the
purpose of this multi-vCPU test.
Fix by creating procnum_to_rdbase() helper function, which left-shifts
the vCPU parameter received by its_send_mapc_cmd 16 bits before passing
it to its_encode_target for encoding.
Signed-off-by: Maximilian Dittgen <mdittgen@amazon.de>
Link: https://patch.msgid.link/20251020145946.48288-1-mdittgen@amazon.de
Signed-off-by: Marc Zyngier <maz@kernel.org>
Test that ICMP extensions are reported correctly when enabled and not
reported when disabled. Test both IPv4 and IPv6 and using different
packet sizes, to make sure trimming / padding works correctly.
Disable ICMP rate limiting (defaults to 1 per-second per-target) so that
the kernel will always generate ICMP errors when needed.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251027082232.232571-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In bareudp.sh, this script uses /bin/sh and it will load another lib.sh
BASH script at the very beginning.
But on some operating systems like Ubuntu, /bin/sh is actually pointed to
DASH, thus it will try to run BASH commands with DASH and consequently
leads to syntax issues:
# ./bareudp.sh: 4: ./lib.sh: Bad substitution
# ./bareudp.sh: 5: ./lib.sh: source: not found
# ./bareudp.sh: 24: ./lib.sh: Syntax error: "(" unexpected
Fix this by explicitly using BASH for bareudp.sh. This fixes test
execution failures on systems where /bin/sh is not BASH.
Reported-by: Edoardo Canepa <edoardo.canepa@canonical.com>
Link: https://bugs.launchpad.net/bugs/2129812
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20251027095710.2036108-2-po-hsu.lin@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Socket APIs like recvfrom(), accept(), and getsockname() expect socklen_t*
arg, but tests were using int variables. This causes -Wpointer-sign
warnings on platforms where socklen_t is unsigned.
Change the variable type from int to socklen_t to resolve the warning and
ensure type safety across platforms.
warning fixed:
sctp_collision.c:62:70: warning: passing 'int *' to parameter of
type 'socklen_t *' (aka 'unsigned int *') converts between pointers to
integer types with different sign [-Wpointer-sign]
62 | ret = recvfrom(sd, buf, sizeof(buf),
0, (struct sockaddr *)&daddr, &len);
| ^~~~
/usr/include/sys/socket.h:165:27: note: passing argument to
parameter '__addr_len' here
165 | socklen_t *__restrict __addr_len);
| ^
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251028172947.53153-1-ankitkhushwaha.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add __vfio_pci_dma_*() helpers which return -errno from the underlying
ioctls.
Add __vfio_pci_dma_unmap_all() to test more unmapping code paths. Add an
out unmapped arg to report the unmapped byte size.
The existing vfio_pci_dma_*() functions, which are intended for
happy-path usage (assert on failure) are now thin wrappers on top of the
double-underscore helpers.
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Alex Mastro <amastro@fb.com>
Link: https://lore.kernel.org/r/20251028-fix-unmap-v6-4-2542b96bcc8e@fb.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
Test that outgoing plaintext records respect the tls TLS_TX_MAX_PAYLOAD_LEN
set using setsockopt(). The limit is set to be 128, thus, in all received
records, the plaintext must not exceed this amount.
Also test that setting a new record size limit whilst a pending open
record exists is handled correctly by discarding the request.
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251022001937.20155-2-wilfred.opensource@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix to avoid the usage of the `ret` variable uninitialized in the
following macro expansions.
It solves the following warning:
In file included from netlink-dumps.c:21:
netlink-dumps.c: In function ‘dump_extack’:
../kselftest_harness.h:788:35: warning: ‘ret’ may be used uninitialized [-Wmaybe-uninitialized]
788 | intmax_t __exp_print = (intmax_t)__exp; \
| ^~~~~~~~~~~
../kselftest_harness.h:631:9: note: in expansion of macro ‘__EXPECT’
631 | __EXPECT(expected, #expected, seen, #seen, ==, 0)
| ^~~~~~~~
netlink-dumps.c:169:9: note: in expansion of macro ‘EXPECT_EQ’
169 | EXPECT_EQ(ret, FOUND_EXTACK);
| ^~~~~~~~~
The issue can be reproduced, building the tests, with the command:
make -C tools/testing/selftests TARGETS=net
Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Link: https://patch.msgid.link/20251023205354.28249-1-alessandro.zanni87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Jakub Kicinski:
"Including fixes from can. Slim pickings, I'm guessing people haven't
really started testing.
Current release - new code bugs:
- eth: mlx5e:
- psp: avoid 'accel' NULL pointer dereference
- skip PPHCR register query for FEC histogram if not supported
Previous releases - regressions:
- bonding: update the slave array for broadcast mode
- rtnetlink: re-allow deleting FDB entries in user namespace
- eth: dpaa2: fix the pointer passed to PTR_ALIGN on Tx path
Previous releases - always broken:
- can: drop skb on xmit if device is in listen-only mode
- gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb()
- eth: mlx5e
- RX, fix generating skb from non-linear xdp_buff if program
trims frags
- make devcom init failures non-fatal, fix races with IPSec
Misc:
- some documentation formatting 'fixes'"
* tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
net/mlx5: Fix IPsec cleanup over MPV device
net/mlx5: Refactor devcom to return NULL on failure
net/mlx5e: Skip PPHCR register query if not supported by the device
net/mlx5: Add PPHCR to PCAM supported registers mask
virtio-net: zero unused hash fields
net: phy: micrel: always set shared->phydev for LAN8814
vsock: fix lock inversion in vsock_assign_transport()
ovpn: use datagram_poll_queue for socket readiness in TCP
espintcp: use datagram_poll_queue for socket readiness
net: datagram: introduce datagram_poll_queue for custom receive queues
net: bonding: fix possible peer notify event loss or dup issue
net: hsr: prevent creation of HSR device with slaves from another netns
sctp: avoid NULL dereference when chunk data buffer is missing
ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
net: ravb: Ensure memory write completes before ringing TX doorbell
net: ravb: Enforce descriptor type ordering
net: hibmcge: select FIXED_PHY
net: dlink: use dev_kfree_skb_any instead of dev_kfree_skb
Documentation: networking: ax25: update the mailing list info.
net: gro_cells: fix lock imbalance in gro_cells_receive()
...
sctp_vrf.sh could fail:
TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [FAIL]
not ok 1 selftests: net: sctp_vrf.sh # exit=3
The failure happens when the server bind in a new run conflicts with an
existing association from the previous run:
[1] ip netns exec $SERVER_NS ./sctp_hello server ...
[2] ip netns exec $CLIENT_NS ./sctp_hello client ...
[3] ip netns exec $SERVER_NS pkill sctp_hello ...
[4] ip netns exec $SERVER_NS ./sctp_hello server ...
It occurs if the client in [2] sends a message and closes immediately.
With the message unacked, no SHUTDOWN is sent. Killing the server in [3]
triggers a SHUTDOWN the client also ignores due to the unacked message,
leaving the old association alive. This causes the bind at [4] to fail
until the message is acked and the client responds to a second SHUTDOWN
after the server’s T2 timer expires (3s).
This patch fixes the issue by preventing the client from sending data.
Instead, the client blocks on recv() and waits for the server to close.
It also waits until both the server and the client sockets are fully
released in stop_server and wait_client before restarting.
Additionally, replace 2>&1 >/dev/null with -q in sysctl and grep, and
drop other redundant 2>&1 >/dev/null redirections, and fix a typo from
N to Y (connect successfully) in the description of the last test.
Fixes: a61bd7b9fe ("selftests: add a selftest for sctp vrf")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/be2dacf52d0917c4ba5e2e8c5a9cb640740ad2b6.1760731574.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull bpf fixes from Alexei Starovoitov:
- Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)
- Make selftests/bpf arg_parsing.c more robust to errors (Andrii
Nakryiko)
- Fix redefinition of 'off' as different kind of symbol when I40E
driver is builtin (Brahmajit Das)
- Do not disable preemption in bpf_test_run (Sahil Chandna)
- Fix memory leak in __lookup_instance error path (Shardul Bankar)
- Ensure test data is flushed to disk before reading it (Xing Guo)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Fix redefinition of 'off' as different kind of symbol
bpf: Do not disable preemption in bpf_test_run().
bpf: Fix memory leak in __lookup_instance error path
selftests: arg_parsing: Ensure data is flushed to disk before reading.
bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
selftests/bpf: make arg_parsing.c more robust to crashes
bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix the handling of ZCR_EL2 in NV VMs
- Pick the correct translation regime when doing a PTW on the back of
a SEA
- Prevent userspace from injecting an event into a vcpu that isn't
initialised yet
- Move timer save/restore to the sysreg handling code, fixing EL2
timer access in the process
- Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
- Fix trapping configuration when the host isn't GICv3
- Improve the detection of HCR_EL2.E2H being RES1
- Drop a spurious 'break' statement in the S1 PTW
- Don't try to access SPE when owned by EL3
Documentation updates:
- Document the failure modes of event injection
- Document that a GICv3 guest can be created on a GICv5 host with
FEAT_GCIE_LEGACY
Selftest improvements:
- Add a selftest for the effective value of HCR_EL2.AMO
- Address build warning in the timer selftest when building with
clang
- Teach irqfd selftests about non-x86 architectures
- Add missing sysregs to the set_id_regs selftest
- Fix vcpu allocation in the vgic_lpi_stress selftest
- Correctly enable interrupts in the vgic_lpi_stress selftest
x86:
- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
for the bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return
-EAGAIN if userspace deletes/moves memslot during prefault")
- Don't try to get PMU capabilities from perf when running a CPU with
hybrid CPUs/PMUs, as perf will rightly WARN.
guest_memfd:
- Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
more generic KVM_CAP_GUEST_MEMFD_FLAGS
- Add a guest_memfd INIT_SHARED flag and require userspace to
explicitly set said flag to initialize memory as SHARED,
irrespective of MMAP.
The behavior merged in 6.18 is that enabling mmap() implicitly
initializes memory as SHARED, which would result in an ABI
collision for x86 CoCo VMs as their memory is currently always
initialized PRIVATE.
- Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
private memory, to enable testing such setups, i.e. to hopefully
flush out any other lurking ABI issues before 6.18 is officially
released.
- Add testcases to the guest_memfd selftest to cover guest_memfd
without MMAP, and host userspace accesses to mmap()'d private
memory"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
arm64: Revamp HCR_EL2.E2H RES1 detection
KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
KVM: arm64: Kill leftovers of ad-hoc timer userspace access
KVM: arm64: Fix WFxT handling of nested virt
KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
KVM: arm64: Make timer_set_offset() generally accessible
KVM: arm64: Replace timer context vcpu pointer with timer_id
KVM: arm64: Introduce timer_context_to_vcpu() helper
KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
Documentation: KVM: Update GICv3 docs for GICv5 hosts
KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
KVM: arm64: selftests: Allocate vcpus with correct size
...
KVM x86 fixes for 6.18:
- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return -EAGAIN if userspace
deletes/moves memslot during prefault")
- Don't try to get PMU capabbilities from perf when running a CPU with hybrid
CPUs/PMUs, as perf will rightly WARN.
- Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
generic KVM_CAP_GUEST_MEMFD_FLAGS
- Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
said flag to initialize memory as SHARED, irrespective of MMAP. The
behavior merged in 6.18 is that enabling mmap() implicitly initializes
memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
as their memory is currently always initialized PRIVATE.
- Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
memory, to enable testing such setups, i.e. to hopefully flush out any
other lurking ABI issues before 6.18 is officially released.
- Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
and host userspace accesses to mmap()'d private memory.
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-10-16
We've added 6 non-merge commits during the last 1 day(s) which contain
a total of 18 files changed, 577 insertions(+), 38 deletions(-).
The main changes are:
1) Bypass the global per-protocol memory accounting either by setting
a netns sysctl or using bpf_setsockopt in a bpf program,
from Kuniyuki Iwashima.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests/bpf: Add test for sk->sk_bypass_prot_mem.
bpf: Introduce SK_BPF_BYPASS_PROT_MEM.
bpf: Support bpf_setsockopt() for BPF_CGROUP_INET_SOCK_CREATE.
net: Introduce net.core.bypass_prot_mem sysctl.
net: Allow opt-out from global protocol memory accounting.
tcp: Save lock_sock() for memcg in inet_csk_accept().
====================
Link: https://patch.msgid.link/20251016204539.773707-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fixes the following build error
CLNG-BPF [test_progs] verifier_global_ptr_args.bpf.o
progs/verifier_global_ptr_args.c:228:5: error: redefinition of 'off' as
different kind of symbol
228 | u32 off;
| ^
The symbol 'off' was previously defined in
tools/testing/selftests/bpf/tools/include/vmlinux.h, which includes an
enum i40e_ptp_gpio_pin_state from
drivers/net/ethernet/intel/i40e/i40e_ptp.c:
enum i40e_ptp_gpio_pin_state {
end = -2,
invalid = -1,
off = 0,
in_A = 1,
in_B = 2,
out_A = 3,
out_B = 4,
};
This enum is included when CONFIG_I40E is enabled. As of commit
032676ff82 ("LoongArch: Update Loongson-3 default config file"),
CONFIG_I40E is set in the defconfig, which leads to the conflict.
Renaming the local variable avoids the redefinition and allows the
build to succeed.
Suggested-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20251017171551.53142-1-listout@listout.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The test does the following for IPv4/IPv6 x TCP/UDP sockets
with/without sk->sk_bypass_prot_mem, which can be turned on by
net.core.bypass_prot_mem or bpf_setsockopt(SK_BPF_BYPASS_PROT_MEM).
1. Create socket pairs
2. Send NR_PAGES (32) of data (TCP consumes around 35 pages,
and UDP consuems 66 pages due to skb overhead)
3. Read memory_allocated from sk->sk_prot->memory_allocated and
sk->sk_prot->memory_per_cpu_fw_alloc
4. Check if unread data is charged to memory_allocated
If sk->sk_bypass_prot_mem is set, memory_allocated should not be
changed, but we allow a small error (up to 10 pages) in case
other processes on the host use some amounts of TCP/UDP memory.
The amount of allocated pages are buffered to per-cpu variable
{tcp,udp}_memory_per_cpu_fw_alloc up to +/- net.core.mem_pcpu_rsv
before reported to {tcp,udp}_memory_allocated.
At 3., memory_allocated is calculated from the 2 variables at
fentry of socket create function.
We drain the receive queue only for UDP before close() because UDP
recv queue is destroyed after RCU grace period. When I printed
memory_allocated, UDP bypass cases sometimes saw the no-bypass
case's leftover, but it's still in the small error range (<10 pages).
bpf_trace_printk: memory_allocated: 0 <-- TCP no-bypass
bpf_trace_printk: memory_allocated: 35
bpf_trace_printk: memory_allocated: 0 <-- TCP w/ sysctl
bpf_trace_printk: memory_allocated: 0
bpf_trace_printk: memory_allocated: 0 <-- TCP w/ bpf
bpf_trace_printk: memory_allocated: 0
bpf_trace_printk: memory_allocated: 0 <-- UDP no-bypass
bpf_trace_printk: memory_allocated: 66
bpf_trace_printk: memory_allocated: 2 <-- UDP w/ sysctl (2 pages leftover)
bpf_trace_printk: memory_allocated: 2
bpf_trace_printk: memory_allocated: 2 <-- UDP w/ bpf (2 pages leftover)
bpf_trace_printk: memory_allocated: 2
We prefer finishing tests faster than oversleeping for call_rcu()
+ sk_destruct().
The test completes within 2s on QEMU (64 CPUs) w/ KVM.
# time ./test_progs -t sk_bypass
#371/1 sk_bypass_prot_mem/TCP :OK
#371/2 sk_bypass_prot_mem/UDP :OK
#371/3 sk_bypass_prot_mem/TCPv6:OK
#371/4 sk_bypass_prot_mem/UDPv6:OK
#371 sk_bypass_prot_mem:OK
Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED
real 0m1.481s
user 0m0.181s
sys 0m0.441s
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Link: https://patch.msgid.link/20251014235604.3057003-7-kuniyu@google.com