linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-19 08:11:11 -04:00

Author	SHA1	Message	Date
Jakub Kicinski	7dae8ffb09	selftests: drv-net: gro: add a test for GRO depth Reuse the long sequence test to max out the GRO contexts. Repeat for a single queue, 8 queues, and default number of queues but flow steering to just one. The SW GRO's capacity should be around 64 per queue (8 buckets, up to 8 skbs in a chain). Link: https://patch.msgid.link/20260318033819.1469350-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-19 16:57:29 -07:00
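The capacity figure quoted in the GRO depth commit above is plain arithmetic. A minimal sketch, assuming capacity scales linearly with the number of queues; the constant and function names are illustrative and are not kernel identifiers:

```python
# Values taken from the commit message; the names are hypothetical.
GRO_BUCKETS_PER_QUEUE = 8   # hash buckets per queue
GRO_SKBS_PER_BUCKET = 8     # maximum skbs chained in one bucket


def sw_gro_capacity(num_queues: int = 1) -> int:
    """Approximate number of flows SW GRO can hold across num_queues queues."""
    if num_queues < 1:
        raise ValueError("num_queues must be at least 1")
    return num_queues * GRO_BUCKETS_PER_QUEUE * GRO_SKBS_PER_BUCKET
```

With one queue this gives the "around 64" from the message; steering all flows to a single queue keeps the limit at that per-queue value regardless of how many queues exist.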
Jakub Kicinski	ff1cb3ad2a	selftests: drv-net: gro: add test for packet ordering Add a test to check if the NIC reorders packets if they hit GRO. Link: https://patch.msgid.link/20260318033819.1469350-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-19 16:57:28 -07:00
Jakub Kicinski	ba5d4128fc	selftests: drv-net: gro: test GRO stats Test accuracy of GRO stats. We want to cover two potentially tricky cases: - single segment GRO - packets which were eligible but didn't get GRO'd The first case is trivial, teach gro.c to send one packet, and check GRO stats didn't move. Second case requires gro.c to send a lot of flows expecting the NIC to run out of GRO flow capacity. To avoid system traffic noise we steer the packets to a dedicated queue and operate on qstat. Link: https://patch.msgid.link/20260318033819.1469350-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-19 16:57:28 -07:00
Jakub Kicinski	f26d43acf1	selftests: drv-net: gro: use SO_TXTIME to schedule packets together Longer packet sequence tests are quite flaky when the test is run over a real network. Try to avoid at least the jitter on the sender side by scheduling all the packets to be sent at once using SO_TXTIME. Use hardcoded tx time of 5msec in the future. In my test increasing this time past 2msec makes no difference so 5msec is plenty of margin. Since we now expect more output buffering make sure to raise SNDBUF. Note that this is an opportunistic reliability improvement which will only work if the qdisc can schedule Tx time for us (fq). Fiddling with qdisc config was deemed too complex, so it's not part of the patch. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20260318033819.1469350-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-19 16:57:28 -07:00
Jakub Kicinski	9b29afa116	selftests: drv-net: give HW stats sync time extra 25% of margin There are transient failures for devices which update stats periodically, especially if it's the FW DMA'ing the stats rather than host periodic work querying the FW. Wait 25% longer than strictly necessary. For devices which don't report stats-block-usecs we retain 25 msec as the default wait time (0.025sec == 20,000usec * 1.25). Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260318033819.1469350-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-19 16:57:28 -07:00
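The wait-time rule described in the stats sync commit above can be written out as a short calculation. This is a sketch only: the function name and the way the device value is passed in are assumptions, and the selftest library may structure this differently.

```python
# 20,000 usec is the fallback implied by the commit message
# (0.025 sec == 20,000 usec * 1.25); the names here are hypothetical.
DEFAULT_STATS_BLOCK_USECS = 20_000
MARGIN = 1.25  # wait 25% longer than strictly necessary


def stats_sync_wait_secs(stats_block_usecs=None):
    """Return how long to wait, in seconds, before reading HW stats."""
    if stats_block_usecs is None:
        # Device does not report stats-block-usecs: use the default.
        stats_block_usecs = DEFAULT_STATS_BLOCK_USECS
    return stats_block_usecs * MARGIN / 1_000_000
```

A device reporting a 100,000 usec stats block would therefore be given 0.125 sec, while a device reporting nothing keeps the 25 msec default.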
Jakub Kicinski	8888bf4fb9	selftests: net: move gro to lib for HW vs SW reuse The gro.c packet sender is used for SW testing but bulk of incoming new tests will be HW-specific. So it's better to put them under drivers/net/hw/, to avoid tip-toeing around netdevsim. Move gro.c to lib so we can reuse it. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260318033819.1469350-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-19 16:57:28 -07:00
Jakub Kicinski	edab1ca5ec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-7.0-rc5). net/netfilter/nft_set_rbtree.c `598adea720` ("netfilter: revert nft_set_rbtree: validate open interval overlap") `3aea466a43` ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock") https://lore.kernel.org/abgaQBpeGstdN4oq@sirena.org.uk No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-19 14:16:00 -07:00
Paolo Abeni	0c45064487	Merge tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next Antonio Quartulli says: ==================== Included features: * use bitops.h API when possible * send netlink notification in case of client float event * implement support for asymmetric peer IDs * consolidate memory allocations during crypto operations * add netlink notification check in selftests * add FW mark check in selftest * tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next: ovpn: consolidate crypto allocations in one chunk selftests: ovpn: add test for the FW mark feature selftests: ovpn: check asymmetric peer-id ovpn: add support for asymmetric peer IDs selftests: ovpn: add notification parsing and matching ovpn: notify userspace on client float event ovpn: pktid: use bitops.h API ovpn: use correct array size to parse nested attributes in ovpn_nl_key_swap_doit selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3 ==================== Link: https://patch.msgid.link/20260317104023.192548-1-antonio@openvpn.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-19 12:50:42 +01:00
Simon Baatz	96a584db75	selftests/net: packetdrill: improve tcp_rcv_neg_window.pkt The test depends on accepting a packet that is larger than the advertised window and that does not trigger an immediate ACK. Previously, the test might still pass even if kernel behavior changed unexpectedly. Add assertions verifying that the large packet was accepted and no ACK was sent. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Link: https://patch.msgid.link/20260316-improve_tcp_neg_usable_wnd_test-v1-1-f16d5e365107@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-19 10:26:25 +01:00
Bobby Eshleman	3883c2b509	selftests/vsock: auto-detect kernel for guest VMs When running vmtest.sh inside a nested VM the running kernel may not be installed on the filesystem at the standard /boot/ or /usr/lib/modules/ paths. Previously, this would cause vng to fail with "does not exist" since it could not find the kernel image. Instead, this patch uses --dry-run to detect if the kernel is available. If not, then we fall back to the kernel in the kernel source tree. If that fails, then we die. This way runners, like NIPA, can use vng --run arch/x86/boot/bzImage to set up an outer VM, and vmtest.sh will still do the right thing setting up the inner VM. Due to job control issues in vng, a workaround is used to prevent 'make kselftest TARGETS=vsock' from hanging until test timeout. A PR has been placed upstream to solve the issue in vng: https://github.com/arighi/virtme-ng/pull/453 Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260316-vsock-vmtest-autodetect-kernel-v2-1-5eec7b4831f8@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-18 19:28:23 -07:00
Jakub Kicinski	17a55ddb19	tools: ynl: rework policy access to support recursion Donald points out that the current naive implementation using dicts breaks if policy is recursive (child nest uses policy idx already used by its parent). Lean more into the NlPolicy class. This lets us "render" the policy on demand, when user accesses it. If someone wants to do an infinite walk that's on them :) Show policy info as attributes of the class and use dict format to descend into sub-policies for extra neatness. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20260313232047.2068518-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-18 16:41:42 -07:00
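The recursion problem in the ynl commit above can be illustrated with a lazily resolved policy view. The sketch below is not the ynl `NlPolicy` class; the class name, the table layout, and the `policy-idx` key are all hypothetical and chosen only to show why rendering on access copes with a policy that references itself, while eager expansion into nested dicts would never terminate.

```python
class LazyPolicy:
    """Hypothetical lazy view over a table of policies keyed by index."""

    def __init__(self, table, idx):
        self._table = table  # {policy_idx: {attr_name: attr_info}}
        self._idx = idx

    @property
    def attrs(self):
        """Attribute names defined by this policy."""
        return sorted(self._table[self._idx])

    def __getitem__(self, attr_name):
        info = self._table[self._idx][attr_name]
        if "policy-idx" in info:
            # Descend only when asked; a self-reference is harmless here.
            return LazyPolicy(self._table, info["policy-idx"])
        return info


# Policy 0 contains a nest that uses policy 0 again (recursive).
RECURSIVE_TABLE = {
    0: {
        "leaf": {"type": "u32"},
        "nested": {"type": "nest", "policy-idx": 0},
    },
}
```

Walking `LazyPolicy(RECURSIVE_TABLE, 0)["nested"]["nested"]["leaf"]` resolves one level per access, so the depth of the walk is decided by the caller rather than by the data.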
Linus Torvalds	f0caa1d49c	Merge tag 'hid-for-linus-2026031701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - various fixes dealing with (intentionally) broken devices in HID core, logitech-hidpp and multitouch drivers (Lee Jones) - fix for OOB in wacom driver (Benoît Sevens) - fix for potentially HID-bpf-induced buffer overflow in () (Benjamin Tissoires) - various other small fixes and device ID / quirk additions * tag 'hid-for-linus-2026031701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: multitouch: Check to ensure report responses match the request HID: logitech-hidpp: Prevent use-after-free on force feedback initialisation failure HID: bpf: prevent buffer overflow in hid_hw_request selftests/hid: fix compilation when bpf_wq and hid_device are not exported HID: core: Mitigate potential OOB by removing bogus memset() HID: intel-thc-hid: Set HID_PHYS with PCI BDF HID: appletb-kbd: add .resume method in PM HID: logitech-hidpp: Enable MX Master 4 over bluetooth HID: input: Add HID_BATTERY_QUIRK_DYNAMIC for Elan touchscreens HID: input: Drop Asus UX550* touchscreen ignore battery quirks HID: asus: add xg mobile 2022 external hardware support HID: wacom: fix out-of-bounds read in wacom_intuos_bt_irq	2026-03-17 13:55:51 -07:00
Fernando Fernandez Mancera	fdd973148a	selftests: net: add ipv6 RA route to ECMP merge test As commit `bbf4a17ad9` ("ipv6: Fix ECMP sibling count mismatch when clearing RTF_ADDRCONF") pointed out, RA routes are not eligible for ECMP merging. Add a test scenario mixing RA and static routes with gateway to check that they are not getting merged. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260313124827.3945-1-fmancera@suse.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-17 12:53:30 +01:00
Ralf Lici	7b80d8a335	selftests: ovpn: add test for the FW mark feature Add a selftest to verify that the FW mark socket option is correctly supported and its value propagated by ovpn. The test adds and removes nftables DROP rules based on the mark value, and checks that the rule counter aligns with the number of lost ping packets. Cc: Shuah Khan <shuah@kernel.org> Cc: linux-kselftest@vger.kernel.org Cc: horms@kernel.org Signed-off-by: Ralf Lici <ralf@mandelbit.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>	2026-03-17 11:09:20 +01:00
Ralf Lici	367f4b163a	selftests: ovpn: check asymmetric peer-id Extend the base test to verify that the correct peer-id is set in data packet headers. This is done by capturing ping packets with tcpdump during the initial exchange and matching the first portion of the header against the expected sequence for every connection. Cc: Shuah Khan <shuah@kernel.org> Cc: linux-kselftest@vger.kernel.org Cc: horms@kernel.org Signed-off-by: Ralf Lici <ralf@mandelbit.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>	2026-03-17 11:09:05 +01:00
Ralf Lici	77de28cd7c	selftests: ovpn: add notification parsing and matching To verify that netlink notifications are correctly emitted and contain the expected fields, this commit uses the tools/net/ynl/pyynl/cli.py script to create multicast listeners. These listeners record the captured notifications to a JSON file, which is later compared to the expected output. Cc: linux-kselftest@vger.kernel.org Cc: shuah@kernel.org Cc: horms@kernel.org Signed-off-by: Ralf Lici <ralf@mandelbit.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>	2026-03-17 11:08:55 +01:00
Ralf Lici	c841b676da	ovpn: notify userspace on client float event Send a netlink notification when a client updates its remote UDP endpoint. The notification includes the new IP address, port, and scope ID (for IPv6). Cc: linux-kselftest@vger.kernel.org Cc: horms@kernel.org Cc: shuah@kernel.org Cc: donald.hunter@gmail.com Signed-off-by: Ralf Lici <ralf@mandelbit.com> Signed-off-by: Antonio Quartulli <antonio@openvpn.net> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>	2026-03-17 11:08:55 +01:00
Antonio Quartulli	a8e136b496	selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3 mbedtls 3 installs headers and calls the shared object differently than version 2, therefore we must now rely on pkgconfig to fill the right C/LDFLAGS. Moreover the mbedtls3 library expects any base64 file to have their content on one line. Since this change does not break older versions, let's change the sample key file format and make mbedtls3 happy. Cc: Shuah Khan <shuah@kernel.org> Cc: linux-kselftest@vger.kernel.org Cc: horms@kernel.org Signed-off-by: Antonio Quartulli <antonio@openvpn.net>	2026-03-17 11:08:54 +01:00
Jakub Kicinski	6a33a70626	selftests: net: py: give bpftrace more time to start After commit under Fixes debug runners in the CI hit the following: # subprocess.TimeoutExpired: Command '['bpftrace', '-f', 'json', '-q', '-e', 'kprobe:netpoll_poll_dev { @hits = count(); } interval:s:10 { exit(); }']' timed out after 15 seconds # # Exception\| net.lib.py.ksft.KsftFailEx: bpftrace failed to run!?: {} in netpoll_basic.py >10% of the time. Let's give bpftrace more time to start, it can take a while on a debug kernel. Fixes: `82562972b8` ("selftests: net: pass bpftrace timeout to cmd()") Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20260315160038.3187730-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-16 19:26:37 -07:00
Benjamin Tissoires	5d4c6c132e	selftests/hid: fix compilation when bpf_wq and hid_device are not exported This can happen in situations when CONFIG_HID_SUPPORT is set to no, or some complex situations where struct bpf_wq is not exported. So do the usual dance of hiding them before including vmlinux.h, and then redefining them and make use of CO-RE to have the correct offsets. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603111558.KLCIxsZB-lkp@intel.com/ Fixes: `fe8d561db3` ("selftests/hid: add wq test for hid_bpf_input_report()") Cc: stable@vger.kernel.org Acked-by: Jiri Kosina <jkosina@suse.com> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2026-03-16 16:21:06 +01:00
Linus Torvalds	11e8c7e947	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Quite a large pull request, partly due to skipping last week and therefore having material from ~all submaintainers in this one. About a fourth of it is a new selftest, and a couple more changes are large in number of files touched (fixing a -Wflex-array-member-not-at-end compiler warning) or lines changed (reformatting of a table in the API documentation, thanks rST). But who am I kidding---it's a lot of commits and there are a lot of bugs being fixed here, some of them on the nastier side like the RISC-V ones. ARM: - Correctly handle deactivation of interrupts that were activated from LRs. Since EOIcount only denotes deactivation of interrupts that are not present in an LR, start EOIcount deactivation walk after the last irq that made it into an LR - Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM is already enabled -- not only this isn't possible (pKVM will reject the call), but it is also useless: this can only happen for a CPU that has already booted once, and the capability will not change - Fix a couple of low-severity bugs in our S2 fault handling path, affecting the recently introduced LS64 handling and the even more esoteric handling of hwpoison in a nested context - Address yet another syzkaller finding in the vgic initialisation, where we would end-up destroying an uninitialised vgic with nasty consequences - Address an annoying case of pKVM failing to boot when some of the memblock regions that the host is faulting in are not page-aligned - Inject some sanity in the NV stage-2 walker by checking the limits against the advertised PA size, and correctly report the resulting faults PPC: - Fix a PPC e500 build error due to a long-standing wart that was exposed by the recent conversion to kmalloc_obj(); rip out all the ugliness that led to the wart RISC-V: - Prevent speculative out-of-bounds access using
array_index_nospec() in APLIC interrupt handling, ONE_REG register access, AIA CSR access, float register access, and PMU counter access - Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(), kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr() - Fix potential null pointer dereference in kvm_riscv_vcpu_aia_rmw_topei() - Fix off-by-one array access in SBI PMU - Skip THP support check during dirty logging - Fix error code returned for Smstateen and Ssaia ONE_REG interface - Check host Ssaia extension when creating AIA irqchip x86: - Fix cases where CPUID mitigation features were incorrectly marked as available whenever the kernel used scattered feature words for them - Validate _all_ GVAs, rather than just the first GVA, when processing a range of GVAs for Hyper-V's TLB flush hypercalls - Fix a brown paper bug in add_atomic_switch_msr() - Use hlist_for_each_entry_srcu() when traversing mask_notifier_list, to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu - Ensure AVIC VMCB fields are initialized if the VM has an in-kernel local APIC (and AVIC is enabled at the module level) - Update CR8 write interception when AVIC is (de)activated, to fix a bug where the guest can run in perpetuity with the CR8 intercept enabled - Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to allow L1 hypervisors to set FREEZE_IN_SMM.
This reverts (by default) an unintentional tightening of userspace ABI in 6.17, and provides some amount of backwards compatibility with hypervisors who want to freeze PMCs on VM-Entry - Validate the VMCS/VMCB on return to a nested guest from SMM, because either userspace or the guest could stash invalid values in memory and trigger the processor's consistency checks Generic: - Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from being unnecessary and confusing, triggered compiler warnings due to -Wflex-array-member-not-at-end - Document that vcpu->mutex is taken outside of kvm->slots_lock and kvm->slots_arch_lock, which is intentional and desirable despite being rather unintuitive Selftests: - Increase the maximum number of NUMA nodes in the guest_memfd selftest to 64 (from 8)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits) KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8 Documentation: kvm: fix formatting of the quirks table KVM: x86: clarify leave_smm() return value selftests: kvm: add a test that VMX validates controls on RSM selftests: kvm: extract common functionality out of smm_test.c KVM: SVM: check validity of VMCB controls when returning from SMM KVM: VMX: check validity of VMCS controls when returning from SMM KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers() KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr() KVM: x86: hyper-v: Validate all GVAs during PV TLB flush KVM: x86: synthesize CPUID bits only if CPU capability is set KVM: PPC: e500: Rip out "struct tlbe_ref" KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type KVM: selftests: Increase 'maxnode' for guest_memfd tests KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU
hotplug KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2() ...	2026-03-15 12:22:10 -07:00
Linus Torvalds	4f3df2e5ea	Merge tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Madhavan Srinivasan: - Fix KUAP warning in VMX usercopy path - Fix lockdep warning during PCI enumeration - Fix to move CMA reservations to arch_mm_preinit - Fix to check current->mm is alive before getting user callchain Thanks to Aboorva Devarajan, Christophe Leroy (CS GROUP), Dan Horák, Nicolin Chen, Nilay Shroff, Qiao Zhao, Ritesh Harjani (IBM), Saket Kumar Bhaskar, Sayali Patil, Shrikanth Hegde, Venkat Rao Bagalkote, and Viktor Malik. * tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/iommu: fix lockdep warning during PCI enumeration powerpc/selftests/copyloops: extend selftest to exercise __copy_tofrom_user_power7_vmx powerpc: fix KUAP warning in VMX usercopy path powerpc, perf: Check that current->mm is alive before getting user callchain powerpc/mem: Move CMA reservations to arch_mm_preinit	2026-03-15 11:36:11 -07:00
Eric Dumazet	4686679a14	selftests/net: packetdrill: add tcp_disorder_fin_in_FIN_WAIT.pkt Commit `795a7dfbc3` ("net: tcp: accept old ack during closing") was fixing an old bug, add a test to make sure we won't break this case in future kernels. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Menglong Dong <menglong8.dong@gmail.com> Link: https://patch.msgid.link/20260313115429.3365751-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-14 13:15:46 -07:00
Simon Baatz	3eb371edda	selftests/net: packetdrill: add tcp_rcv_neg_window.pkt The test ensures we correctly apply the maximum advertised window limit when rcv_nxt advances past rcv_mwnd_seq, so that the "usable window" is properly clamped to zero rather than becoming negative. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-6-4c7f96b1ec69@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-14 08:02:51 -07:00
Simon Baatz	ba58b3e70b	selftests/net: packetdrill: add tcp_rcv_wnd_shrink_allowed.pkt This test verifies the sequence number checks using the maximum advertised window sequence number when net.ipv4.tcp_shrink_window is enabled. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-5-4c7f96b1ec69@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-14 08:02:51 -07:00
Simon Baatz	ec1adf8ecf	selftests/net: packetdrill: add tcp_rcv_wnd_shrink_nomem.pkt This test verifies - the sequence number checks using the maximum advertised window sequence number and - the logic for handling received data in tcp_data_queue() for the cases: 1. The window is reduced to zero because of memory 2. The window grows again but still does not reach the originally advertised window Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-4-4c7f96b1ec69@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-14 08:02:16 -07:00
Simon Baatz	0e24d17bd9	tcp: implement RFC 7323 window retraction receiver requirements By default, the Linux TCP implementation does not shrink the advertised window (RFC 7323 calls this "window retraction") with the following exceptions: - When an incoming segment cannot be added due to the receive buffer running out of memory. Since commit `8c670bdfa5` ("tcp: correct handling of extreme memory squeeze") a zero window will be advertised in this case. It turns out that reaching the required memory pressure is easy when window scaling is in use. In the simplest case, sending a sufficient number of segments smaller than the scale factor to a receiver that does not read data is enough. - Commit `b650d953cd` ("tcp: enforce receive buffer memory limits by allowing the tcp window to shrink") addressed the "eating memory" problem by introducing a sysctl knob that allows shrinking the window before running out of memory. However, RFC 7323 does not only state that shrinking the window is necessary in some cases, it also formulates requirements for TCP implementations when doing so (Section 2.4). This commit addresses the receiver-side requirements: After retracting the window, the peer may have a snd_nxt that lies within a previously advertised window but is now beyond the retracted window. This means that all incoming segments (including pure ACKs) will be rejected until the application happens to read enough data to let the peer's snd_nxt be in window again (which may be never). To comply with RFC 7323, the receiver MUST honor any segment that would have been in window for any ACK sent by the receiver and, when window scaling is in effect, SHOULD track the maximum window sequence number it has advertised. This patch tracks that maximum window sequence number rcv_mwnd_seq throughout the connection and uses it in tcp_sequence() when deciding whether a segment is acceptable. rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in tcp_select_window(). 
If we count tcp_sequence() as fast path, it is read in the fast path. Therefore, rcv_mwnd_seq is put into rcv_wnd's cacheline group. The logic for handling received data in tcp_data_queue() is already sufficient and does not need to be updated. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-1-4c7f96b1ec69@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-14 08:01:49 -07:00
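The receiver-side rule from the RFC 7323 commit above can be modelled in a few lines. This is a deliberately simplified sketch, not the kernel's `tcp_sequence()`: the class and method names are hypothetical, only the field names (`rcv_nxt`, `rcv_wup`, `rcv_wnd`, `rcv_mwnd_seq`) come from the commit message, and the acceptance test is reduced to the two edge comparisons needed to show the effect of tracking the maximum advertised right edge.

```python
MASK32 = 0xFFFFFFFF


def seq_before(a, b):
    """True if a precedes b in 32-bit wrap-around sequence space."""
    return ((a - b) & MASK32) >= 0x80000000


def seq_after(a, b):
    return seq_before(b, a)


class ReceiverModel:
    """Toy receiver tracking the largest right window edge ever advertised."""

    def __init__(self, rcv_nxt, window):
        self.rcv_nxt = rcv_nxt
        self.rcv_wup = rcv_nxt
        self.rcv_wnd = window
        self.rcv_mwnd_seq = (rcv_nxt + window) & MASK32

    def advertise(self, window):
        """Record a new advertised window; the max edge never moves left."""
        self.rcv_wup = self.rcv_nxt
        self.rcv_wnd = window
        right_edge = (self.rcv_nxt + window) & MASK32
        if seq_after(right_edge, self.rcv_mwnd_seq):
            self.rcv_mwnd_seq = right_edge

    def acceptable(self, seq, end_seq):
        """Accept segments that fit any window advertised so far."""
        if seq_before(end_seq, self.rcv_wup):
            return False  # entirely old data
        if seq_after(seq, self.rcv_mwnd_seq):
            return False  # starts beyond every advertised window
        return True
```

After `ReceiverModel(1000, 500)` retracts its window with `advertise(0)`, a pure ACK at sequence 1200 is still accepted because 1200 lies below the remembered edge of 1500; a check against only the current window (edge 1000) would reject it, which is the stall the commit describes.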
Linus Torvalds	8369b2e97d	Merge tag 'sched_ext-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix data races flagged by KCSAN: add missing READ_ONCE()/WRITE_ONCE() annotations for lock-free accesses to module parameters and dsq->seq - Fix silent truncation of upper 32 enqueue flags (SCX_ENQ_PREEMPT and above) when passed through the int sched_class interface - Documentation updates: scheduling class precedence, task ownership state machine, example scheduler descriptions, config list cleanup - Selftest fix for format specifier and buffer length in file_write_long() * tag 'sched_ext-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Use WRITE_ONCE() for the write side of scx_enable helper pointer sched_ext: Fix enqueue_task_scx() truncation of upper enqueue flags sched_ext: Documentation: Update sched-ext.rst sched_ext: Use READ_ONCE() for scx_slice_bypass_us in scx_bypass() sched_ext: Documentation: Mention scheduling class precedence sched_ext: Document task ownership state machine sched_ext: Use READ_ONCE() for lock-free reads of module param variables sched_ext/selftests: Fix format specifier and buffer length in file_write_long() sched_ext: Use WRITE_ONCE() for the write side of dsq->seq update	2026-03-13 14:54:56 -07:00
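The enqueue-flag truncation fixed in the sched_ext pull above comes down to 64-bit flags passing through a 32-bit `int`. A minimal model, assuming `SCX_ENQ_PREEMPT` occupies bit 32 (consistent with the message's "upper 32 enqueue flags" wording, but treat the exact value as an assumption); `LOW_FLAG` and the helper name are hypothetical:

```python
SCX_ENQ_PREEMPT = 1 << 32  # assumed: first flag above the low 32 bits
LOW_FLAG = 1 << 0          # hypothetical flag that fits in 32 bits


def through_int_interface(enq_flags: int) -> int:
    """Model passing 64-bit flags through a 32-bit int parameter."""
    return enq_flags & 0xFFFFFFFF


def preempt_requested(enq_flags: int) -> bool:
    return bool(enq_flags & SCX_ENQ_PREEMPT)
```

`preempt_requested(SCX_ENQ_PREEMPT | LOW_FLAG)` is true before the narrowing and false after `through_int_interface()`, with no error raised, which is why the message calls the truncation silent.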
Jakub Kicinski	c1f9a89b0c	selftests: net: add test for Netlink policy dumps Add validation for the nlctrl family, accessing family info and dumping policies. TAP version 13 1..4 ok 1 nl_nlctrl.getfamily_do ok 2 nl_nlctrl.getfamily_dump ok 3 nl_nlctrl.getpolicy_dump ok 4 nl_nlctrl.getpolicy_by_op # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0 Link: https://patch.msgid.link/20260311032839.417748-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-12 18:02:13 -07:00
Jakub Kicinski	e911be8354	selftests: net: make sure that Netlink rejects unknown attrs in dump Add a test case for rejecting attrs if policy is not set. dev_get dump has no input policy (accepts no attrs). Link: https://patch.msgid.link/20260311032839.417748-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-12 18:02:13 -07:00
Jakub Kicinski	72374257ed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-7.0-rc4). drivers/net/ethernet/mellanox/mlx5/core/en_rx.c `db25c42c2e` ("net/mlx5e: RX, Fix XDP multi-buf frag counting for striding RQ") `dff1c3164a` ("net/mlx5e: SHAMPO, Always calculate page size") https://lore.kernel.org/aa7ORohmf67EKihj@sirena.org.uk drivers/net/ethernet/ti/am65-cpsw-nuss.c `840c9d13cb` ("net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support") `a23c657e33` ("net: ethernet: ti: am65-cpsw: Use also port number to identify timestamps") https://lore.kernel.org/abK3EkIXuVgMyGI7@sirena.org.uk No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-12 12:53:34 -07:00
Linus Torvalds	2c7e63d702	Merge tag 'net-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from CAN and netfilter. Current release - regressions: - eth: mana: Null service_wq on setup error to prevent double destroy Previous releases - regressions: - nexthop: fix percpu use-after-free in remove_nh_grp_entry - sched: teql: fix NULL pointer dereference in iptunnel_xmit on TEQL slave xmit - bpf: fix nd_tbl NULL dereference when IPv6 is disabled - neighbour: restore protocol != 0 check in pneigh update - tipc: fix divide-by-zero in tipc_sk_filter_connect() - eth: - mlx5: - fix crash when moving to switchdev mode - fix DMA FIFO desync on error CQE SQ recovery - iavf: fix PTP use-after-free during reset - bonding: fix type confusion in bond_setup_by_slave() - lan78xx: fix WARN in __netif_napi_del_locked on disconnect Previous releases - always broken: - core: add xmit recursion limit to tunnel xmit functions - net-shapers: don't free reply skb after genlmsg_reply() - netfilter: - fix stack out-of-bounds read in pipapo_drop() - fix OOB read in nfnl_cthelper_dump_table() - mctp: - fix device leak on probe failure - i2c: fix skb memory leak in receive path - can: keep the max bitrate error at 5% - eth: - bonding: fix nd_tbl NULL dereference when IPv6 is disabled - bnxt_en: fix RSS table size check when changing ethtool channels - amd-xgbe: prevent CRC errors during RX adaptation with AN disabled - octeontx2-af: devlink: fix NIX RAS reporter recovery condition" * tag 'net-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits) net: prevent NULL deref in ip[6]tunnel_xmit() octeontx2-af: devlink: fix NIX RAS reporter to use RAS interrupt status octeontx2-af: devlink: fix NIX RAS reporter recovery condition net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support net/mana: Null service_wq on setup error to prevent double destroy selftests: rtnetlink: add 
neighbour update test neighbour: restore protocol != 0 check in pneigh update net: dsa: realtek: Fix LED group port bit for non-zero LED group tipc: fix divide-by-zero in tipc_sk_filter_connect() net: dsa: microchip: Fix error path in PTP IRQ setup bpf: bpf_out_neigh_v6: Fix nd_tbl NULL dereference when IPv6 is disabled bpf: bpf_out_neigh_v4: Fix nd_tbl NULL dereference when IPv6 is disabled net: bonding: Fix nd_tbl NULL dereference when IPv6 is disabled ipv6: move the disable_ipv6_mod knob to core code net: bcmgenet: fix broken EEE by converting to phylib-managed state net-shapers: don't free reply skb after genlmsg_reply() net: dsa: mxl862xx: don't set user_mii_bus net: ethernet: arc: emac: quiesce interrupts before requesting IRQ page_pool: store detach_time as ktime_t to avoid false-negatives net: macb: Shuffle the tx ring before enabling tx ...	2026-03-12 11:33:35 -07:00
Sean Christopherson	d2ea4ff1ce	KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8 Add "do no harm" testing of EFER, CR0, CR4, and CR8 for SEV+ guests to verify that the guest can read and write the registers, without hitting e.g. a #VC on SEV-ES guests due to KVM incorrectly trying to intercept a register. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20260310211841.2552361-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-03-12 17:31:53 +01:00
Sayali Patil	146c9ab38b	powerpc/selftests/copyloops: extend selftest to exercise __copy_tofrom_user_power7_vmx The new PowerPC VMX fast path (__copy_tofrom_user_power7_vmx) is not exercised by existing copyloops selftests. This patch updates the selftest to exercise the VMX variant, ensuring the VMX copy path is validated. Changes include: - COPY_LOOP=test___copy_tofrom_user_power7_vmx with -D VMX_TEST is used in existing selftest build targets. - Inclusion of ../utils.c to provide get_auxv_entry() for hardware feature detection. - At runtime, the test skips execution if Altivec is not available. - Copy sizes above VMX_COPY_THRESHOLD are used to ensure the VMX path is taken. This enables validation of the VMX fast path without affecting systems that do not support Altivec. Signed-off-by: Sayali Patil <sayalip@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20260304122201.153049-2-sayalip@linux.ibm.com	2026-03-12 11:03:48 +05:30
Gal Pressman	f0bd193166	selftests: net: fix timeout passed as positional argument to communicate() The cited commit refactored the hardcoded timeout=5 into a parameter, but dropped the keyword from the communicate() call. Since Popen.communicate()'s first positional argument is 'input' (not 'timeout'), the timeout value is silently treated as stdin input and the call never enforces a timeout. Pass timeout as a keyword argument to restore the intended behavior. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20260310115803.2521050-3-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 19:11:40 -07:00
Gal Pressman	82562972b8	selftests: net: pass bpftrace timeout to cmd() The bpftrace() helper configures an interval based exit timer but does not propagate the timeout to the cmd object, which defaults to 5 seconds. Since the default BPFTRACE_TIMEOUT is 10 seconds, cmd.process() always raises a TimeoutExpired exception before bpftrace has a chance to exit gracefully. Pass timeout+5 to cmd() to allow bpftrace to complete gracefully. Note: this issue is masked by a bug in the way cmd() passes timeout, this is fixed in the next commit. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20260310115803.2521050-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 19:11:36 -07:00
Sabrina Dubroca	68e76fc12d	selftests: rtnetlink: add neighbour update test Check that protocol and flags are updated correctly for neighbour and pneigh entries. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/d28f72b5b4ff4c9ecbbbde06146a938dcc4c264a.1772894876.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 19:04:55 -07:00
Daniel Golle	7e27d6202e	selftests: net: local_termination: test link-local protocols Add tests to local_termination.sh to verify that link-local frames arrive. On some switches the DSA driver uses bridges to connect the user ports to their CPU ports. More "intelligent" switches typically don't forward link-local frames, but may trap them to an internal microcontroller. The driver may have to change trapping rules, so link-local frames end up on the DSA CPU ports instead of being silently dropped or trapped to the internal microcontroller of the switch. Add two tests which help to validate this has been done correctly: - Link-local STP BPDU should arrive at the Linux netdev when the bridge has STP disabled (BR_NO_STP), in which case the bridge forwards them rather than consuming them in the control plane - Link-local LLDP should arrive at standalone ports (and the test should be skipped on bridged ports similar to how it is done for the IEEE1588v2/PTP tests) Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/1a67081b2ede1e6d2d32f7dd54ae9688f3566152.1773166131.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 18:58:05 -07:00
Soichiro Ueda	34c0378b15	selftests: af_unix: validate SO_PEEK_OFF advancement and reset Extend the so_peek_off selftest to ensure the socket peek offset is handled correctly after both MSG_PEEK and actual data consumption. Verify that the peek offset advances by the same amount as the number of bytes read when performing a read with MSG_PEEK. After exercising SO_PEEK_OFF via MSG_PEEK, drain the receive queue with a non-peek recv() and verify that it can receive all the content in the buffer and SO_PEEK_OFF returns back to 0. The verification after actual data consumption was suggested by Miao Wang when the original so_peek_off selftest was introduced. Link: https://lore.kernel.org/all/7B657CC7-B5CA-46D2-8A4B-8AB5FB83C6DA@gmail.com/ Suggested-by: Miao Wang <shankerwangmiao@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Soichiro Ueda <the.latticeheart@gmail.com> Link: https://patch.msgid.link/20260310072832.127848-1-the.latticeheart@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 18:20:17 -07:00
Paolo Bonzini	3e745694b0	selftests: kvm: add a test that VMX validates controls on RSM Add a test checking that invalid eVMCS contents are validated after an RSM instruction is emulated. The failure mode is simply that the RSM succeeds, because KVM virtualizes NMIs anyway while running L2; the two pin-based execution controls used by the test are entirely handled by KVM and not by the processor. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-03-11 18:41:12 +01:00
Paolo Bonzini	c52b534f26	selftests: kvm: extract common functionality out of smm_test.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-03-11 18:41:12 +01:00
Kai Huang	cf534a09fb	KVM: selftests: Increase 'maxnode' for guest_memfd tests Increase 'maxnode' when using 'get_mempolicy' syscall in guest_memfd mmap and NUMA policy tests to fix a failure on one Intel GNR platform. On a CXL-capable platform, the memory affinity of CXL memory regions may not be covered by the SRAT. Since each CXL memory region is enumerated via a CFMWS table, at early boot the kernel parses all CFMWS tables to detect all CXL memory regions and assigns a 'faked' NUMA node for each of them, starting from the highest NUMA node ID enumerated via the SRAT. This increases the 'nr_node_ids'. E.g., on the aforementioned Intel GNR platform which has 4 NUMA nodes and 18 CFMWS tables, it increases to 22. This results in the 'get_mempolicy' syscall failure on that platform, because currently 'maxnode' is hard-coded to 8 but the 'get_mempolicy' syscall requires the 'maxnode' to be not smaller than the 'nr_node_ids'. Increase the 'maxnode' to the number of bits of 'nodemask', which is 'unsigned long', to fix this. This may not cover all systems. Perhaps a better way is to always set the 'nodemask' and 'maxnode' based on the actual maximum NUMA node ID on the system, but for now just do the simple way. Reported-by: Yi Lai <yi1.lai@intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221014 Closes: https://lore.kernel.org/all/bug-221014-28872@https.bugzilla.kernel.org%2F Signed-off-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com> Link: https://patch.msgid.link/20260302205158.178058-1-kai.huang@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-03-11 18:41:10 +01:00
Jakub Kicinski	77a6401a87	tools: ynl: add Python API for easier access to policies The format of Netlink policy dump is a bit curious with messages in the same dump carrying both attrs and mapping info. Plus each message carries a single piece of the puzzle the caller must then reassemble. I need to do this reassembly for a test, but I think it's generally useful. So let's add proper support to YnlFamily to return more user-friendly representation. See the various docs in the patch for more details. Link: https://patch.msgid.link/20260310005337.3594225-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 19:32:46 -07:00
Dimitri Daskalakis	690043b95c	selftests: drv-net: rss: Add retries to test_rss_key_indir to reduce flakes The test generates 16 flows, and verifies that traffic is distributed across two queues via the NICs RSS indirection table. The likelihood of the flows skewing to a single queue is high, so we retry sending traffic up to 3 times. Alternatively, we could increase the number of generated flows. But debug kernels may struggle to ramp this many flows. During manual testing, the test passed for 10,000 consecutive runs. Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20260309204215.2110486-1-dimitri.daskalakis1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 19:01:51 -07:00
Allison Henderson	87fdf57ded	selftests: rds: Fix tcpdump segfault in rds selftests net/rds/test.py sees a segfault in tcpdump when executed through the ksft runner. [ 21.903713] tcpdump[1469]: segfault at 0 ip 000072100e99126d sp 00007ffccf740fd0 error 4 [ 21.903721] in libc.so.6[16a26d,7798b149a000+188000] [ 21.905074] in libc.so.6[16a26d,72100e84f000+188000] likely on CPU 5 (core 5, socket 0) [ 21.905084] Code: 00 0f 85 a0 00 00 00 48 83 c4 38 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 44 00 00 48 8b 05 91 8b 09 00 8b 4d ac 64 89 08 <41> 0f b6 07 83 e8 2b a8 fd 0f 84 54 ff ff ff 49 8b 36 4c 89 ff e8 [ 21.906760] likely on CPU 9 (core 9, socket 0) [ 21.913469] Code: 00 0f 85 a0 00 00 00 48 83 c4 38 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 44 00 00 48 8b 05 91 8b 09 00 8b 4d ac 64 89 08 <41> 0f b6 07 83 e8 2b a8 fd 0f 84 54 ff ff ff 49 8b 36 4c 89 ff e8 The os.fork() call creates extra complexity because it forks the entire process including the python interpreter. ip() then calls cmd() which creates a subprocess.Popen. We can avoid the extra layering by simply calling subprocess.Popen directly. Track the process handles directly and terminate them at cleanup rather than relying on killall. Further tcpdump's -Z flag attempts to change savefile ownership, which is not supported by the 9p protocol. Fix this by writing pcap captures to "/tmp" during the test and move them to the log directory after tcpdump exits. Signed-off-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/20260308055835.1338257-4-achender@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 18:54:24 -07:00
Allison Henderson	b873b4e160	selftests: rds: Add ksft timeout rds/run.sh sets a timer of 400s when calling test.py. However when tests are run through ksft, a default 45s timer is applied. Fix this by adding a ksft timeout in tools/testing/selftests/net/rds/settings Signed-off-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/20260308055835.1338257-3-achender@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 18:54:15 -07:00
Allison Henderson	5a0c5702bd	selftests: rds: Fix pylint warnings Tidy up all existing pylint errors in test.py. No functional changes are introduced in this patch. Signed-off-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/20260308055835.1338257-2-achender@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 18:53:42 -07:00
Victor Nogueira	56acc7f519	selftests/tc-testing: Adapt test's output to HFSC's iproute2 printing changes To make the printing of HFSC's defcls consistent with HTB's, iproute2 is now printing defcls prepended with "0x". This commit adapts test a4c3 to this change. Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20260307220724.2501212-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:42:19 -07:00
Alok Tiwari	dc9c9193c7	selftests: fib_tests: fix link-local retrieval in fib6_nexthop() fib6_nexthop() retrieves the link-local address for two interfaces used in the test. However, both lldummy and llv1 are obtained from dummy0. llv1 is expected to be retrieved from veth1, which is the interface used later in the test. The subsequent check and error message also expect the address to be retrieved from veth1. Fix this by retrieving llv1 from veth1. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://patch.msgid.link/20260306180830.2329477-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:17:48 -07:00
Linus Torvalds	8b7f4cd3ac	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix u32/s32 bounds when ranges cross min/max boundary (Eduard Zingerman) - Fix precision backtracking with linked registers (Eduard Zingerman) - Fix linker flags detection for resolve_btfids (Ihor Solodrai) - Fix race in update_ftrace_direct_add/del (Jiri Olsa) - Fix UAF in bpf_trampoline_link_cgroup_shim (Lang Xu) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: resolve_btfids: Fix linker flags detection selftests/bpf: add reproducer for spurious precision propagation through calls bpf: collect only live registers in linked regs Revert "selftests/bpf: Update reg_bound range refinement logic" selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary bpf: Fix u32/s32 bounds when ranges cross min/max boundary bpf: Fix a UAF issue in bpf_trampoline_link_cgroup_shim ftrace: Add missing ftrace_lock to update_ftrace_direct_add/del	2026-03-07 12:20:37 -08:00
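
The pitfall described in commit f0bd193166 above (a timeout passed positionally to communicate()) can be illustrated with a minimal sketch. This is not the selftest code: run_with_timeout() and communicate_params() are hypothetical helper names used only for illustration. It relies only on the standard-library signature Popen.communicate(input=None, timeout=None), where the first positional argument is 'input'.

```python
import inspect
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Run argv and return (returncode, stdout, timed_out).

    Hypothetical helper for illustration, not the selftest's cmd() class.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        # Keyword form: the timeout is enforced.
        # proc.communicate(timeout) would bind the value to 'input' instead,
        # so no timeout would be applied.
        stdout, _ = proc.communicate(timeout=timeout)
        return proc.returncode, stdout, False
    except subprocess.TimeoutExpired:
        # Kill the child and reap it so no zombie process is left behind.
        proc.kill()
        stdout, _ = proc.communicate()
        return proc.returncode, stdout, True


def communicate_params():
    """Return the parameter names of Popen.communicate, in order."""
    return list(inspect.signature(subprocess.Popen.communicate).parameters)
```

Calling run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5) should report a timeout, while communicate_params() shows that 'input' precedes 'timeout', which is why the keyword is needed.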


21729 Commits