Commit Graph

19215 Commits

Author SHA1 Message Date
Anna Emese Nyiri
c935af429e selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY
Introduce tests to verify the correct functionality of the SO_RCVMARK and
SO_RCVPRIORITY socket options.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250214205828.48503-1-annaemesenyiri@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-17 16:45:19 -08:00
Pranav Tyagi
dbcbec81c9 selftests: net: fix grammar in reuseaddr_ports_exhausted.c log message
This patch fixes a grammatical error in a test log message
in reuseaddr_ports_exhausted.c for better clarity.

Signed-off-by: Pranav Tyagi <pranav.tyagi03@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250213152612.4434-1-pranav.tyagi03@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-17 16:34:36 -08:00
Yury Khrustalev
00894c3fc9 selftests/powerpc: Use PKEY_UNRESTRICTED macro
Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions
are used in mm selftests for memory protection keys for ppc target.

Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com>
Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20250113170619.484698-4-yury.khrustalev@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-02-17 18:16:36 +00:00
Yury Khrustalev
3809cefe93 selftests/mm: Use PKEY_UNRESTRICTED macro
Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions
are used in mm selftests for memory protection keys.

Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com>
Suggested-by: Joey Gouly <joey.gouly@arm.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250113170619.484698-3-yury.khrustalev@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-02-17 18:16:36 +00:00
David Wei
71082faa2c io_uring/zcrx: add selftest
Add a selftest for io_uring zero copy Rx. This test cannot run locally
and requires a remote host to be configured in net.config. The remote
host must have hardware support for zero copy Rx as listed in the
documentation page. The test will restore the NIC config back to before
the test and is idempotent.

liburing is required to compile the test and be installed on the remote
host running the test.

Signed-off-by: David Wei <dw@davidwei.uk>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20250215000947.789731-12-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-17 05:41:09 -07:00
Jens Axboe
5c496ff11d Merge commit '71f0dd5a3293d75d26d405ffbaedfdda4836af32' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next into for-6.15/io_uring-rx-zc
Merge networking zerocopy receive tree, to get the prep patches for
the io_uring rx zc support.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (63 commits)
  net: add helpers for setting a memory provider on an rx queue
  net: page_pool: add memory provider helpers
  net: prepare for non devmem TCP memory providers
  net: page_pool: add a mp hook to unregister_netdevice*
  net: page_pool: add callback for mp info printing
  netdev: add io_uring memory provider info
  net: page_pool: create hooks for custom memory providers
  net: generalise net_iov chunk owners
  net: prefix devmem specific helpers
  net: page_pool: don't cast mp param to devmem
  tools: ynl: add all headers to makefile deps
  eth: fbnic: set IFF_UNICAST_FLT to avoid enabling promiscuous mode when adding unicast addrs
  eth: fbnic: add MAC address TCAM to debugfs
  tools: ynl-gen: support limits using definitions
  tools: ynl-gen: don't output external constants
  net/mlx5e: Avoid WARN_ON when configuring MQPRIO with HTB offload enabled
  net/mlx5e: Remove unused mlx5e_tc_flow_action struct
  net/mlx5: Remove stray semicolon in LAG port selection table creation
  net/mlx5e: Support FEC settings for 200G per lane link modes
  net/mlx5: Add support for 200Gbps per lane link modes
  ...
2025-02-17 05:38:28 -07:00
Greg Kroah-Hartman
2ce177e9b3 Merge 6.14-rc3 into driver-core-next
We need the faux_device changes in here for future work.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17 07:24:33 +01:00
Linus Torvalds
82ff316456 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Large set of fixes for vector handling, especially in the
     interactions between host and guest state.

     This fixes a number of bugs affecting actual deployments, and
     greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland
     for dealing with this thankless task.

   - Fix an ugly race between vcpu and vgic creation/init, resulting in
     unexpected behaviours

   - Fix use of kernel VAs at EL2 when emulating timers with nVHE

   - Small set of pKVM improvements and cleanups

  x86:

   - Fix broken SNP support with KVM module built-in, ensuring the PSP
     module is initialized before KVM even when the module
     infrastructure cannot be used to order initcalls

   - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being
     emulated by KVM to fix a NULL pointer dereference

   - Enter guest mode (L2) from KVM's perspective before initializing
     the vCPU's nested NPT MMU so that the MMU is properly tagged for
     L2, not L1

   - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as
     the guest's value may be stale if a VM-Exit is handled in the
     fastpath"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  x86/sev: Fix broken SNP support with KVM module built-in
  KVM: SVM: Ensure PSP module is initialized if KVM module is built-in
  crypto: ccp: Add external API interface for PSP module initialization
  KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic()
  KVM: arm64: timer: Drop warning on failed interrupt signalling
  KVM: arm64: Fix alignment of kvm_hyp_memcache allocations
  KVM: arm64: Convert timer offset VA when accessed in HYP code
  KVM: arm64: Simplify warning in kvm_arch_vcpu_load_fp()
  KVM: arm64: Eagerly switch ZCR_EL{1,2}
  KVM: arm64: Mark some header functions as inline
  KVM: arm64: Refactor exit handlers
  KVM: arm64: Refactor CPTR trap deactivation
  KVM: arm64: Remove VHE host restore of CPACR_EL1.SMEN
  KVM: arm64: Remove VHE host restore of CPACR_EL1.ZEN
  KVM: arm64: Remove host FPSIMD saving for non-protected KVM
  KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state
  KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop
  KVM: nSVM: Enter guest mode before initializing nested NPT MMU
  KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC
  KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper
  ...
2025-02-16 10:25:12 -08:00
Thomas Weißschuh
e275f44e0a kunit: qemu_configs: sparc: use Zilog console
The driver for the 8250 console is not used, as no port is found.
Instead the prom0 bootconsole is used the whole time.
The prom driver translates '\n' to '\r\n' before handing of the message
off to the firmware. The firmware performs the same translation again.
In the final output produced by QEMU each line ends with '\r\r\n'.
This breaks the kunit parser, which can only handle '\r\n' and '\n'.

Use the Zilog console instead. It works correctly, is the one documented
by the QEMU manual and also saves a bit of codesize:
Before=4051011, After=4023326, chg -0.68%

Observed on QEMU 9.2.0.

Link: https://lore.kernel.org/r/20250214-kunit-qemu-sparc-console-v1-1-ba1dfdf8f0b1@linutronix.de
Fixes: 87c9c16317 ("kunit: tool: add support for QEMU")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-02-15 19:06:56 -07:00
Brendan Jackman
08fafac4c9 kunit: tool: Use qboot on QEMU x86_64
As noted in [0], SeaBIOS (QEMU default) makes a mess of the terminal,
qboot does not.

It turns out this is actually useful with kunit.py, since the user is
exposed to this issue if they set --raw_output=all.

qboot is also faster than SeaBIOS, but it's is marginal for this
usecase.

[0] https://lore.kernel.org/all/CA+i-1C0wYb-gZ8Mwh3WSVpbk-LF-Uo+njVbASJPe1WXDURoV7A@mail.gmail.com/

Both SeaBIOS and qboot are x86-specific.

Link: https://lore.kernel.org/r/20250124-kunit-qboot-v1-1-815e4d4c6f7c@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-02-15 19:06:41 -07:00
Sebastian Andrzej Siewior
633488947e kernfs: Use RCU to access kernfs_node::parent.
kernfs_rename_lock is used to obtain stable kernfs_node::{name|parent}
pointer. This is a preparation to access kernfs_node::parent under RCU
and ensure that the pointer remains stable under the RCU lifetime
guarantees.

For a complete path, as it is done in kernfs_path_from_node(), the
kernfs_rename_lock is still required in order to obtain a stable parent
relationship while computing the relevant node depth. This must not
change while the nodes are inspected in order to build the path.
If the kernfs user never moves the nodes (changes the parent) then the
kernfs_rename_lock is not required and the RCU guarantees are
sufficient. This "restriction" can be set with
KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required.

Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU
access and use RCU accessor while accessing the node.
Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can
not change.

Acked-by: Tejun Heo <tj@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Andrii Nakryiko
4eb93fea59 selftests/bpf: add test for LDX/STX/ST relocations over array field
Add a simple repro for the issue of miscalculating LDX/STX/ST CO-RE
relocation size adjustment when the CO-RE relocation target type is an
ARRAY.

We need to make sure that compiler generates LDX/STX/ST instruction with
CO-RE relocation against entire ARRAY type, not ARRAY's element. With
the code pattern in selftest, we get this:

      59:       61 71 00 00 00 00 00 00 w1 = *(u32 *)(r7 + 0x0)
                00000000000001d8:  CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0)

Where offset of `int a[5]` is embedded (through CO-RE relocation) into memory
load instruction itself.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250207014809.1573841-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-14 19:58:14 -08:00
Jiayuan Chen
72266ee83f selftests/bpf: Add selftest for may_goto
Added test cases to ensure that programs with stack sizes exceeding 512
bytes are restricted in non-JITed mode, and can be executed normally in
JITed mode, even with stack sizes exceeding 512 bytes due to the presence
of may_goto instructions.

Test result:
echo "0" > /proc/sys/net/core/bpf_jit_enable
./test_progs -t verifier_stack_ptr
...
stack size 512 with may_goto with jit:SKIP
stack size 512 with may_goto without jit:OK
...
Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED

echo "1" > /proc/sys/net/core/bpf_jit_enable
./test_progs -t verifier_stack_ptr
...
stack size 512 with may_goto with jit:OK
stack size 512 with may_goto without jit:SKIP
...
Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED

Signed-off-by: Jiayuan Chen <mrpre@163.com>
Link: https://lore.kernel.org/r/20250214091823.46042-4-mrpre@163.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-14 19:55:15 -08:00
Jiayuan Chen
b38c72ab80 selftests/bpf: Introduce __load_if_JITed annotation for tests
In some cases, the verification logic under the interpreter and JIT
differs, such as may_goto, and the test program behaves differently under
different runtime modes, requiring separate verification logic for each
result.

Introduce __load_if_JITed and __load_if_no_JITed annotation for tests.

Signed-off-by: Jiayuan Chen <mrpre@163.com>
Link: https://lore.kernel.org/r/20250214091823.46042-3-mrpre@163.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-14 19:55:15 -08:00
Stafford Horne
713e788c0e rseq/selftests: Fix riscv rseq_offset_deref_addv inline asm
When working on OpenRISC support for restartable sequences I noticed
and fixed these two issues with the riscv support bits.

 1 The 'inc' argument to RSEQ_ASM_OP_R_DEREF_ADDV was being implicitly
   passed to the macro.  Fix this by adding 'inc' to the list of macro
   arguments.
 2 The inline asm input constraints for 'inc' and 'off' use "er",  The
   riscv gcc port does not have an "e" constraint, this looks to be
   copied from the x86 port.  Fix this by just using an "r" constraint.

I have compile tested this only for riscv.  However, the same fixes I
use in the OpenRISC rseq selftests and everything passes with no issues.

Fixes: 171586a6ab ("selftests/rseq: riscv: Template memory ordering and percpu access mode")
Signed-off-by: Stafford Horne <shorne@gmail.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250114170721.3613280-1-shorne@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:36 -08:00
Linus Torvalds
04f41cbf03 Merge tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:

 - Fix lock imbalance in a corner case of dispatch_to_local_dsq()

 - Migration disabled tasks were confusing some BPF schedulers and its
   handling had a bug. Fix it and simplify the default behavior by
   dispatching them automatically

 - ops.tick(), ops.disable() and ops.exit_task() were incorrectly
   disallowing kfuncs that require the task argument to be the rq
   operation is currently operating on and thus is rq-locked.
   Allow them.

 - Fix autogroup migration handling bug which was occasionally
   triggering a warning in the cgroup migration path

 - tools/sched_ext, selftest and other misc updates

* tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Use SCX_CALL_OP_TASK in task_tick_scx
  sched_ext: Fix the incorrect bpf_list kfunc API in common.bpf.h.
  sched_ext: selftests: Fix grammar in tests description
  sched_ext: Fix incorrect assumption about migration disabled tasks in task_can_run_on_remote_rq()
  sched_ext: Fix migration disabled handling in targeted dispatches
  sched_ext: Implement auto local dispatching of migration disabled tasks
  sched_ext: Fix incorrect time delta calculation in time_delta()
  sched_ext: Fix lock imbalance in dispatch_to_local_dsq()
  sched_ext: selftests/dsp_local_on: Fix selftest on UP systems
  tools/sched_ext: Add helper to check task migration state
  sched_ext: Fix incorrect autogroup migration detection
  sched_ext: selftests/dsp_local_on: Fix sporadic failures
  selftests/sched_ext: Fix enum resolution
  sched_ext: Include task weight in the error state dump
  sched_ext: Fixes typos in comments
2025-02-14 11:14:24 -08:00
Linus Torvalds
80868f5d3d Merge tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:

 - Fix a race window where a newly forked task could escape cgroup.kill

 - Remove incorrectly included steal time from cpu.stat::usage_usec

 - Minor update in selftest

* tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Remove steal time from usage_usec
  selftests/cgroup: use bash in test_cpuset_v1_hp.sh
  cgroup: fix race between fork and cgroup.kill
2025-02-14 11:00:42 -08:00
Sean Christopherson
16fc7cb406 KVM: selftests: Add infrastructure for getting vCPU binary stats
Now that the binary stats cache infrastructure is largely scope agnostic,
add support for vCPU-scoped stats.  Like VM stats, open and cache the
stats FD when the vCPU is created so that it's guaranteed to be valid when
vcpu_get_stats() is invoked.

Account for the extra per-vCPU file descriptor in kvm_set_files_rlimit(),
so that tests that create large VMs don't run afoul of resource limits.

To sanity check that the infrastructure actually works, and to get a bit
of bonus coverage, add an assert in x86's xapic_ipi_test to verify that
the number of HLTs executed by the test matches the number of HLT exits
observed by KVM.

Tested-by: Manali Shukla <Manali.Shukla@amd.com>
Link: https://lore.kernel.org/r/20250111005049.1247555-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-14 07:02:13 -08:00
Sean Christopherson
9b56532b8a KVM: selftests: Adjust number of files rlimit for all "standard" VMs
Move the max vCPUs test's RLIMIT_NOFILE adjustments to common code, and
use the new helper to adjust the resource limit for non-barebones VMs by
default.  x86's recalc_apic_map_test creates 512 vCPUs, and a future
change will open the binary stats fd for all vCPUs, which will put the
recalc APIC test above some distros' default limit of 1024.

Link: https://lore.kernel.org/r/20250111005049.1247555-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-14 07:02:12 -08:00
Sean Christopherson
ea7179f995 KVM: selftests: Get VM's binary stats FD when opening VM
Get and cache a VM's binary stats FD when the VM is opened, as opposed to
waiting until the stats are first used.  Opening the stats FD outside of
__vm_get_stat() will allow converting it to a scope-agnostic helper.

Note, this doesn't interfere with kvm_binary_stats_test's testcase that
verifies a stats FD can be used after its own VM's FD is closed, as the
cached FD is also closed during kvm_vm_free().

Link: https://lore.kernel.org/r/20250111005049.1247555-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-14 07:02:11 -08:00
Sean Christopherson
e65faf71bd KVM: selftests: Add struct and helpers to wrap binary stats cache
Add a struct and helpers to manage the binary stats cache, which is
currently used only for VM-scoped stats.  This will allow expanding the
selftests infrastructure to provide support for vCPU-scoped binary stats,
which, except for the ioctl to get the stats FD are identical to VM-scoped
stats.

Defer converting __vm_get_stat() to a scope-agnostic helper to a future
patch, as getting the stats FD from KVM needs to be moved elsewhere
before it can be made completely scope-agnostic.

Link: https://lore.kernel.org/r/20250111005049.1247555-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-14 07:02:09 -08:00
Sean Christopherson
b0c3f5df92 KVM: selftests: Macrofy vm_get_stat() to auto-generate stat name string
Turn vm_get_stat() into a macro that generates a string for the stat name,
as opposed to taking a string.  This will allow hardening stat usage in
the future to generate errors on unknown stats at compile time.

No functional change intended.

Link: https://lore.kernel.org/r/20250111005049.1247555-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-14 07:01:55 -08:00
Sean Christopherson
eead13d493 KVM: selftests: Assert that __vm_get_stat() actually finds a stat
Fail the test if it attempts to read a stat that doesn't exist, e.g. due
to a typo (hooray, strings), or because the test tried to get a stat for
the wrong scope.  As is, there's no indiciation of failure and @data is
left untouched, e.g. holds '0' or random stack data in most cases.

Fixes: 8448ec5993 ("KVM: selftests: Add NX huge pages test")
Link: https://lore.kernel.org/r/20250111005049.1247555-4-seanjc@google.com
[sean: fixup spelling mistake, courtesy of Colin Ian King]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-14 07:01:27 -08:00
Bharadwaj Raju
78332fdb95 selftests/landlock: Add binaries to .gitignore
Building the test creates binaries 'wait-pipe' and
'sandbox-and-launch' which need to be gitignore'd.

Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com>
Link: https://lore.kernel.org/r/20250210161101.6024-1-bharadwaj.raju777@gmail.com
[mic: Sort entries]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-02-14 09:23:11 +01:00
Mikhail Ivanov
3d4033985f selftests/landlock: Test that MPTCP actions are not restricted
Extend protocol fixture with test suits for MPTCP protocol.
Add CONFIG_MPTCP and CONFIG_MPTCP_IPV6 options in config.

Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Link: https://lore.kernel.org/r/20250205093651.1424339-4-ivanov.mikhail1@huawei-partners.com
Cc: <stable@vger.kernel.org> # 6.7.x
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-02-14 09:23:10 +01:00
Mikhail Ivanov
f5534d511b selftests/landlock: Test TCP accesses with protocol=IPPROTO_TCP
Extend protocol_variant structure with protocol field (Cf. socket(2)).

Extend protocol fixture with TCP test suits with protocol=IPPROTO_TCP
which can be used as an alias for IPPROTO_IP (=0) in socket(2).

Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Link: https://lore.kernel.org/r/20250205093651.1424339-3-ivanov.mikhail1@huawei-partners.com
Cc: <stable@vger.kernel.org> # 6.7.x
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-02-14 09:23:09 +01:00
Mickaël Salaün
89cb121e94 selftests/landlock: Enable the new CONFIG_AF_UNIX_OOB
Since commit 5155cbcdbf ("af_unix: Add a prompt to
CONFIG_AF_UNIX_OOB"), the Landlock selftests's configuration is not
enough to build a minimal kernel.  Because scoped_signal_test checks
with the MSG_OOB flag, we need to enable CONFIG_AF_UNIX_OOB for tests:

 #  RUN           fown.no_sandbox.sigurg_socket ...
 # scoped_signal_test.c:420:sigurg_socket:Expected 1 (1) == send(client_socket, ".", 1, MSG_OOB) (-1)
 # sigurg_socket: Test terminated by assertion
 #          FAIL  fown.no_sandbox.sigurg_socket
 ...

Cc: Günther Noack <gnoack@google.com>
Acked-by: Florent Revest <revest@chromium.org>
Link: https://lore.kernel.org/r/20250211132531.1625566-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-02-14 09:23:06 +01:00
Song Liu
60c2e1fa91 selftests/bpf: Test kfuncs that set and remove xattr from BPF programs
Two sets of tests are added to exercise the not _locked and _locked
version of the kfuncs. For both tests, user space accesses xattr
security.bpf.foo on a testfile. The BPF program is triggered by user
space access (on LSM hook inode_[set|get]_xattr) and sets or removes
xattr security.bpf.bar. Then user space then validates that xattr
security.bpf.bar is set or removed as expected.

Note that, in both tests, the BPF programs use the not _locked kfuncs.
The verifier picks the proper kfuncs based on the calling context.

Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250130213549.3353349-6-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-13 19:35:32 -08:00
Song Liu
ab39ad6796 selftests/bpf: Extend test fs_kfuncs to cover security.bpf. xattr names
Extend test_progs fs_kfuncs to cover different xattr names. Specifically:
xattr name "user.kfuncs" and "security.bpf.xxx" can be read from BPF
program with kfuncs bpf_get_[file|dentry]_xattr(); while "security.bpf"
and "security.selinux" cannot be read.

Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250130213549.3353349-3-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-13 19:35:31 -08:00
Amery Hung
b99f27e902 selftests/bpf: Fix stdout race condition in traffic monitor
Fix a race condition between the main test_progs thread and the traffic
monitoring thread. The traffic monitor thread tries to print a line
using multiple printf and use flockfile() to prevent the line from being
torn apart. Meanwhile, the main thread doing io redirection can reassign
or close stdout when going through tests. A deadlock as shown below can
happen.

       main                      traffic_monitor_thread
       ====                      ======================
                                 show_transport()
                                 -> flockfile(stdout)

stdio_hijack_init()
-> stdout = open_memstream(log_buf, log_cnt);
   ...
   env.subtest_state->stdout_saved = stdout;

                                    ...
                                    funlockfile(stdout)
stdio_restore_cleanup()
-> fclose(env.subtest_state->stdout_saved);

After the traffic monitor thread lock stdout, A new memstream can be
assigned to stdout by the main thread. Therefore, the traffic monitor
thread later will not be able to unlock the original stdout. As the
main thread tries to access the old stdout, it will hang indefinitely
as it is still locked by the traffic monitor thread.

The deadlock can be reproduced by running test_progs repeatedly with
traffic monitor enabled:

for ((i=1;i<=100;i++)); do
  ./test_progs -a flow_dissector_skb* -m '*'
done

Fix this by only calling printf once and remove flockfile()/funlockfile().

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250213233217.553258-1-ameryhung@gmail.com
2025-02-13 17:06:25 -08:00
Jakub Kicinski
7a7e019713 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13 12:43:30 -08:00
Linus Torvalds
348f968b89 Merge tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, wireless and bluetooth.

  Kalle Valo steps down after serving as the WiFi driver maintainer for
  over a decade.

  Current release - fix to a fix:

   - vsock: orphan socket after transport release, avoid null-deref

   - Bluetooth: L2CAP: fix corrupted list in hci_chan_del

  Current release - regressions:

   - eth:
      - stmmac: correct Rx buffer layout when SPH is enabled
      - iavf: fix a locking bug in an error path

   - rxrpc: fix alteration of headers whilst zerocopy pending

   - s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH

   - Revert "netfilter: flowtable: teardown flow if cached mtu is stale"

  Current release - new code bugs:

   - rxrpc: fix ipv6 path MTU discovery, only ipv4 worked

   - pse-pd: fix deadlock in current limit functions

  Previous releases - regressions:

   - rtnetlink: fix netns refleak with rtnl_setlink()

   - wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364
     firmware

  Previous releases - always broken:

   - add missing RCU protection of struct net throughout the stack

   - can: rockchip: bail out if skb cannot be allocated

   - eth: ti: am65-cpsw: base XDP support fixes

  Misc:

   - ethtool: tsconfig: update the format of hwtstamp flags, changes the
     uAPI but this uAPI was not in any release yet"

* tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
  net: pse-pd: Fix deadlock in current limit functions
  rxrpc: Fix ipv6 path MTU discovery
  Reapply "net: skb: introduce and use a single page frag cache"
  s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH
  mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()
  ipv6: mcast: add RCU protection to mld_newpack()
  team: better TEAM_OPTION_TYPE_STRING validation
  Bluetooth: L2CAP: Fix corrupted list in hci_chan_del
  Bluetooth: btintel_pcie: Fix a potential race condition
  Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd
  net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case
  net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case
  net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases
  vsock/test: Add test for SO_LINGER null ptr deref
  vsock: Orphan socket after transport release
  MAINTAINERS: Add sctp headers to the general netdev entry
  Revert "netfilter: flowtable: teardown flow if cached mtu is stale"
  iavf: Fix a locking bug in an error path
  rxrpc: Fix alteration of headers whilst zerocopy pending
  net: phylink: make configuring clock-stop dependent on MAC support
  ...
2025-02-13 12:17:04 -08:00
Devaansh Kumar
0760d62dad sched_ext: selftests: Fix grammar in tests description
Fixed grammar for a few tests of sched_ext.

Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-13 06:46:22 -10:00
Michal Luczaj
440c9d4887 vsock/test: Add test for SO_LINGER null ptr deref
Explicitly close() a TCP_ESTABLISHED (connectible) socket with SO_LINGER
enabled.

As for now, test does not verify if close() actually lingers.
On an unpatched machine, may trigger a null pointer dereference.

Tested-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-2-ef6244d02b54@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12 20:01:29 -08:00
Tamir Duberstein
313b38a6ec lib/prime_numbers: convert self-test to KUnit
Extract a private header and convert the prime_numbers self-test to a
KUnit test. I considered parameterizing the test using
`KUNIT_CASE_PARAM` but didn't see how it was possible since the test
logic is entangled with the test parameter generation logic.

Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250208-prime_numbers-kunit-convert-v5-2-b0cb82ae7c7d@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-12 14:00:11 -08:00
Thomas Weißschuh
16681bea9a selftests/nolibc: split up architecture list in run-tests.sh
The list is getting overly long and any modifications introduce a lot of
noise and are prone to conflicts. Split the string into a bash array
and break that into multiple lines.

Link: https://lore.kernel.org/r/20250211-nolibc-test-archs-v1-1-8e55aa3369cf@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-12 18:57:04 +01:00
Sean Christopherson
f7f232a01f KVM: selftests: Close VM's binary stats FD when releasing VM
Close/free a VM's binary stats cache when the VM is released, not when the
VM is fully freed.  When a VM is re-created, e.g. for state save/restore
tests, the stats FD and descriptor points at the old, defunct VM.  The FD
is still valid, in that the underlying stats file won't be freed until the
FD is closed, but reading stats will always pull information from the old
VM.

Note, this is a benign bug in the current code base as none of the tests
that recreate VMs use binary stats.

Fixes: 83f6e109f5 ("KVM: selftests: Cache binary stats metadata for duration of test")
Link: https://lore.kernel.org/r/20250111005049.1247555-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:02:42 -08:00
Sean Christopherson
fd546aba19 KVM: selftests: Fix mostly theoretical leak of VM's binary stats FD
When allocating and freeing a VM's cached binary stats info, check for a
NULL descriptor, not a '0' file descriptor, as '0' is a legal FD.  E.g. in
the unlikely scenario the kernel installs the stats FD at entry '0',
selftests would reallocate on the next __vm_get_stat() and/or fail to free
the stats in kvm_vm_free().

Fixes: 83f6e109f5 ("KVM: selftests: Cache binary stats metadata for duration of test")
Link: https://lore.kernel.org/r/20250111005049.1247555-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:02:42 -08:00
Sean Christopherson
dae7d81e8d KVM: selftests: Allow running a single iteration of dirty_log_test
Now that dirty_log_test doesn't require running multiple iterations to
verify dirty pages, and actually runs the requested number of iterations,
drop the requirement that the test run at least "3" (which was really "2"
at the time the test was written) iterations.

Link: https://lore.kernel.org/r/20250111003004.1235645-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
7f225650e0 KVM: selftests: Fix an off-by-one in the number of dirty_log_test iterations
Actually run all requested iterations, instead of iterations-1 (the count
starts at '1' due to the need to avoid '0' as an in-memory value for a
dirty page).

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
2680dcfb34 KVM: selftests: Set per-iteration variables at the start of each iteration
Set the per-iteration variables at the start of each iteration instead of
setting them before the loop, and at the end of each iteration.  To ensure
the vCPU doesn't race ahead before the first iteration, simply have the
vCPU worker want for sem_vcpu_cont, which conveniently avoids the need to
special case posting sem_vcpu_cont from the loop.

Link: https://lore.kernel.org/r/20250111003004.1235645-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
2020d3b77a KVM: selftests: Tighten checks around prev iter's last dirty page in ring
Now that each iteration collects all dirty entries and ensures the guest
*completes* at least one write, tighten the exemptions for the last dirty
page of the previous iteration.  Specifically, the only legal value (other
than the current iteration) is N-1.

Unlike the last page for the current iteration, the in-progress write from
the previous iteration is guaranteed to have completed, otherwise the test
would have hung.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
73eaa2aa14 KVM: selftests: Ensure guest writes min number of pages in dirty_log_test
Ensure the vCPU fully completes at least one write in each dirty_log_test
iteration, as failure to dirty any pages complicates verification and
forces the test to be overly conservative about possible values.  E.g.
verification needs to allow the last dirty page from a previous iteration
to have *any* value, because the vCPU could get stuck for multiple
iterations, which is unlikely but can happen in heavily overloaded and/or
nested virtualization setups.

Somewhat arbitrarily set the minimum to 0x100/256; high enough to be
interesting, but not so high as to lead to pointlessly long runtimes.

Opportunistically report the number of writes per iteration for debug
purposes, and so that a human can sanity check the test.  Due to each
write targeting a random page, the number of dirty pages will likely be
lower than the number of total writes, but it shouldn't be absurdly lower
(which would suggest the pRNG is broken)

Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
485e27ed20 KVM: sefltests: Verify value of dirty_log_test last page isn't bogus
Add a sanity check that a completely garbage value wasn't written to
the last dirty page in the ring, e.g. that it doesn't contain the *next*
iteration's value.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
d0bd72cb91 KVM: selftests: Collect *all* dirty entries in each dirty_log_test iteration
Collect all dirty entries during each iteration of dirty_log_test by
doing a final collection after the vCPU has been stopped.  To deal with
KVM's destructive approach to getting the dirty bitmaps, use a second
bitmap for the post-stop collection.

Collecting all entries that were dirtied during an iteration simplifies
the verification logic *and* improves test coverage.

  - If a page is written during iteration X, but not seen as dirty until
    X+1, the test can get a false pass if the page is also written during
    X+1.

  - If a dirty page used a stale value from a previous iteration, the test
    would grant a false pass.

  - If a missed dirty log occurs in the last iteration, the test would fail
    to detect the issue.

E.g. modifying mark_page_dirty_in_slot() to dirty an unwritten gfn:

	if (memslot && kvm_slot_dirty_track_enabled(memslot)) {
		unsigned long rel_gfn = gfn - memslot->base_gfn;
		u32 slot = (memslot->as_id << 16) | memslot->id;

		if (!vcpu->extra_dirty &&
		    gfn_to_memslot(kvm, gfn + 1) == memslot) {
			vcpu->extra_dirty = true;
			mark_page_dirty_in_slot(kvm, memslot, gfn + 1);
		}
		if (kvm->dirty_ring_size && vcpu)
			kvm_dirty_ring_push(vcpu, slot, rel_gfn);
		else if (memslot->dirty_bitmap)
			set_bit_le(rel_gfn, memslot->dirty_bitmap);
	}

isn't detected with the current approach, even with an interval of 1ms
(when running nested in a VM; bare metal would be even *less* likely to
detect the bug due to the vCPU being able to dirty more memory).  Whereas
collecting all dirty entries consistently detects failures with an
interval of 700ms or more (the longer interval means a higher probability
of an actual write to the prematurely-dirtied page).

Link: https://lore.kernel.org/r/20250111003004.1235645-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
24b9a2a613 KVM: selftests: Print (previous) last_page on dirty page value mismatch
Print out the last dirty pages from the current and previous iteration on
verification failures.  In many cases, bugs (especially test bugs) occur
on the edges, i.e. on or near the last pages, and being able to correlate
failures with the last pages can aid in debug.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:56 -08:00
Sean Christopherson
c616f36a10 KVM: selftests: Use continue to handle all "pass" scenarios in dirty_log_test
When verifying pages in dirty_log_test, immediately continue on all "pass"
scenarios to make the logic consistent in how it handles pass vs. fail.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:55 -08:00
Sean Christopherson
9a91f65424 KVM: selftests: Post to sem_vcpu_stop if and only if vcpu_stop is true
When running dirty_log_test using the dirty ring, post to sem_vcpu_stop
only when the main thread has explicitly requested that the vCPU stop.
Synchronizing the vCPU and main thread whenever the dirty ring happens to
be full is unnecessary, as KVM's ABI is to actively prevent the vCPU from
running until the ring is no longer full.  I.e. attempting to run the vCPU
will simply result in KVM_EXIT_DIRTY_RING_FULL without ever entering the
guest.  And if KVM doesn't exit, e.g. let's the vCPU dirty more pages,
then that's a KVM bug worth finding.

Posting to sem_vcpu_stop on ring full also makes it difficult to get the
test logic right, e.g. it's easy to let the vCPU keep running when it
shouldn't, as a ring full can essentially happen at any given time.

Opportunistically rework the handling of dirty_ring_vcpu_ring_full to
leave it set for the remainder of the iteration in order to simplify the
surrounding logic.

Link: https://lore.kernel.org/r/20250111003004.1235645-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:55 -08:00
Sean Christopherson
0a818b3541 KVM: selftests: Keep dirty_log_test vCPU in guest until it needs to stop
In the dirty_log_test guest code, exit to userspace only when the vCPU is
explicitly told to stop.  Periodically exiting just to check if a flag has
been set is unnecessary, weirdly complex, and wastes time handling exits
that could be used to dirty memory.

Opportunistically convert 'i' to a uint64_t to guard against the unlikely
scenario that guest_num_pages exceeds the storage of an int.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250111003004.1235645-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:55 -08:00
Sean Christopherson
f3629c0ef1 KVM: selftests: Honor "stop" request in dirty ring test
Now that the vCPU doesn't dirty every page on the first iteration for
architectures that support the dirty ring, honor vcpu_stop in the dirty
ring's vCPU worker, i.e. stop when the main thread says "stop".  This will
allow plumbing vcpu_stop into the guest so that the vCPU doesn't need to
periodically exit to userspace just to see if it should stop.

Add a comment explaining that marking all pages as dirty is problematic
for the dirty ring, as it results in the guest getting stuck on "ring
full".  This could be addressed by adding a GUEST_SYNC() in that initial
loop, but it's not clear how that would interact with s390's behavior.

Link: https://lore.kernel.org/r/20250111003004.1235645-10-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12 09:00:55 -08:00