linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-19 07:01:21 -04:00

Author	SHA1	Message	Date
Carolina Jubran	9ecd05a2c8	selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py Correct the documented bandwidth distribution between TC3 and TC4 from 80/20 to 20/80. Update test descriptions and printed messages to consistently reflect the intended split. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-6-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 17:18:41 -08:00
Carolina Jubran	3796e549e3	selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py Commit 7c32f7a2d3db ("selftests: net: py: don't default to shell=True") changed the cmd() helper to avoid spawning a shell unless explicitly requested. The devlink_rate_tc_bw test enables SR-IOV by writing to the sriov_numvfs sysfs attribute using redirection. Without shell=True the redirection is not interpreted and the VF device never appears, causing the test to fail. Fix by explicitly passing shell=True in the two places that update sriov_numvfs. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-5-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 17:18:41 -08:00
Carolina Jubran	cb1acbd30a	selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py Replace the inline iperf3 subprocess and JSON parsing with Iperf3Runner. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-4-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 17:18:41 -08:00
Carolina Jubran	2a60ce94c6	selftests: drv-net: introduce Iperf3Runner for measurement use cases GenerateTraffic was added to spin up long-running iperf3 load, mainly to drive high PPS background traffic. It was never meant to provide stable throughput numbers, and trying to repurpose it for measurement does not make sense. Introduce Iperf3Runner to allow tests to split out server/client configuration, control start/stop, and collect JSON output for analysis. This makes it possible to measure bandwidth directly when validating egress shaping. GenerateTraffic stays as the background load generator, reusing the common iperf3 helpers under the hood. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-3-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 17:18:41 -08:00
Carolina Jubran	a8658f7bb6	selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS This makes devlink_rate_tc_bw.py present in the Makefile under the same directory. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-2-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 17:18:40 -08:00
Andre Carvalho	e3b8cbf40c	selftests: netconsole: remove log noise due to socat exit This removes some noise that can be distracting while looking at selftests by redirecting socat stderr to /dev/null. Before this commit, netcons_basic would output: Running with target mode: basic (ipv6) 2025/11/29 12:08:03 socat[259] W exiting on signal 15 2025/11/29 12:08:03 socat[271] W exiting on signal 15 basic : ipv6 : Test passed Running with target mode: basic (ipv4) 2025/11/29 12:08:05 socat[329] W exiting on signal 15 2025/11/29 12:08:05 socat[322] W exiting on signal 15 basic : ipv4 : Test passed Running with target mode: extended (ipv6) 2025/11/29 12:08:08 socat[386] W exiting on signal 15 2025/11/29 12:08:08 socat[386] W exiting on signal 15 2025/11/29 12:08:08 socat[380] W exiting on signal 15 extended : ipv6 : Test passed Running with target mode: extended (ipv4) 2025/11/29 12:08:10 socat[440] W exiting on signal 15 2025/11/29 12:08:10 socat[435] W exiting on signal 15 2025/11/29 12:08:10 socat[435] W exiting on signal 15 extended : ipv4 : Test passed After these changes, output looks like: Running with target mode: basic (ipv6) basic : ipv6 : Test passed Running with target mode: basic (ipv4) basic : ipv4 : Test passed Running with target mode: extended (ipv6) extended : ipv6 : Test passed Running with target mode: extended (ipv4) extended : ipv4 : Test passed Signed-off-by: Andre Carvalho <asantostc@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251129-netcons-socat-noise-v1-1-605a0cea8fca@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 12:06:55 -08:00
Jakub Kicinski	aadff9f766	selftests: net: add a hint about MACAddressPolicy=persistent New NIPA installation had been reporting a few flaky tests. arp_ndisc_evict_nocarrier is most flaky of them all. I suspect that the flakiness is due to udev swapping the MAC addresses on the interfaces. Extend the message in arp_ndisc_evict_nocarrier to hint at this potential issue. Having the neigh get fail right after ping is rather unusual, unless udev changes the MAC addr causing a flush in the meantime. Link: https://patch.msgid.link/20251127194556.2409574-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 12:02:13 -08:00
Jakub Kicinski	4b1639cac0	selftests: net: py: handle interrupt during cleanup Following up on the old discussion [1]. Let the BaseExceptions out of defer()'ed cleanup. And handle it in the main loop. This allows us to exit the tests if user hit Ctrl-C during defer(). Link: https://lore.kernel.org/20251119063228.3adfd743@kernel.org # [1] Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20251128004846.2602687-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-12-01 12:01:29 -08:00
Linus Torvalds	212c4053a1	Merge tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfd and coredump updates from Christian Brauner: "Features: - Expose coredump signal via pidfd Expose the signal that caused the coredump through the pidfd interface. The recent changes to rework coredump handling to rely on unix sockets are in the process of being used in systemd. The previous systemd coredump container interface requires the coredump file descriptor and basic information including the signal number to be sent to the container. This means the signal number needs to be available before sending the coredump to the container. - Add supported_mask field to pidfd Add a new supported_mask field to struct pidfd_info that indicates which information fields are supported by the running kernel. This allows userspace to detect feature availability without relying on error codes or kernel version checks. Cleanups: - Drop struct pidfs_exit_info and prepare to drop exit_info pointer, simplifying the internal publication mechanism for exit and coredump information retrievable via the pidfd ioctl - Use guard() for task_lock in pidfs - Reduce wait_pidfd lock scope - Add missing PIDFD_INFO_SIZE_VER1 constant - Add missing BUILD_BUG_ON() assert on struct pidfd_info Fixes: - Fix PIDFD_INFO_COREDUMP handling Selftests: - Split out coredump socket tests and common helpers into separate files for better organization - Fix userspace coredump client detection issues - Handle edge-triggered epoll correctly - Ignore ENOSPC errors in tests - Add debug logging to coredump socket tests, socket protocol tests, and test helpers - Add tests for PIDFD_INFO_COREDUMP_SIGNAL - Add tests for supported_mask field - Update pidfd header for selftests" * tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) pidfs: reduce wait_pidfd lock scope selftests/coredump: add second PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: add first PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: ignore ENOSPC errors selftests/coredump: add debug logging to coredump socket protocol tests selftests/coredump: add debug logging to coredump socket tests selftests/coredump: add debug logging to test helpers selftests/coredump: handle edge-triggered epoll correctly selftests/coredump: fix userspace coredump client detection selftests/coredump: fix userspace client detection selftests/coredump: split out coredump socket tests selftests/coredump: split out common helpers selftests/pidfd: add second supported_mask test selftests/pidfd: add first supported_mask test selftests/pidfd: update pidfd header pidfs: expose coredump signal pidfs: drop struct pidfs_exit_info pidfs: prepare to drop exit_info pointer pidfd: add a new supported_mask field pidfs: add missing BUILD_BUG_ON() assert on struct pidfd_info ...	2025-12-01 10:17:39 -08:00
Linus Torvalds	415d34b92c	Merge tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull namespace updates from Christian Brauner: "This contains substantial namespace infrastructure changes including a new system call, active reference counting, and extensive header cleanups. The branch depends on the shared kbuild branch for -fms-extensions support. Features: - listns() system call Add a new listns() system call that allows userspace to iterate through namespaces in the system. This provides a programmatic interface to discover and inspect namespaces, addressing longstanding limitations: Currently, there is no direct way for userspace to enumerate namespaces. Applications must resort to scanning /proc//ns/ across all processes, which is: - Inefficient - requires iterating over all processes - Incomplete - misses namespaces not attached to any running process but kept alive by file descriptors, bind mounts, or parent references - Permission-heavy - requires access to /proc for many processes - No ordering or ownership information - No filtering per namespace type The listns() system call solves these problems: ssize_t listns(const struct ns_id_req req, u64 ns_ids, size_t nr_ns_ids, unsigned int flags); struct ns_id_req { __u32 size; __u32 spare; __u64 ns_id; struct / listns / { __u32 ns_type; __u32 spare2; __u64 user_ns_id; }; }; Features include: - Pagination support for large namespace sets - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.) - Filtering by owning user namespace - Permission checks respecting namespace isolation - Active Reference Counting Introduce an active reference count that tracks namespace visibility to userspace. A namespace is visible in the following cases: - The namespace is in use by a task - The namespace is persisted through a VFS object (namespace file descriptor or bind-mount) - The namespace is a hierarchical type and is the parent of child namespaces The active reference count does not regulate lifetime (that's still done by the normal reference count) - it only regulates visibility to namespace file handles and listns(). This prevents resurrection of namespaces that are pinned only for internal kernel reasons (e.g., user namespaces held by file->f_cred, lazy TLB references on idle CPUs, etc.) which should not be accessible via (1)-(3). - Unified Namespace Tree Introduce a unified tree structure for all namespaces with: - Fixed IDs assigned to initial namespaces - Lookup based solely on inode number - Maintained list of owned namespaces per user namespace - Simplified rbtree comparison helpers Cleanups - Header Reorganization: - Move namespace types into separate header (ns_common_types.h) - Decouple nstree from ns_common header - Move nstree types into separate header - Switch to new ns_tree_{node,root} structures with helper functions - Use guards for ns_tree_lock - Initial Namespace Reference Count Optimization - Make all reference counts on initial namespaces a nop to avoid pointless cacheline ping-pong for namespaces that can never go away - Drop custom reference count initialization for initial namespaces - Add NS_COMMON_INIT() macro and use it for all namespaces - pid: rely on common reference count behavior - Miscellaneous Cleanups - Rename exit_task_namespaces() to exit_nsproxy_namespaces() - Rename is_initial_namespace() and make argument const - Use boolean to indicate anonymous mount namespace - Simplify owner list iteration in nstree - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly - nsfs: use inode_just_drop() - pidfs: raise DCACHE_DONTCACHE explicitly - pidfs: simplify PIDFD_GET__NAMESPACE ioctls - libfs: allow to specify s_d_flags - cgroup: add cgroup namespace to tree after owner is set - nsproxy: fix free_nsproxy() and simplify create_new_namespaces() Fixes: - setns(pidfd, ...) race condition Fix a subtle race when using pidfds with setns(). When the target task exits after prepare_nsset() but before commit_nsset(), the namespace's active reference count might have been dropped. If setns() then installs the namespaces, it would bump the active reference count from zero without taking the required reference on the owner namespace, leading to underflow when later decremented. The fix resurrects the ownership chain if necessary - if the caller succeeded in grabbing passive references, the setns() should succeed even if the target task exits or gets reaped. - Return EFAULT on put_user() error instead of success - Make sure references are dropped outside of RCU lock (some namespaces like mount namespace sleep when putting the last reference) - Don't skip active reference count initialization for network namespace - Add asserts for active refcount underflow - Add asserts for initial namespace reference counts (both passive and active) - ipc: enable is_ns_init_id() assertions - Fix kernel-doc comments for internal nstree functions - Selftests - 15 active reference count tests - 9 listns() functionality tests - 7 listns() permission tests - 12 inactive namespace resurrection tests - 3 threaded active reference count tests - commit_creds() active reference tests - Pagination and stress tests - EFAULT handling test - nsid tests fixes" tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (103 commits) pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls nstree: fix kernel-doc comments for internal functions nsproxy: fix free_nsproxy() and simplify create_new_namespaces() selftests/namespaces: fix nsid tests ns: drop custom reference count initialization for initial namespaces pid: rely on common reference count behavior ns: add asserts for initial namespace active reference counts ns: add asserts for initial namespace reference counts ns: make all reference counts on initial namespace a nop ipc: enable is_ns_init_id() assertions fs: use boolean to indicate anonymous mount namespace ns: rename is_initial_namespace() ns: make is_initial_namespace() argument const nstree: use guards for ns_tree_lock nstree: simplify owner list iteration nstree: switch to new structures nstree: add helper to operate on struct ns_tree_{node,root} nstree: move nstree types into separate header nstree: decouple from ns_common header ns: move namespace types into separate header ...	2025-12-01 09:47:41 -08:00
Takashi Iwai	72987d2ddc	Merge branch 'for-linus' into for-next Pull remaining 6.18-devel changes. Signed-off-by: Takashi Iwai <tiwai@suse.de>	2025-12-01 16:25:31 +01:00
Oliver Upton	3eef0c83c3	Merge branch 'kvm-arm64/nv-xnx-haf' into kvmarm/next * kvm-arm64/nv-xnx-haf: (22 commits) : Support for FEAT_XNX and FEAT_HAF in nested : : Add support for a couple of MMU-related features that weren't : implemented by KVM's software page table walk: : : - FEAT_XNX: Allows the hypervisor to describe execute permissions : separately for EL0 and EL1 : : - FEAT_HAF: Hardware update of the Access Flag, which in the context of : nested means software walkers must also set the Access Flag. : : The series also adds some basic support for testing KVM's emulation of : the AT instruction, including the implementation detail that AT sets the : Access Flag in KVM. KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2 ... Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-12-01 00:47:41 -08:00
Oliver Upton	938309b028	Merge branch 'kvm-arm64/vgic-lr-overflow' into kvmarm/next * kvm-arm64/vgic-lr-overflow: (50 commits) : Support for VGIC LR overflows, courtesy of Marc Zyngier : : Address deficiencies in KVM's GIC emulation when a vCPU has more active : IRQs than can be represented in the VGIC list registers. Sort the AP : list to prioritize inactive and pending IRQs, potentially spilling : active IRQs outside of the LRs. : : Handle deactivation of IRQs outside of the LRs for both EOImode=0/1, : which involves special consideration for SPIs being deactivated from a : different vCPU than the one that acked it. KVM: arm64: Convert ICH_HCR_EL2_TDIR cap to EARLY_LOCAL_CPU_FEATURE KVM: arm64: selftests: vgic_irq: Add timer deactivation test KVM: arm64: selftests: vgic_irq: Add Group-0 enable test KVM: arm64: selftests: vgic_irq: Add asymmetric SPI deaectivation test KVM: arm64: selftests: vgic_irq: Perform EOImode==1 deactivation in ack order KVM: arm64: selftests: vgic_irq: Remove LR-bound limitation KVM: arm64: selftests: vgic_irq: Exclude timer-controlled interrupts KVM: arm64: selftests: vgic_irq: Change configuration before enabling interrupt KVM: arm64: selftests: vgic_irq: Fix GUEST_ASSERT_IAR_EMPTY() helper KVM: arm64: selftests: gic_v3: Disable Group-0 interrupts by default KVM: arm64: selftests: gic_v3: Add irq group setting helper KVM: arm64: GICv2: Always trap GICV_DIR register KVM: arm64: GICv2: Handle deactivation via GICV_DIR traps KVM: arm64: GICv2: Handle LR overflow when EOImode==0 KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En KVM: arm64: GICv3: nv: Plug L1 LR sync into deactivation primitive KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation KVM: arm64: GICv3: Avoid broadcast kick on CPUs lacking TDIR KVM: arm64: GICv3: Handle in-LR deactivation when possible KVM: arm64: GICv3: Add SPI tracking to handle asymmetric deactivation ... Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-12-01 00:47:32 -08:00
Oliver Upton	11b8e6edc1	Merge branch 'kvm-arm64/sea-user' into kvmarm/next * kvm-arm64/sea-user: : Userspace handling of SEAs, courtesy of Jiaqi Yan : : Add support for processing external aborts in userspace in situations : where the host has failed to do so, allowing the VMM to potentially : reinject an external abort into the VM. Documentation: kvm: new UAPI for handling SEA KVM: selftests: Test for KVM_EXIT_ARM_SEA KVM: arm64: VM exit to userspace to handle SEA Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-12-01 00:47:20 -08:00
Colin Ian King	05474b7bc7	KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" There is a spelling mistake in a TEST_FAIL message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://msgid.link/20251128175124.319094-1-colin.i.king@gmail.com Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-12-01 00:44:02 -08:00
Oliver Upton	66f1888583	KVM: arm64: selftests: Add test for AT emulation Add a basic test for AT emulation in the EL2&0 and EL1&0 translation regimes. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-16-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>	2025-12-01 00:44:02 -08:00
Frederic Weisbecker	9a08942f17	Merge branch 'rcu/misc' into next - In order to prepare the layout for nohz_full work deferral to user exit, the context tracking state must shrink the counter of transitions to/from RCU not watching. The only possible hazard is to trigger wrap-around more easily, delaying a bit grace periods when that happens. This should be a rare event though. Yet add debugging and torture code to test that assumption. - Fix memory leak on locktorture module - Annotate accesses in rculist_nulls.h to prevent from KCSAN warnings. On recent discussions, we also concluded that all those WRITE_ONCE() and READ_ONCE() on list APIs deserve appropriate comments. Something to be expected for the next cycle. - Provide a script to apply several configs to several commits with torture. - Allow torture to reuse a build directory in order to save needless rebuild time. - Various cleanups.	2025-11-30 22:20:33 +01:00
Ankit Khushwaha	0384c8ea96	selftests/mm/uffd: initialize char variable to Null In "uffd-stress.c" & "uffd-unit-tests.c". address of char variable having garbage value (uninitialized) is passed to 'write' syscall triggers warning. uffd-stress.c:246:39: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] uffd-unit-tests.c:581:31: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] so the fix is to assign char variable to '\0' to prevent writing of garbage value. Link: https://lkml.kernel.org/r/20251126160830.52124-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-29 10:41:09 -08:00
Lorenzo Stoakes	9ea35a25d5	mm: introduce VMA flags bitmap type It is useful to transition to using a bitmap for VMA flags so we can avoid running out of flags, especially for 32-bit kernels which are constrained to 32 flags, necessitating some features to be limited to 64-bit kernels only. By doing so, we remove any constraint on the number of VMA flags moving forwards no matter the platform and can decide in future to extend beyond 64 if required. We start by declaring an opaque types, vma_flags_t (which resembles mm_struct flags of type mm_flags_t), setting it to precisely the same size as vm_flags_t, and place it in union with vm_flags in the VMA declaration. We additionally update struct vm_area_desc equivalently placing the new opaque type in union with vm_flags. This change therefore does not impact the size of struct vm_area_struct or struct vm_area_desc. In order for the change to be iterative and to avoid impacting performance, we designate VM_xxx declared bitmap flag values as those which must exist in the first system word of the VMA flags bitmap. We therefore declare vma_flags_clear_all(), vma_flags_overwrite_word(), vma_flags_overwrite_word(), vma_flags_overwrite_word_once(), vma_flags_set_word() and vma_flags_clear_word() in order to allow us to update the existing vm_flags_*() functions to utilise these helpers. This is a stepping stone towards converting users to the VMA flags bitmap and behaves precisely as before. By doing this, we can eliminate the existing private vma->__vm_flags field in the vma->vm_flags union and replace it with the newly introduced opaque type vma_flags, which we call flags so we refer to the new bitmap field as vma->flags. We update vma_flag_[test, set]_atomic() to account for the change also. We adapt vm_flags_reset_once() to only clear those bits above the first system word providing write-once semantics to the first system word (which it is presumed the caller requires - and in all current use cases this is so). As we currently only specify that the VMA flags bitmap size is equal to BITS_PER_LONG number of bits, this is a noop, but is defensive in preparation for a future change that increases this. We additionally update the VMA userland test declarations to implement the same changes there. Finally, we update the rust code to reference vma->vm_flags on update rather than vma->__vm_flags which has been removed. This is safe for now, albeit it is implicitly performing a const cast. Once we introduce flag helpers we can improve this more. No functional change intended. Link: https://lkml.kernel.org/r/bab179d7b153ac12f221b7d65caac2759282cfe9.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-29 10:41:09 -08:00
Lorenzo Stoakes	4c613f518f	tools/testing/vma: eliminate dependency on vma->__vm_flags The userland VMA test code relied on an internal implementation detail - the existence of vma->__vm_flags to directly access VMA flags. There is no need to do so when we have the vm_flags_*() helper functions available. This is ugly, but also a subsequent commit will eliminate this field altogether so this will shortly become broken. This patch has us utilise the helper functions instead. Link: https://lkml.kernel.org/r/6275c53a6bb20743edcbe92d3e130183b47d18d0.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-29 10:41:09 -08:00
Lorenzo Stoakes	2b6a3f061f	mm: declare VMA flags by bit Patch series "initial work on making VMA flags a bitmap", v3. We are in the rather silly situation that we are running out of VMA flags as they are currently limited to a system word in size. This leads to absurd situations where we limit features to 64-bit architectures only because we simply do not have the ability to add a flag for 32-bit ones. This is very constraining and leads to hacks or, in the worst case, simply an inability to implement features we want for entirely arbitrary reasons. This also of course gives us something of a Y2K type situation in mm where we might eventually exhaust all of the VMA flags even on 64-bit systems. This series lays the groundwork for getting away from this limitation by establishing VMA flags as a bitmap whose size we can increase in future beyond 64 bits if required. This is necessarily a highly iterative process given the extensive use of VMA flags throughout the kernel, so we start by performing basic steps. Firstly, we declare VMA flags by bit number rather than by value, retaining the VM_xxx fields but in terms of these newly introduced VMA_xxx_BIT fields. While we are here, we use sparse annotations to ensure that, when dealing with VMA bit number parameters, we cannot be passed values which are not declared as such - providing some useful type safety. We then introduce an opaque VMA flag type, much like the opaque mm_struct flag type introduced in commit `bb6525f2f8` ("mm: add bitmap mm->flags field"), which we establish in union with vma->vm_flags (but still set at system word size meaning there is no functional or data type size change). We update the vm_flags_xxx() helpers to use this new bitmap, introducing sensible helpers to do so. This series lays the foundation for further work to expand the use of bitmap VMA flags and eventually eliminate these arbitrary restrictions. This patch (of 4): In order to lay the groundwork for VMA flags being a bitmap rather than a system word in size, we need to be able to consistently refer to VMA flags by bit number rather than value. Take this opportunity to do so in an enum which we which is additionally useful for tooling to extract metadata from. This additionally makes it very clear which bits are being used for what at a glance. We use the VMA_ prefix for the bit values as it is logical to do so since these reference VMAs. We consistently suffix with _BIT to make it clear what the values refer to. We declare bit values even when the flags that use them would not be enabled by config options as this is simply clearer and clearly defines what bit numbers are used for what, at no additional cost. We declare a sparse-bitwise type vma_flag_t which ensures that users can't pass around invalid VMA flags by accident and prepares for future work towards VMA flags being a bitmap where we want to ensure bit values are type safe. To make life easier, we declare some macro helpers - DECLARE_VMA_BIT() allows us to avoid duplication in the enum bit number declarations (and maintaining the sparse __bitwise attribute), and INIT_VM_FLAG() is used to assist with declaration of flags. Unfortunately we can't declare both in the enum, as we run into issue with logic in the kernel requiring that flags are preprocessor definitions, and additionally we cannot have a macro which declares another macro so we must define each flag macro directly. Additionally, update the VMA userland testing vma_internal.h header to include these changes. We also have to fix the parameters to the vma_flag_*_atomic() functions since VMA_MAYBE_GUARD_BIT is now of type vma_flag_t and sparse will complain otherwise. We have to update some rather silly if-deffery found in mm/task_mmu.c which would otherwise break. Finally, we update the rust binding helper as now it cannot auto-detect the flags at all. Link: https://lkml.kernel.org/r/cover.1764064556.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/3a35e5a0bcfa00e84af24cbafc0653e74deda64a.1764064556.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-29 10:41:08 -08:00
Alexis Lothoré (eBPF Foundation)	1d17bcce6a	selftests/bpf: do not hardcode target rate in test_tc_edt BPF program test_tc_edt currently defines the target rate in both the userspace and BPF parts. This value could be defined once in the userspace part if we make it able to configure the BPF program before starting the test. Add a target_rate variable in the BPF part, and make the userspace part set it to the desired rate before attaching the shaping program. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	50ce5ea5f7	selftests/bpf: remove test_tc_edt.sh Now that test_tc_edt has been integrated in test_progs, remove the legacy shell script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-3-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	b0f82e7ab6	selftests/bpf: integrate test_tc_edt into test_progs test_tc_edt.sh uses a pair of veth and a BPF program attached to the TX veth to shape the traffic to 5MBps. It then checks that the amount of received bytes (at interface level), compared to the TX duration, indeed matches 5Mbps. Convert this test script to the test_progs framework: - keep the double veth setup, isolated in two veths - run a small tcp server, and connect client to server - push a pre-configured amount of bytes, and measure how much time has been needed to push those - ensure that this rate is in a 2% error margin around the target rate This two percent value, while being tight, is hopefully large enough to not make the test too flaky in CI, while also turning it into a small example of BPF-based shaping. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	4b4833acc6	selftests/bpf: rename test_tc_edt.bpf.c section to expose program type The test_tc_edt BPF program uses a custom section name, which works fine when manually loading it with tc, but prevents it from being loaded with libbpf. Update the program section name to "tc" to be able to manipulate it with a libbpf-based C test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Kumar Kartikeya Dwivedi	3448375e71	selftests/bpf: Add success stats to rqspinlock stress test Add stats to observe the success and failure rate of lock acquisition attempts in various contexts. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251128232802.1031906-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:35:36 -08:00
Hangbin Liu	2c28ee720a	selftests: bonding: add delay before each xvlan_over_bond connectivity check Jakub reported increased flakiness in bond_macvlan_ipvlan.sh on regular kernel, while the tests consistently pass on a debug kernel. This suggests a timing-sensitive issue. To mitigate this, introduce a short sleep before each xvlan_over_bond connectivity check. The delay helps ensure neighbor and route cache have fully converged before verifying connectivity. The sleep interval is kept minimal since check_connection() is invoked nearly 100 times during the test. Fixes: `246af950b9` ("selftests: bonding: add macvlan over bond testing") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251114082014.750edfad@kernel.org Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251127143310.47740-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-28 20:16:22 -08:00
Hoyeon Lee	bd5bdd200c	bpf: Remove runqslower tool runqslower was added in commit `9c01546d26` "tools/bpf: Add runqslower tool to tools/bpf" as a BCC port to showcase early BPF CO-RE + libbpf workflows. runqslower continues to live in BCC (libbpf-tools), so there is no need to keep building and maintaining it. Drop tools/bpf/runqslower and remove all build hooks in tools/bpf and selftests accordingly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Link: https://lore.kernel.org/r/20251126093821.373291-1-hoyeon.lee@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:20:16 -08:00
Amery Hung	a3a60cc120	selftests/bpf: Remove usage of lsm/file_alloc_security in selftest file_alloc_security hook is disabled. Use other LSM hooks in selftests instead. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:18:28 -08:00
Anton Protopopov	7feff23cdf	bpf: force BPF_F_RDONLY_PROG on insn array creation The original implementation added a hack to check_mem_access() to prevent programs from writing into insn arrays. To get rid of this hack, enforce BPF_F_RDONLY_PROG on map creation. Also fix the corresponding selftest, as the error message changes with this patch. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:15:43 -08:00
David Matlack	d721f52e31	vfio: selftests: Add vfio_pci_device_init_perf_test Add a new VFIO selftest for measuring the time it takes to run vfio_pci_device_init() in parallel for one or more devices. This test serves as manual regression test for the performance improvement of commit `e908f58b6b` ("vfio/pci: Separate SR-IOV VF dev_set"). For example, when running this test with 64 VFs under the same PF: Before: $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ... ... Wall time: 6.653234463s Min init time (per device): 0.101215344s Max init time (per device): 6.652755941s Avg init time (per device): 3.377609608s After: $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ... ... Wall time: 0.122978332s Min init time (per device): 0.108121915s Max init time (per device): 0.122762761s Avg init time (per device): 0.113816748s This test does not make any assertions about performance, since any such assertion is likely to be flaky due to system differences and random noise. However this test can be fed into automation to detect regressions, and can be used by developers in the future to measure performance optimizations. Suggested-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-19-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:07 -07:00
David Matlack	b8e96c8805	vfio: selftests: Eliminate INVALID_IOVA Eliminate INVALID_IOVA as there are platforms where UINT64_MAX is a valid iova. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-18-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:07 -07:00
David Matlack	5fabc49abf	vfio: selftests: Split libvfio.h into separate header files Split out the contents of libvfio.h into separate header files, but keep libvfio.h as the top-level include that all tests can use. Put all new header files into a libvfio/ subdirectory to avoid future name conflicts in include paths when libvfio is used by other selftests like KVM. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-17-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:07 -07:00
David Matlack	19cf492c1b	vfio: selftests: Move vfio_selftests_() helpers into libvfio.c Move the vfio_selftests_() helpers into their own file libvfio.c. These helpers have nothing to do with struct vfio_pci_device, so they don't make sense in vfio_pci_device.c. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-16-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:07 -07:00
David Matlack	657d241e69	vfio: selftests: Rename vfio_util.h to libvfio.h Rename vfio_util.h to libvfio.h to match the name of libvfio.mk. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-15-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:07 -07:00
David Matlack	831c37a5bf	vfio: selftests: Stop passing device for IOMMU operations Drop the struct vfio_pci_device wrappers for IOMMU map/unmap functions and require tests to directly call iommu_map(), iommu_unmap(), etc. This results in more concise code, and also makes it clear the map operations are happening on a struct iommu, not necessarily on a specific device, especially when multi-device tests are introduced. Do the same for iova_allocator_init() as that function only needs the struct iommu, not struct vfio_pci_device. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-14-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	2607a43613	vfio: selftests: Move IOVA allocator into iova_allocator.c Move the IOVA allocator into its own file, to provide better separation between the allocator and the struct vfio_pci_device helper code. The allocator could go into iommu.c, but it is standalone enough that a separate file seems cleaner. This also continues the trend of having a .c for every major object in VFIO selftests (vfio_pci_device.c, vfio_pci_driver.c, iommu.c, and now iova_allocator.c). No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-13-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	2aca571089	vfio: selftests: Move IOMMU library code into iommu.c Move all the IOMMU related library code into their own file iommu.c. This provides a better separation between the vfio_pci_device helper code and the iommu code. No function change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-12-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	9a659d74f2	vfio: selftests: Rename struct vfio_dma_region to dma_region Rename struct vfio_dma_region to dma_region. This is in preparation for separating the VFIO PCI device library code from the IOMMU library code. This name change also better reflects the fact that DMA mappings can be managed by either VFIO or IOMMUFD. i.e. the "vfio_" prefix is misleading. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-11-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	28a84da744	vfio: selftests: Upgrade driver logging to dev_err() Upgrade various logging in the VFIO selftests drivers from dev_info() to dev_err(). All of these logs indicate scenarios that may be unexpected. For example, the logging during probing indicates matching devices but that aren't supported by the driver. And the memcpy errors can indicate a problem if the caller was not trying to do something like exercise I/O fault handling. Exercising I/O fault handling is certainly a valid thing to do, but the driver can't infer the caller's expectations, so better to just log with dev_err(). Suggested-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-10-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	c48545442e	vfio: selftests: Prefix logs with device BDF where relevant Prefix log messages with the device's BDF where relevant. This will help understanding VFIO selftests logs when tests are run with multiple devices. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-9-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	6c74d9830d	vfio: selftests: Eliminate overly chatty logging Eliminate overly chatty logs that are printed during almost every test. These logs are adding more noise than value. If a test cares about this information it can log it itself. This is especially true as the VFIO selftests gains support for multiple devices in a single test (which multiplies all these logs). Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-8-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	d8470a775c	vfio: selftests: Support multiple devices in the same container/iommufd Support tests that want to add multiple devices to the same container/iommufd by decoupling struct vfio_pci_device from struct iommu. Multi-devices tests can now put multiple devices in the same container/iommufd like so: iommu = iommu_init(iommu_mode); device1 = vfio_pci_device_init(bdf1, iommu); device2 = vfio_pci_device_init(bdf2, iommu); device3 = vfio_pci_device_init(bdf3, iommu); ... vfio_pci_device_cleanup(device3); vfio_pci_device_cleanup(device2); vfio_pci_device_cleanup(device1); iommu_cleanup(iommu); To account for the new separation of vfio_pci_device and iommu, update existing tests to initialize and cleanup a struct iommu. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-7-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	c9756b4d27	vfio: selftests: Introduce struct iommu Introduce struct iommu, which logically represents either a VFIO container or an iommufd IOAS, depending on which IOMMU mode is used by the test. This will be used in a subsequent commit to allow devices to be added to the same container/iommufd. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-6-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	dd56ef239d	vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode Rename struct vfio_iommu_mode to struct iommu_mode since the mode can include iommufd. This also prepares for splitting out all the IOMMU code into its own structs/helpers/files which are independent from the vfio_pci_device code. No function change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-5-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	6282ca8585	vfio: selftests: Allow passing multiple BDFs on the command line Add support for passing multiple device BDFs to a test via the command line. This is a prerequisite for multi-device tests. Single-device tests can continue using vfio_selftests_get_bdf(), which will continue to return argv[argc - 1] (if it is a BDF string), or the environment variable $VFIO_SELFTESTS_BDF otherwise. For multi-device tests, a new helper called vfio_selftests_get_bdfs() is introduced which will return an array of all BDFs found at the end of argv[], as well as the number of BDFs found (passed back to the caller via argument). The array of BDFs returned does not need to be freed by the caller. The environment variable VFIO_SELFTESTS_BDF continues to support only a single BDF for the time being. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-4-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	fa246a1d06	vfio: selftests: Split run.sh into separate scripts Split run.sh into separate scripts (setup.sh, run.sh, cleanup.sh) to enable multi-device testing, and prepare for VFIO selftests automatically detecting which devices to use for testing by storing device metadata on the filesystem. - setup.sh takes one or more BDFs as arguments and sets up each device. Metadata about each device is stored on the filesystem in the directory: ${TMPDIR:-/tmp}/vfio-selftests-devices Within this directory is a directory for each BDF, and then files in those directories that cleanup.sh uses to cleanup the device. - run.sh runs a selftest by passing it the BDFs of all set up devices. - cleanup.sh takes zero or more BDFs as arguments and cleans up each device. If no BDFs are provided, it cleans up all devices. This split enables multi-device testing by allowing multiple BDFs to be set up and passed into tests: For example: $ tools/testing/selftests/vfio/scripts/setup.sh <BDF1> <BDF2> $ tools/testing/selftests/vfio/scripts/setup.sh <BDF3> $ tools/testing/selftests/vfio/scripts/run.sh echo <BDF1> <BDF2> <BDF3> $ tools/testing/selftests/vfio/scripts/cleanup.sh In the future, VFIO selftests can automatically detect set up devices by inspecting ${TMPDIR:-/tmp}/vfio-selftests-devices. This will avoid the need for the run.sh script. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-3-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
David Matlack	2d5dbd3156	vfio: selftests: Move run.sh into scripts directory Move run.sh in a new sub-directory scripts/. This directory will be used to house various helper scripts to be used by humans and automation for running VFIO selftests. Opportunistically also switch run.sh from TEST_PROGS_EXTENDED to TEST_FILES. The former is for actual test executables that are just not run by default. TEST_FILES is a better fit for helper scripts. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-2-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:58:06 -07:00
Alex Williamson	a63a03afd8	Merge tag 'vfio-v6.18-rc6' into v6.19/vfio/next Merge mainline vfio-selftest updates for ongoing v6.19 work. Signed-off-by: Alex Williamson <alex@shazbot.org>	2025-11-28 10:54:22 -07:00
Mickaël Salaün	54f9baf537	selftests/landlock: Add disconnected leafs and branch test suites Test disconnected directories with two test suites (layout4_disconnected_leafs and layout5_disconnected_branch) and 43 variants to cover the main corner cases. These tests are complementary to the previous commit. Add test_renameat() and test_exchangeat() helpers. Test coverage for security/landlock is 92.1% of 1927 lines according to LLVM 20. Cc: Günther Noack <gnoack@google.com> Cc: Song Liu <song@kernel.org> Cc: Tingmao Wang <m@maowtm.org> Link: https://lore.kernel.org/r/20251128172200.760753-5-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-11-28 18:27:07 +01:00

... 14 15 16 17 18 ...

21684 Commits