linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 05:35:31 -04:00

Author	SHA1	Message	Date
Sukrut Heroorkar	ff86b48d4c	selftests/kvm: remove stale TODO in xapic_state_test The TODO about using the number of vCPUs instead of vcpu.id + 1 was already addressed by commit `376bc1b458` ("KVM: selftests: Don't assume vcpu->id is '0' in xAPIC state test"). The comment is now stale and can be removed. Signed-off-by: Sukrut Heroorkar <hsukrut3@gmail.com> Link: https://lore.kernel.org/r/20250908210547.12748-1-hsukrut3@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-23 08:39:01 -07:00
dongsheng	c435978e4f	KVM: selftests: Handle Intel Atom errata that leads to PMU event overcount Add a PMU errata framework and use it to relax precise event counts on Atom platforms that overcount "Instruction Retired" and "Branch Instruction Retired" events, as the overcount issues on VM-Exit/VM-Entry are impossible to prevent from userspace, e.g. the test can't prevent host IRQs. Setup errata during early initialization and automatically sync the mask to VMs so that tests can check for errata without having to manually manage host=>guest variables. For Intel Atom CPUs, the PMU events "Instruction Retired" or "Branch Instruction Retired" may be overcounted for some certain instructions, like FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD and complex SGX/SMX/CSTATE instructions/flows. The detailed information can be found in the errata (section SRF7): https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/ For the Atom platforms before Sierra Forest (including Sierra Forest), Both 2 events "Instruction Retired" and "Branch Instruction Retired" would be overcounted on these certain instructions, but for Clearwater Forest only "Instruction Retired" event is overcounted on these instructions. Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com> Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Yi Lai <yi1.lai@intel.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250919214648.1585683-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-23 08:38:59 -07:00
Dapeng Mi	2922b59588	KVM: selftests: Validate more arch-events in pmu_counters_test Add support for 5 new architectural events (4 topdown level 1 metrics events and LBR inserts event) that will first show up in Intel's Clearwater Forest CPUs. Detailed info about the new events can be found in SDM section 21.2.7 "Pre-defined Architectural Performance Events". Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Yi Lai <yi1.lai@intel.com> [sean: drop "unavailable_mask" changes] Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250919214648.1585683-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-23 08:38:59 -07:00
Sean Christopherson	1fcd3053aa	KVM: selftests: Reduce number of "unavailable PMU events" combos tested Reduce the number of combinations of unavailable PMU events masks that are testing by the PMU counters test. In reality, testing every possible combination isn't all that interesting, and certainly not worth the tens of seconds (or worse, minutes) of runtime. Fully testing the N^2 space will be especially problematic in the near future, as 5! new arch events are on their way. Use alternating bit patterns (and 0 and -1u) in the hopes that _if_ there is ever a KVM bug, it's not something horribly convoluted that shows up only with a super specific pattern/value. Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250919214648.1585683-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-23 08:38:59 -07:00
Sean Christopherson	571fc2833e	KVM: selftests: Track unavailable_mask for PMU events as 32-bit value Track the mask of "unavailable" PMU events as a 32-bit value. While bits 31:9 are currently reserved, silently truncating those bits is unnecessary and asking for missed coverage. To avoid running afoul of the sanity check in vcpu_set_cpuid_property(), explicitly adjust the mask based on the non-reserved bits as reported by KVM's supported CPUID. Opportunistically update the "all ones" testcase to pass -1u instead of 0xff. Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250919214648.1585683-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-23 08:38:59 -07:00
Dapeng Mi	210b09fa42	KVM: selftests: Add timing_info bit support in vmx_pmu_caps_test A new bit PERF_CAPABILITIES[17] called "PEBS_TIMING_INFO" bit is added to indicated if PEBS supports to record timing information in a new "Retried Latency" field. Since KVM requires user can only set host consistent PEBS capabilities, otherwise the PERF_CAPABILITIES setting would fail, add pebs_timing_info into the "immutable_caps" to block host inconsistent PEBS configuration and cause errors. Opportunistically drop the anythread_deprecated bit. It isn't and likely never was a PERF_CAPABILITIES flag, the test's definition snuck in when the union was copy+pasted from the kernel's definition. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Yi Lai <yi1.lai@intel.com> [sean: call out anythread_deprecated change] Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250919214648.1585683-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-23 08:38:59 -07:00
Mykyta Yatsenko	c6ae18e0af	selftests/bpf: add bpf task work stress tests Add stress tests for BPF task-work scheduling kfuncs. The tests spawn multiple threads that concurrently schedule task_work callbacks against the same and different map values to exercise the kfuncs under high contention. Verify callbacks are reliably enqueued and executed with no drops. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20250923112404.668720-10-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-23 07:34:39 -07:00
Mykyta Yatsenko	39fd74dfd5	selftests/bpf: BPF task work scheduling tests Introducing selftests that check BPF task work scheduling mechanism. Validate that verifier does not accepts incorrect calls to bpf_task_work_schedule kfunc. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250923112404.668720-9-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-23 07:34:39 -07:00
Mykyta Yatsenko	5c8fd7e2b5	bpf: bpf task work plumbing This patch adds necessary plumbing in verifier, syscall and maps to support handling new kfunc bpf_task_work_schedule and kernel structure bpf_task_work. The idea is similar to how we already handle bpf_wq and bpf_timer. verifier changes validate calls to bpf_task_work_schedule to make sure it is safe and expected invariants hold. btf part is required to detect bpf_task_work structure inside map value and store its offset, which will be used in the next patch to calculate key and value addresses. arraymap and hashtab changes are needed to handle freeing of the bpf_task_work: run code needed to deinitialize it, for example cancel task_work callback if possible. The use of bpf_task_work and proper implementation for kfuncs are introduced in the next patch. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250923112404.668720-6-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-23 07:34:38 -07:00
Greg Kroah-Hartman	fc3e44e492	Merge tag 'iio-for-6.18a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next Jonathan writes: IIO: New device support, features and cleanup for 6.18 New device support ================== ad,ade9000 - New driver for this complex energy and power monitoring ADC. infineon,tlv493d - New driver for this 3D magnetic sensor. intel,dollar - New driver for this TI PMIC (part number unknown) marvel,88pm886 - Driver for this PMIC ADC. microchip,mcp9600 - Add explicit support for the mcp9601 which has some additional features over the mcp9600. rohm,bd79112 - New driver for this ADC / GPIO Chip. Features ======== Core - New helper to multiply data expressed in IIO types. - Add KUnit tests. - New IIO_ALTCURRENT type, similar to existing IIO_ALTVOLTAGE - Add some channel modifiers related to energy and power, such as reactive. adi,ad7124 - Support external clocks sources and output of the internal clocks. - Filter control. adi,ad7173 - Add filter support. Some fiddly interactions with other parameters on this device. adi,ad7779 - Add backend support which required control of the number of lanes used. liteon,ltr390 - Add runtime PM support. microchip,mcp9600 - Add support for different thermocouple types. Cleanup and minor fixes ======================= core - Switch info_mask fields to be unsigned. Not clear why they were ever signed. - Fix handling of negative channel scale in iio_convert_raw_to_processed() - Fix offset handling for channels without a scale attribute. - Improve the precision of scaling slightly. - Drop apparent handling of IIO_CHAN_INFO_PROCESSED for devices that don't have any such channels. various - Drop many pm_runtime_mark_last_busy() calls now pm_runtime_put_autosuspend() calls it internally. - Drop dev_err_probe() calls where the error code is hard coded as -ENOMEM as they don't do anything. - Drop dev_err() calls where the error code is -ENOMEM. This will reduce error prints, but memory failures generate a lot of messages anyway so unlikely we need these prints. current-sense-amplifier - Add #io-channels property this channel to be used by a consumer driver. adi,ad7124 - Fix incorrect clocks dt-binding property. - Make the mclk clock optional in DT - this is internal to the ADC so should never have been in he binding. - Fix up sample rate to comply with ABI. - Use read_avail() callback rather than opencoding similar. - Deploy guard() to clean up some lock handling. adi,ad7768 - Use devm_regulator_get_enable_read_voltage() to replace similar code. adi,ad7816 - Drop an unnecessary dev_set_drvdata() call as nothing uses the data. ad,adxl345 - Fix missing blank line before bullet list in documentation. arm,scmi - Use devm_kcalloc() for an array allocation rather than devm_kzalloc(). bosch,bmi270 - Match an ACPI ID seen in the wild. It is not spec compliant but we can't do much about that. bosch,bmp280 - Drop overly noisy dev_info() - Allow for sleeping gpio controllers. gogle,cros-ec - Drop unused location attribute that has been replaced by label. invense,icm42600 - Simplify the power management. - Use guard() to simplify some locking. maxim,max1238 - Add io-channel-cells property to dt-binding as there is an in tree consumer. microchip,mcp9600 - Specify a default value in dt-binding for the thermocouple type - General whitespace cleanup. samsung,exynos - Drop support for the S3C2410 including bindings, and touchscreen support as nothing else uses that. - Drop platform ID based binding as not used. st,vl53l0x - Fix returning the wrong variable in an error path. ti,pac1934 - Replace open coded devm_mutex_init(). xilinx,ams - Update maintainers entry. * tag 'iio-for-6.18a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (178 commits) MAINTAINERS: Support ROHM BD79112 ADC iio: adc: Support ROHM BD79112 ADC/GPIO dt-bindings: iio: adc: ROHM BD79112 ADC/GPIO iio: pressure: bmp280: Use gpiod_set_value_cansleep() iio: pressure: bmp280: Remove noisy dev_info() iio: ABI: add filter types for ad7173 iio: adc: ad7173: support changing filter type iio: adc: ad7173: rename odr field iio: adc: ad7173: rename ad7173_chan_spec_ext_info iio: adc: Add driver for Marvell 88PM886 PMIC ADC dt-bindings: mfd: 88pm886: Add #io-channel-cells iio: ABI: document "sinc4+rej60" filter_type iio: adc: ad7124: add filter support iio: adc: ad7124: support fractional sampling_frequency iio: adc: ad7124: use guard(mutex) to simplify return paths iio: adc: ad7124: use read_avail() for scale_available iio: adc: ad7124: use clamp() iio: adc: ad7124: fix sample rate for multi-channel use Documentation: ABI: iio: add sinc4+lp docs: iio: add documentation for ade9000 driver ...	2025-09-23 14:15:25 +02:00
Jakub Sitnicki	8a8241cdaa	selftests/net: Test tcp port reuse after unbinding a socket Exercise the scenario described in detail in the cover letter: 1) socket A: connect() from ephemeral port X 2) socket B: explicitly bind() to port X 3) check that port X is now excluded from ephemeral ports 4) close socket B to release the port bind 5) socket C: connect() from ephemeral port X As well as a corner case to test that the connect-bind flag is cleared: 1) connect() from ephemeral port X 2) disconnect the socket with connect(AF_UNSPEC) 3) bind() it explicitly to port X 4) check that port X is now excluded from ephemeral ports Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://patch.msgid.link/20250917-update-bind-bucket-state-on-unhash-v5-2-57168b661b47@cloudflare.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-09-23 10:12:15 +02:00
Lorenzo Stoakes	f7a741c53b	mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare() In commit `bb666b7c27` ("mm: add mmap_prepare() compatibility layer for nested file systems") we introduced the ability for stacked drivers and file systems to correctly invoke the f_op->mmap_prepare() handler from an f_op->mmap() handler via a compatibility layer implemented in compat_vma_mmap_prepare(). This populates vm_area_desc fields according to those found in the (not yet fully initialised) VMA passed to f_op->mmap(). However this function implicitly assumes that the struct file which we are operating upon is equal to vma->vm_file. This is not a safe assumption in all cases. The only really sane situation in which this matters would be something like e.g. i915_gem_dmabuf_mmap() which invokes vfs_mmap() against obj->base.filp: ret = vfs_mmap(obj->base.filp, vma); if (ret) return ret; And then sets the VMA's file to this, should the mmap operation succeed: vma_set_file(vma, obj->base.filp); That is - it is the file that is intended to back the VMA mapping. This is not an issue currently, as so far we have only implemented f_op->mmap_prepare() handlers for some file systems and internal mm uses, and the only stacked f_op->mmap() operations that can be performed upon these are those in backing_file_mmap() and coda_file_mmap(), both of which use vma->vm_file. However, moving forward, as we convert drivers to using f_op->mmap_prepare(), this will become a problem. Resolve this issue by explicitly setting desc->file to the provided file parameter and update callers accordingly. Callers are expected to read desc->file and update desc->vm_file - the former will be the file provided by the caller (if stacked, this may differ from vma->vm_file). If the caller needs to differentiate between the two they therefore now can. While we are here, also provide a variant of compat_vma_mmap_prepare() that operates against a pointer to any file_operations struct and does not assume that the file_operations struct we are interested in is file->f_op. This function is __compat_vma_mmap_prepare() and we invoke it from compat_vma_mmap_prepare() so that we share code between the two functions. This is important, because some drivers provide hooks in a separate struct, for instance struct drm_device provides an fops field for this purpose. Also update the VMA selftests accordingly. Link: https://lkml.kernel.org/r/dd0c72df8a33e8ffaa243eeb9b01010b670610e9.1756920635.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Hildenbrand <david@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-22 20:17:11 -07:00
Lorenzo Stoakes	af6703838e	mm: specify separate file and vm_file params in vm_area_desc Patch series "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()", v2. As part of the efforts to eliminate the problematic f_op->mmap callback, a new callback - f_op->mmap_prepare was provided. While we are converting these callbacks, we must deal with 'stacked' filesystems and drivers - those which in their own f_op->mmap callback invoke an inner f_op->mmap callback. To accomodate for this, a compatibility layer is provided that, via vfs_mmap(), detects if f_op->mmap_prepare is provided and if so, generates a vm_area_desc containing the VMA's metadata and invokes the call. So far, we have provided desc->file equal to vma->vm_file. However this is not necessarily valid, especially in the case of stacked drivers which wish to assign a new file after the inner hook is invoked. To account for this, we adjust vm_area_desc to have both file and vm_file fields. The .vm_file field is strictly set to vma->vm_file (or in the case of a new mapping, what will become vma->vm_file). However, .file is set to whichever file vfs_mmap() is invoked with when using the compatibilty layer. Therefore, if the VMA's file needs to be updated in .mmap_prepare, desc->vm_file should be assigned, whilst desc->file should be read. No current f_op->mmap_prepare users assign desc->file so this is safe to do. This makes the .mmap_prepare callback in the context of a stacked filesystem or driver completely consistent with the existing .mmap implementations. While we're here, we do a few small cleanups, and ensure that we const-ify things correctly in the vm_area_desc struct to avoid hooks accidentally trying to assign fields they should not. This patch (of 2): Stacked filesystems and drivers may invoke mmap hooks with a struct file pointer that differs from the overlying file. We will make this functionality possible in a subsequent patch. In order to prepare for this, let's update vm_area_struct to separately provide desc->file and desc->vm_file parameters. The desc->file parameter is the file that the hook is expected to operate upon, and is not assignable (though the hok may wish to e.g. update the file's accessed time for instance). The desc->vm_file defaults to what will become vma->vm_file and is what the hook must reassign should it wish to change the VMA"s vma->vm_file. For now we keep desc->file, vm_file the same to remain consistent. No f_op->mmap_prepare() callback sets a new vma->vm_file currently, so this is safe to change. While we're here, make the mm_struct desc->mm pointers at immutable as well as the desc->mm field itself. As part of this change, also update the single hook which this would otherwise break - mlock_future_ok(), invoked by secretmem_mmap_prepare()). We additionally update set_vma_from_desc() to compare fields in a more logical fashion, checking the (possibly) user-modified fields as the first operand against the existing value as the second one. Additionally, update VMA tests to accommodate changes. Link: https://lkml.kernel.org/r/cover.1756920635.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/3fa15a861bb7419f033d22970598aa61850ea267.1756920635.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Pedro Falcato <pfalcato@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-22 20:17:11 -07:00
KP Singh	b720903e2b	selftests/bpf: Enable signature verification for some lskel tests The test harness uses the verify_sig_setup.sh to generate the required key material for program signing. Generate key material for signing LSKEL some lskel programs and use xxd to convert the verification certificate into a C header file. Finally, update the main test runner to load this certificate into the session keyring via the add_key() syscall before executing any tests. Use the session keyring in the tests with signed programs. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250921160120.9711-6-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-22 19:17:55 -07:00
KP Singh	40863f4d6e	bpftool: Add support for signing BPF programs Two modes of operation being added: Add two modes of operation: * For prog load, allow signing a program immediately before loading. This is essential for command-line testing and administration. bpftool prog load -S -k <private_key> -i <identity_cert> fentry_test.bpf.o * For gen skeleton, embed a pre-generated signature into the C skeleton file. This supports the use of signed programs in compiled applications. bpftool gen skeleton -S -k <private_key> -i <identity_cert> fentry_test.bpf.o Generation of the loader program and its metadata map is implemented in libbpf (bpf_obj__gen_loader). bpftool generates a skeleton that loads the program and automates the required steps: freezing the map, creating an exclusive map, loading, and running. Users can use standard libbpf APIs directly or integrate loader program generation into their own toolchains. Signed-off-by: KP Singh <kpsingh@kernel.org> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/r/20250921160120.9711-5-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-22 19:17:55 -07:00
KP Singh	ea923080c1	libbpf: Embed and verify the metadata hash in the loader To fulfill the BPF signing contract, represented as Sig(I_loader \|\| H_meta), the generated trusted loader program must verify the integrity of the metadata. This signature cryptographically binds the loader's instructions (I_loader) to a hash of the metadata (H_meta). The verification process is embedded directly into the loader program. Upon execution, the loader loads the runtime hash from struct bpf_map i.e. BPF_PSEUDO_MAP_IDX and compares this runtime hash against an expected hash value that has been hardcoded directly by bpf_obj__gen_loader. The load from bpf_map can be improved by calling BPF_OBJ_GET_INFO_BY_FD from the kernel context after BPF_OBJ_GET_INFO_BY_FD has been updated for being called from the kernel context. The following instructions are generated: ld_imm64 r1, const_ptr_to_map // insn[0].src_reg == BPF_PSEUDO_MAP_IDX r2 = (u64 )(r1 + 0); ld_imm64 r3, sha256_of_map_part1 // constant precomputed by bpftool (part of H_meta) if r2 != r3 goto out; r2 = (u64 )(r1 + 8); ld_imm64 r3, sha256_of_map_part2 // (part of H_meta) if r2 != r3 goto out; r2 = (u64 )(r1 + 16); ld_imm64 r3, sha256_of_map_part3 // (part of H_meta) if r2 != r3 goto out; r2 = (u64 )(r1 + 24); ld_imm64 r3, sha256_of_map_part4 // (part of H_meta) if r2 != r3 goto out; ... Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250921160120.9711-4-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-22 19:17:55 -07:00
KP Singh	fb2b0e2901	libbpf: Update light skeleton for signing * The metadata map is created with as an exclusive map (with an excl_prog_hash) This restricts map access exclusively to the signed loader program, preventing tampering by other processes. * The map is then frozen, making it read-only from userspace. * BPF_OBJ_GET_INFO_BY_ID instructs the kernel to compute the hash of the metadata map (H') and store it in bpf_map->sha. * The loader is then loaded with the signature which is then verified by the kernel. loading signed programs prebuilt into the kernel are not currently supported. These can supported by enabling BPF_OBJ_GET_INFO_BY_ID to be called from the kernel. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250921160120.9711-3-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-22 19:17:25 -07:00
KP Singh	3492715683	bpf: Implement signature verification for BPF programs This patch extends the BPF_PROG_LOAD command by adding three new fields to `union bpf_attr` in the user-space API: - signature: A pointer to the signature blob. - signature_size: The size of the signature blob. - keyring_id: The serial number of a loaded kernel keyring (e.g., the user or session keyring) containing the trusted public keys. When a BPF program is loaded with a signature, the kernel: 1. Retrieves the trusted keyring using the provided `keyring_id`. 2. Verifies the supplied signature against the BPF program's instruction buffer. 3. If the signature is valid and was generated by a key in the trusted keyring, the program load proceeds. 4. If no signature is provided, the load proceeds as before, allowing for backward compatibility. LSMs can chose to restrict unsigned programs and implement a security policy. 5. If signature verification fails for any reason, the program is not loaded. Tested-by: syzbot@syzkaller.appspotmail.com Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250921160120.9711-2-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-22 18:58:03 -07:00
Matthieu Baerts (NGI0)	e6c3552945	selftests: mptcp: pm: get server-side flag server-side info linked to the MPTCP connect/established events can now come from the flags, in addition to the dedicated attribute. The attribute is now deprecated -- in favour of the new flag, and will be removed later on. Print this info only once. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250919-net-next-mptcp-server-side-flag-v1-4-a97a5d561a8b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-22 11:51:25 -07:00
Matthieu Baerts (NGI0)	c9809f03c1	mptcp: pm: netlink: only add server-side attr when true This attribute is a boolean. No need to add it to set it to 'false'. Indeed, the default value when this attribute is not set is naturally 'false'. A few bytes can then be saved by not adding this attribute if the connection is not on the server side. This prepares the future deprecation of its attribute, in favour of a new flag. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250919-net-next-mptcp-server-side-flag-v1-1-a97a5d561a8b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-22 11:51:24 -07:00
David Yang	50d51cef55	selftests: forwarding: Reorder (ar)ping arguments to obey POSIX getopt Quoted from musl wiki: GNU getopt permutes argv to pull options to the front, ahead of non-option arguments. musl and the POSIX standard getopt stop processing options at the first non-option argument with no permutation. Thus these scripts stop working on musl since non-option arguments for tools using getopt() (in this case, (ar)ping) do not always come last. Fix it by reordering arguments. Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250919053538.1106753-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-22 11:37:20 -07:00
Linus Torvalds	d4c7fccfa7	Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd fixes from Jason Gunthorpe: "Fix two user triggerable use-after-free issues: - Possible race UAF setting up mmaps - Syzkaller found UAF when erroring an file descriptor creation ioctl due to the fput() work queue" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd/selftest: Update the fail_nth limit iommufd: WARN if an object is aborted with an elevated refcount iommufd: Fix race during abort for file descriptors iommufd: Fix refcounting race during mmap	2025-09-22 11:16:14 -07:00
Sean Christopherson	e8f85d7884	KVM: x86: Don't treat ENTER and LEAVE as branches, because they aren't Remove the IsBranch flag from ENTER and LEAVE in KVM's emulator, as ENTER and LEAVE are stack operations, not branches. Add forced emulation of said instructions to the PMU counters test to prove that KVM diverges from hardware, and to guard against regressions. Opportunistically add a missing "1 MOV" to the selftest comment regarding the number of instructions per loop, which commit `7803339fa9` ("KVM: selftests: Use data load to trigger LLC references/misses in Intel PMU") forgot to add. Fixes: `018d70ffcf` ("KVM: x86: Update vPMCs when retiring branch instructions") Cc: Jim Mattson <jmattson@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://lore.kernel.org/r/20250919004639.1360453-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-09-22 07:14:05 -07:00
Sergey Bashirov	c97b737ef8	sunrpc: Change ret code of xdr_stream_decode_opaque_fixed Since the opaque is fixed in size, the caller already knows how many bytes were decoded, on success. Thus, xdr_stream_decode_opaque_fixed() doesn't need to return that value. And, xdr_stream_decode_u32 and _u64 both return zero on success. This patch simplifies the caller's error checking to avoid potential integer promotion issues. Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-09-21 19:24:50 -04:00
Muhammad Usama Anjum	e75f15fb69	selftests/mm: protection_keys: fix dead code The while loop doesn't execute and following warning gets generated: protection_keys.c:561:15: warning: code will never be executed [-Wunreachable-code] int rpkey = alloc_random_pkey(); Let's enable the while loop such that it gets executed nr_iterations times. Simplify the code a bit as well. Link: https://lkml.kernel.org/r/20250912123025.1271051-3-usama.anjum@collabora.com Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:34 -07:00
Muhammad Usama Anjum	3d5022a0f8	selftests/mm: add -Wunreachable-code and fix warnings Patch series "selftests/mm: Add -Wunreachable-code and fix warnings". Add -Wunreachable-code to selftests and remove dead code from generated warnings. This patch (of 2): Enable -Wunreachable-code flag to catch dead code and fix them. 1. Remove the dead code and write a comment instead: hmm-tests.c:2033:3: warning: code will never be executed [-Wunreachable-code] perror("Should not reach this\n"); ^~~~~~ 2. ksft_exit_fail_msg() calls exit(). So cleanup isn't done. Replace it with ksft_print_msg(). split_huge_page_test.c:301:3: warning: code will never be executed [-Wunreachable-code] goto cleanup; ^~~~~~~~~~~~ 3. Remove duplicate inline. pkey_sighandler_tests.c:44:15: warning: duplicate 'inline' declaration specifier [-Wduplicate-decl-specifier] static inline __always_inline Link: https://lkml.kernel.org/r/20250912123025.1271051-1-usama.anjum@collabora.com Link: https://lkml.kernel.org/r/20250912123025.1271051-2-usama.anjum@collabora.com Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:34 -07:00
Muhammad Usama Anjum	5ea8ab7f93	selftests/mm: centralize the __always_unused macro This macro gets used in different tests. Add it to kselftest.h which is central location and tests use this header. Then use this new macro. Link: https://lkml.kernel.org/r/20250912125102.1309796-1-usama.anjum@collabora.com Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Antonio Quartulli <antonio@openvpn.net> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kacinski <kuba@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: "Sabrina Dubroca" <sd@queasysnail.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Simon Horman <horms@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:33 -07:00
David Hildenbrand	a5883fa942	selftests/mm: gup_tests: option to GUP all pages in a single call We recently missed detecting an issue during early testing because the default (!all) tests would not trigger it and even when running "all" tests it only would happen sometimes because of races. So let's allow for an easy way to specify "GUP all pages in a single call", extend the test matrix and extend our default (!all) tests. By GUP'ing all pages in a single call, with the default size of 128MiB we'll cover multiple leaf page tables / PMDs on architectures with sane THP sizes. Link: https://lkml.kernel.org/r/20250910093051.1693097-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:33 -07:00
Zach O'Keefe	9b375adb39	selftests/mm: remove PROT_EXEC req from file-collapse tests As of v6.8 commit `7fbb5e1882` ("mm: remove VM_EXEC requirement for THP eligibility") thp collapse no longer requires file-backed mappings be created with PROT_EXEC. Remove the overly-strict dependency from thp collapse tests so we test the least-strict requirement for success. Link: https://lkml.kernel.org/r/20250909190534.512801-1-zokeefe@google.com Signed-off-by: Zach O'Keefe <zokeefe@google.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:31 -07:00
Chunyu Hu	c56325259a	selftests/mm: fix va_high_addr_switch.sh failure on x86_64 The test will fail as below on x86_64 with cpu la57 support (will skip if no la57 support). Note, the test requries nr_hugepages to be set first. # running bash ./va_high_addr_switch.sh # ------------------------------------- # mmap(addr_switch_hint - pagesize, pagesize): 0x7f55b60fa000 - OK # mmap(addr_switch_hint - pagesize, (2 * pagesize)): 0x7f55b60f9000 - OK # mmap(addr_switch_hint, pagesize): 0x800000000000 - OK # mmap(addr_switch_hint, 2 * pagesize, MAP_FIXED): 0x800000000000 - OK # mmap(NULL): 0x7f55b60f9000 - OK # mmap(low_addr): 0x40000000 - OK # mmap(high_addr): 0x1000000000000 - OK # mmap(high_addr) again: 0xffff55b6136000 - OK # mmap(high_addr, MAP_FIXED): 0x1000000000000 - OK # mmap(-1): 0xffff55b6134000 - OK # mmap(-1) again: 0xffff55b6132000 - OK # mmap(addr_switch_hint - pagesize, pagesize): 0x7f55b60fa000 - OK # mmap(addr_switch_hint - pagesize, 2 * pagesize): 0x7f55b60f9000 - OK # mmap(addr_switch_hint - pagesize/2 , 2 * pagesize): 0x7f55b60f7000 - OK # mmap(addr_switch_hint, pagesize): 0x800000000000 - OK # mmap(addr_switch_hint, 2 * pagesize, MAP_FIXED): 0x800000000000 - OK # mmap(NULL, MAP_HUGETLB): 0x7f55b5c00000 - OK # mmap(low_addr, MAP_HUGETLB): 0x40000000 - OK # mmap(high_addr, MAP_HUGETLB): 0x1000000000000 - OK # mmap(high_addr, MAP_HUGETLB) again: 0xffff55b5e00000 - OK # mmap(high_addr, MAP_FIXED \| MAP_HUGETLB): 0x1000000000000 - OK # mmap(-1, MAP_HUGETLB): 0x7f55b5c00000 - OK # mmap(-1, MAP_HUGETLB) again: 0x7f55b5a00000 - OK # mmap(addr_switch_hint - pagesize, 2hugepagesize, MAP_HUGETLB): 0x800000000000 - FAILED # mmap(addr_switch_hint , 2hugepagesize, MAP_FIXED \| MAP_HUGETLB): 0x800000000000 - OK # [FAIL] addr_switch_hint is defined as DFEFAULT_MAP_WINDOW in the failed test (for x86_64, DFEFAULT_MAP_WINDOW is defined as (1UL<<47) - pagesize) in 64 bit. Before commit `cc92882ee2` ("mm: drop hugetlb_get_unmapped_area{_} functions"), for x86_64 hugetlb_get_unmapped_area() is handled in arch code arch/x86/mm/hugetlbpage.c and addr is checked with map_address_hint_valid() after align with 'addr &= huge_page_mask(h)' which is a round down way, and it will fail the check because the addr is within the DEFAULT_MAP_WINDOW but (addr + len) is above the DFEFAULT_MAP_WINDOW. So it wil go through the hugetlb_get_unmmaped_area_top_down() to find an area within the DFEFAULT_MAP_WINDOW. After commit `cc92882ee2` ("mm: drop hugetlb_get_unmapped_area{_} functions"). The addr hint for hugetlb_get_unmmaped_area() will be rounded up and aligned to hugepage size with ALIGN() for all arches. And after the align, the addr will be above the default MAP_DEFAULT_WINDOW, and the map_addresshint_valid() check will pass because both aligned addr (addr0) and (addr + len) are above the DEFAULT_MAP_WINDOW, and the aligned hint address (0x800000000000) is returned as an suitable gap is found there, in arch_get_unmapped_area_topdown(). To still cover the case that addr is within the DEFAULT_MAP_WINDOW, and addr + len is above the DFEFAULT_MAP_WINDOW, change to choose the last hugepage aligned address within the DEFAULT_MAP_WINDOW as the hint addr, and the addr + len (2 hugepages) will be one hugepage above the DEFAULT_MAP_WINDOW. An aligned address won't be affected by the page round up or round down from kernel, so it's determistic. Link: https://lkml.kernel.org/r/20250912013711.3002969-4-chuhu@redhat.com Fixes: `cc92882ee2` ("mm: drop hugetlb_get_unmapped_area{_*} functions") Signed-off-by: Chunyu Hu <chuhu@redhat.com> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:29 -07:00
Chunyu Hu	d9d957bd7b	selftests/mm: alloc hugepages in va_high_addr_switch test Alloc hugepages in the test internally, so we don't fully rely on the run_vmtests.sh. If run_vmtests.sh does that great, free hugepages is enough for being used to run the test, leave it as it is, otherwise setup the hugepages in the test. Save the original nr_hugepages value and restore it after test finish, so leave a stable test envronment. Link: https://lkml.kernel.org/r/20250912013711.3002969-3-chuhu@redhat.com Signed-off-by: Chunyu Hu <chuhu@redhat.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:28 -07:00
Chunyu Hu	6e296bcf29	selftests/mm: fix hugepages cleanup too early Patch series "Fix va_high_addr_switch.sh test failure", v3. These three patches fix the va_high_addr_switch.sh test failure on x86_64. Patch 1 fixes the hugepage setup issue that nr_hugepages is reset too early in run_vmtests.sh and break the later va_high_addr_switch testing. Patch 2 adds hugepage setup in va_high_addr_switch test, so that it can still work if vm_runtests.sh changes the hugepage setup someday. Patch 3 fixes the test failure caused by the hint addr align method change in hugetlb_get_unmapped_area(). This patch (of 3): The nr_hugepgs variable is used to keep the original nr_hugepages at the hugepage setup step at test beginning. After userfaultfd test, a cleaup is executed, both /sys/kernel/mm/hugepages/hugepages-*/nr_hugepages and /proc/sys//vm/nr_hugepages are reset to 'original' value before userfaultfd test starts. Issue here is the value used to restore /proc/sys/vm/nr_hugepages is nr_hugepgs which is the initial value before the vm_runtests.sh runs, not the value before userfaultfd test starts. 'va_high_addr_swith.sh' tests runs after that will possibly see no hugepages available for test, and got EINVAL when mmap(HUGETLB), making the result invalid. And before pkey tests, nr_hugepgs is changed to be used as a temp variable to save nr_hugepages before pkey test, and restore it after pkey tests finish. The original nr_hugepages value is not tracked anymore, so no way to restore it after all tests finish. Add a new variable orig_nr_hugepgs to save the original nr_hugepages, and and restore it to nr_hugepages after all tests finish. And change to use the nr_hugepgs variable to save the /proc/sys/vm/nr_hugeages after hugepage setup, it's also the value before userfaultfd test starts, and the correct value to be restored after userfaultfd finishes. The va_high_addr_switch.sh broken will be resolved. Link: https://lkml.kernel.org/r/20250912013711.3002969-1-chuhu@redhat.com Link: https://lkml.kernel.org/r/20250912013711.3002969-2-chuhu@redhat.com Signed-off-by: Chunyu Hu <chuhu@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:28 -07:00
David Hildenbrand	24a3c7af3b	selftests/mm: split_huge_page_test: cleanups for split_pte_mapped_thp test There is room for improvement, so let's clean up a bit: (1) Define "4" as a constant. (2) SKIP if we fail to allocate all THPs (e.g., fragmented) and add recovery code for all other failure cases: no need to exit the test. (3) Rename "len" to thp_area_size, and "one_page" to "thp_area". (4) Allocate a new area "page_area" into which we will mremap the pages; add "page_area_size". Now we can easily merge the two mremap instances into a single one. (5) Iterate THPs instead of bytes when checking for missed THPs after mremap. (6) Rename "pte_mapped2" to "tmp", used to verify mremap(MAP_FIXED) result. (7) Split the corruption test from the failed-split test, so we can just iterate bytes vs. thps naturally. (8) Extend comments and clarify why we are using mremap in the first place. Link: https://lkml.kernel.org/r/20250903070253.34556-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:20 -07:00
David Hildenbrand	0d0e03d5b8	selftests/mm: split_huge_page_test: fix occasional is_backed_by_folio() wrong results Patch series "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements", v2. One fix for occasional failures I found while testing and a bunch of cleanups that should make that test easier to digest. This patch (of 2): When checking for actual tail or head pages of a folio, we must make sure that the KPF_COMPOUND_HEAD/KPF_COMPOUND_TAIL flag is paired with KPF_THP. For example, if we have another large folio after our large folio in physical memory, our "pfn_flags & (KPF_THP \| KPF_COMPOUND_TAIL)" would trigger even though it's actually a head page of the next folio. If is_backed_by_folio() returns a wrong result, split_pte_mapped_thp() can fail with "Some THPs are missing during mremap". Fix it by checking for head/tail pages of folios properly. Add folio_tail_flags/folio_head_flags to improve readability and use these masks also when just testing for any compound page. Link: https://lkml.kernel.org/r/20250903070253.34556-1-david@redhat.com Link: https://lkml.kernel.org/r/20250903070253.34556-2-david@redhat.com Fixes: 169b456b0162 ("selftests/mm: reimplement is_backed_by_thp() with more precise check") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:20 -07:00
David Hildenbrand	84efbefa26	mm: remove nth_page() Now that all users are gone, let's remove it. Link: https://lkml.kernel.org/r/20250901150359.867252-38-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:10 -07:00
David Hildenbrand	3b864c8f55	wireguard: selftests: remove CONFIG_SPARSEMEM_VMEMMAP=y from qemu kernel config It's no longer user-selectable (and the default was already "y"), so let's just drop it. It was never really relevant to the wireguard selftests either way. Link: https://lkml.kernel.org/r/20250901150359.867252-6-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:01 -07:00
Kaushlendra Kumar	4fa5b88e77	tools/mm/slabinfo: fix access to null terminator in string boundary The current code incorrectly accesses buffer[strlen(buffer)], which points to the null terminator ('\0') at the end of the string. This is technically out-of-bounds access since valid string content ends at index strlen(buffer)-1. Fix by: 1. Declaring strlen() result variable at function scope 2. Adding bounds check (len > 0) to handle empty strings 3. Using buffer[len-1] to correctly access the last character before the null terminator [kaushlendra.kumar@intel.com: remove unnecessary blank line] Link: https://lkml.kernel.org/r/20250901044955.3902815-1-kaushlendra.kumar@intel.com Link: https://lkml.kernel.org/r/20250830172022.1927448-1-kaushlendra.kumar@intel.com Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Acked-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:22:00 -07:00
Ujwal Kundur	4dfd4bba85	selftests/mm/uffd: refactor non-composite global vars into struct Refactor macros and non-composite global variable definitions into a struct that is defined at the start of a test and is passed around instead of relying on global vars. Link: https://lkml.kernel.org/r/20250829155600.2000-1-ujwal.kundur@gmail.com Signed-off-by: Ujwal Kundur <ujwal.kundur@gmail.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Brendan Jackman <jackmanb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:21:59 -07:00
Johannes Weiner	2ccd9fecd9	mm: remove unused zpool layer With zswap using zsmalloc directly, there are no more in-tree users of this code. Remove it. With zpool gone, zsmalloc is now always a simple dependency and no longer something the user needs to configure. Hide CONFIG_ZSMALLOC from the user and have zswap and zram pull it in as needed. Link: https://lkml.kernel.org/r/20250829162212.208258-3-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: SeongJae Park <sj@kernel.org> Acked-by: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Vitaly Wool <vitaly.wool@konsulko.se> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:21:59 -07:00
Uday Shankar	a3835a4410	selftests: ublk: fix behavior when fio is not installed Some ublk selftests have strange behavior when fio is not installed. While most tests behave correctly (run if they don't need fio, or skip if they need fio), the following tests have different behavior: - test_null_01, test_null_02, test_generic_01, test_generic_02, and test_generic_12 try to run fio without checking if it exists first, and fail on any failure of the fio command (including "fio command not found"). So these tests fail when they should skip. - test_stress_05 runs fio without checking if it exists first, but doesn't fail on fio command failure. This test passes, but that pass is misleading as the test doesn't do anything useful without fio installed. So this test passes when it should skip. Fix these issues by adding _have_program fio checks to the top of all of these tests. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-09-20 23:18:23 -06:00
Yonghong Song	5a427fddec	selftests/bpf: Fix selftest verifier_arena_large failure With latest llvm22, I got the following verification failure: ... ; int big_alloc2(void ctx) @ verifier_arena_large.c:207 0: (b4) w6 = 1 ; R6_w=1 ... ; if (err) @ verifier_arena_large.c:233 53: (56) if w6 != 0x0 goto pc+62 ; R6=0 54: (b7) r7 = -4 ; R7_w=-4 55: (18) r8 = 0x7f4000000000 ; R8_w=scalar() 57: (bf) r9 = addr_space_cast(r8, 0, 1) ; R8_w=scalar() R9_w=arena 58: (b4) w6 = 5 ; R6_w=5 ; pg = page[i]; @ verifier_arena_large.c:238 59: (bf) r1 = r7 ; R1_w=-4 R7_w=-4 60: (07) r1 += 4 ; R1_w=0 61: (79) r2 = (u64 )(r9 +0) ; R2_w=scalar() R9_w=arena ; if (pg != i) @ verifier_arena_large.c:239 62: (bf) r3 = addr_space_cast(r2, 0, 1) ; R2_w=scalar() R3_w=arena 63: (71) r3 = (u8 )(r3 +0) ; R3_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) 64: (5d) if r1 != r3 goto pc+51 ; R1_w=0 R3_w=0 ; bpf_arena_free_pages(&arena, (void __arena )pg, 2); @ verifier_arena_large.c:241 65: (18) r1 = 0xff11000114548000 ; R1_w=map_ptr(map=arena,ks=0,vs=0) 67: (b4) w3 = 2 ; R3_w=2 68: (85) call bpf_arena_free_pages#72675 ; 69: (b7) r1 = 0 ; R1_w=0 ; page[i + 1] = NULL; @ verifier_arena_large.c:243 70: (7b) (u64 *)(r8 +8) = r1 R8 invalid mem access 'scalar' processed 61 insns (limit 1000000) max_states_per_insn 0 total_states 6 peak_states 6 mark_read 2 ============= #489/5 verifier_arena_large/big_alloc2:FAIL The main reason is that 'r8' in insn '70' is not an arena pointer. Further debugging at llvm side shows that llvm commit ([1]) caused the failure. For the original code: page[i] = NULL; page[i + 1] = NULL; the llvm transformed it to something like below at source level: __builtin_memset(&page[i], 0, 16) Such transformation prevents llvm BPFCheckAndAdjustIR pass from generating proper addr_space_cast insns ([2]). Adding support in llvm BPFCheckAndAdjustIR pass should work, but not sure that such a pattern exists or not in real applications. At the same time, simply adding a memory barrier between two 'page' assignment can fix the issue. [1] https://github.com/llvm/llvm-project/pull/155415 [2] https://github.com/llvm/llvm-project/pull/84410 Cc: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250920045805.3288551-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-20 15:49:35 -07:00
Colin Ian King	4386f71623	selftest/futex: Fix spelling mistake "boundarie" -> "boundary" There is a spelling mistake in a test message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:23:13 +02:00
André Almeida	520db0559d	selftests/futex: Remove logging.h file Every futex selftest uses the kselftest_harness.h helper and don't need the logging.h file. Delete it. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:56 +02:00
André Almeida	b257d91c4d	selftests/futex: Drop logging.h include from futex_numa futex_numa doesn't really use logging.h helpers, it's only need two includes from this file. So drop it and include the two missing includes. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:56 +02:00
André Almeida	d35ca2f642	selftests/futex: Refactor futex_numa_mpol with kselftest_harness.h To reduce the boilerplate code, refactor futex_numa_mpol test to use kselftest_harness header instead of futex's logging header. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:55 +02:00
André Almeida	4ba629e6c6	selftests/futex: Refactor futex_priv_hash with kselftest_harness.h To reduce the boilerplate code, refactor futex_priv_hash test to use kselftest_harness header instead of futex's logging header. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:55 +02:00
André Almeida	a91e8e372e	selftests/futex: Refactor futex_waitv with kselftest_harness.h To reduce the boilerplate code, refactor futex_waitv test to use kselftest_harness header instead of futex's logging header. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:55 +02:00
André Almeida	f341a20f6d	selftests/futex: Refactor futex_requeue with kselftest_harness.h To reduce the boilerplate code, refactor futex_requeue test to use kselftest_harness header instead of futex's logging header. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:55 +02:00
André Almeida	e5c04d0f3e	selftests/futex: Refactor futex_wait with kselftest_harness.h To reduce the boilerplate code, refactor futex_wait test to use kselftest_harness header instead of futex's logging header. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:55 +02:00
André Almeida	14d016bd72	selftests/futex: Refactor futex_wait_private_mapped_file with kselftest_harness.h To reduce the boilerplate code, refactor futex_wait_private_mapped_file test to use kselftest_harness header instead of futex's logging header. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2025-09-20 18:11:54 +02:00

... 8 9 10 11 12 ...

49385 Commits