linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 19:35:23 -04:00

Author	SHA1	Message	Date
shechenglong	7f16357378	arm64: proton-pack: Fix hard lockup due to print in scheduler context Relocate the printk() calls from spectre_v4_mitigations_off() and spectre_v2_mitigations_off() into setup_system_capabilities() function, preventing hard lockups caused by printk calls in scheduler context: \| _raw_spin_lock_nested+168 \| ttwu_queue+180 （rq_lock(rq, &rf); 2nd acquiring the rq->__lock） \| try_to_wake_up+548 \| wake_up_process+32 \| __up+88 \| up+100 \| __up_console_sem+96 \| console_unlock+696 \| vprintk_emit+428 \| vprintk_default+64 \| vprintk_func+220 \| printk+104 \| spectre_v4_enable_task_mitigation+344 \| __switch_to+100 \| __schedule+1028 (rq_lock(rq, &rf); 1st acquiring the rq->__lock) \| schedule_idle+48 \| do_idle+388 \| cpu_startup_entry+44 \| secondary_start_kernel+352 Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: shechenglong <shechenglong@xfusion.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:49:12 +00:00
shechenglong	62e72463ca	arm64: proton-pack: Drop print when !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY Following the pattern established with other Spectre mitigations, do not print a message when the CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY Kconfig option is disabled. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: shechenglong <shechenglong@xfusion.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:49:09 +00:00
Ryan Roberts	53357f14f9	arm64: mm: Tidy up force_pte_mapping() Tidy up the implementation of force_pte_mapping() to make it easier to read and introduce the split_leaf_mapping_possible() helper to reduce code duplication in split_kernel_leaf_mapping() and arch_kfence_init_pool(). Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:43:15 +00:00
Ryan Roberts	40a292f701	arm64: mm: Optimize range_split_to_ptes() Enter lazy_mmu mode while splitting a range of memory to pte mappings. This causes barriers, which would otherwise be emitted after every pte (and pmd/pud) write, to be deferred until exiting lazy_mmu mode. For large systems, this is expected to significantly speed up fallback to pte-mapping the linear map for the case where the boot CPU has BBML2_NOABORT, but secondary CPUs do not. I haven't directly measured it, but this is equivalent to commit `1fcb7cea8a` ("arm64: mm: Batch dsb and isb when populating pgtables"). Note that for the path from arch_kfence_init_pool(), we may sleep while allocating memory inside the lazy_mmu mode. Sleeping is not allowed by generic code inside lazy_mmu, but we know that the arm64 implementation is sleep-safe. So this is ok and follows the same pattern already used by split_kernel_leaf_mapping(). Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:43:15 +00:00
Ryan Roberts	ce2b3a50ad	arm64: mm: Don't sleep in split_kernel_leaf_mapping() when in atomic context It has been reported that split_kernel_leaf_mapping() is trying to sleep in non-sleepable context. It does this when acquiring the pgtable_split_lock mutex, when either CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE are enabled, which change linear map permissions within softirq context during memory allocation and/or freeing. All other paths into this function are called from sleepable context and so are safe. But it turns out that the memory for which these 2 features may attempt to modify the permissions is always mapped by pte, so there is no need to attempt to split the mapping. So let's exit early in these cases and avoid attempting to take the mutex. There is one wrinkle to this approach; late-initialized kfence allocates it's pool from the buddy which may be block mapped. So we must hook that allocation and convert it to pte-mappings up front. Previously this was done as a side-effect of kfence protecting all the individual pages in its pool at init-time, but this no longer works due to the added early exit path in split_kernel_leaf_mapping(). So instead, do this via the existing arch_kfence_init_pool() arch hook, and reuse the existing linear_map_split_to_ptes() infrastructure. Closes: https://lore.kernel.org/all/f24b9032-0ec9-47b1-8b95-c0eeac7a31c5@roeck-us.net/ Fixes: `a166563e7e` ("arm64: mm: support large block mapping when rodata=full") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <groeck@google.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:43:15 +00:00
Yang Shi	0ec364c0c9	arm64: kprobes: check the return value of set_memory_rox() Since commit `a166563e7e` ("arm64: mm: support large block mapping when rodata=full"), __change_memory_common has more chance to fail due to memory allocation failure when splitting page table. So check the return value of set_memory_rox(), then bail out if it fails otherwise we may have RW memory mapping for kprobes insn page. Fixes: `195a1b7d83` ("arm64: kprobes: call set_memory_rox() for kprobe page") Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:30:22 +00:00
Punit Agrawal	7991fda619	arm64: acpi: Drop message logging SPCR default console Commit `f5a4af3c75` ("ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64") introduced a command line parameter to prevent using SPCR provided console as default. It also introduced a message to log this choice. Drop the message as it is not particularly useful and can be incorrect in situations where no SPCR is provided by the firmware. Link: https://lore.kernel.org/all/aQN0YWUYaPYWpgJM@willie-the-truck/ Signed-off-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:23:33 +00:00
Punit Agrawal	eeb8c19896	Revert "ACPI: Suppress misleading SPCR console message when SPCR table is absent" This reverts commit `bad3fa2fb9`. Commit `bad3fa2fb9` ("ACPI: Suppress misleading SPCR console message when SPCR table is absent") mistakenly assumes acpi_parse_spcr() returning 0 to indicate a failure to parse SPCR. While addressing the resultant incorrect logging it was deemed that dropping the message is a better approach as it is not particularly useful. Roll back the commit introducing the bug as a step towards dropping the log message. Link: https://lore.kernel.org/all/aQN0YWUYaPYWpgJM@willie-the-truck/ Signed-off-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:23:33 +00:00
Catalin Marinas	535fdfc5a2	arm64: Use load LSE atomics for the non-return per-CPU atomic operations The non-return per-CPU this_cpu_() atomic operations are implemented as STADD/STCLR/STSET when FEAT_LSE is available. On many microarchitecture implementations, these instructions tend to be executed "far" in the interconnect or memory subsystem (unless the data is already in the L1 cache). This is in general more efficient when there is contention as it avoids bouncing cache lines between CPUs. The load atomics (e.g. LDADD without XZR as destination), OTOH, tend to be executed "near" with the data loaded into the L1 cache. STADD executed back to back as in srcu_read_{lock,unlock}() incur an additional overhead due to the default posting behaviour on several CPU implementations. Since the per-CPU atomics are unlikely to be used concurrently on the same memory location, encourage the hardware to to execute them "near" by issuing load atomics - LDADD/LDCLR/LDSET - with the destination register unused (but not XZR). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop Reported-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Palmer Dabbelt <palmer@dabbelt.com> [will: Add comment and link to the discussion thread] Signed-off-by: Will Deacon <will@kernel.org>	2025-11-07 14:20:07 +00:00
Mario Limonciello (AMD)	d23550efc6	x86/microcode/AMD: Add more known models to entry sign checking Two Zen5 systems are missing from need_sha_check(). Add them. Fixes: `50cef76d5c` ("x86/microcode/AMD: Load only SHA256-checksummed patches") Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://patch.msgid.link/20251106182904.4143757-1-superm1@kernel.org	2025-11-07 12:12:21 +01:00
Linus Torvalds	225a97d6d4	Merge tag 'riscv-for-linus-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - A fix to disable KASAN checks while walking a non-current task's stackframe (following x86) - A fix for a kvrealloc()-related memory leak in module_frob_arch_sections() - Two replacements of strcpy() with strscpy() - A change to use the RISC-V .insn assembler directive when possible to assemble instructions from hex opcodes - Some low-impact fixes in the ptdump code and kprobes test code * tag 'riscv-for-linus-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: cpuidle: riscv-sbi: Replace deprecated strcpy in sbi_cpuidle_init_cpu riscv: KGDB: Replace deprecated strcpy in kgdb_arch_handle_qxfer_pkt riscv: asm: use .insn for making custom instructions riscv: tests: Make RISCV_KPROBES_KUNIT tristate riscv: tests: Rename kprobes_test_riscv to kprobes_riscv riscv: Fix memory leak in module_frob_arch_sections() riscv: ptdump: use seq_puts() in pt_dump_seq_puts() macro riscv: stacktrace: Disable KASAN checks for non-current tasks	2025-11-06 15:44:18 -08:00
Linus Torvalds	c90841db35	Merge tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: "This is a work-around for a (now fixed) corner case in the arm32 build with Clang KCFI enabled. - Introduce __nocfi_generic for arm32 Clang (Nathan Chancellor)" * tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: libeth: xdp: Disable generic kCFI pass for libeth_xdp_tx_xmit_bulk() ARM: Select ARCH_USES_CFI_GENERIC_LLVM_PASS compiler_types: Introduce __nocfi_generic	2025-11-06 11:54:59 -08:00
Sukrit Bhatnagar	d0164c1619	KVM: VMX: Fix check for valid GVA on an EPT violation On an EPT violation, bit 7 of the exit qualification is set if the guest linear-address is valid. The derived page fault error code should not be checked for this bit. Fixes: `f300948251` ("KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid") Cc: stable@vger.kernel.org Signed-off-by: Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://patch.msgid.link/20251106052853.3071088-1-Sukrit.Bhatnagar@sony.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-06 06:06:18 -08:00
Jiri Olsa	20a0bc1027	x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe Currently we don't get stack trace via ORC unwinder on top of fgraph exit handler. We can see that when generating stacktrace from kretprobe_multi bpf program which is based on fprobe/fgraph. The reason is that the ORC unwind code won't get pass the return_to_handler callback installed by fgraph return probe machinery. Solving this by creating stack frame in return_to_handler expected by ftrace_graph_ret_addr function to recover original return address and continue with the unwind. Also updating the pt_regs data with cs/flags/rsp which are needed for successful stack retrieval from ebpf bpf_get_stackid helper. - in get_perf_callchain we check user_mode(regs) so CS has to be set - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS/FIXED has to be unset Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00
Jiri Olsa	6d08340d1e	Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()" This reverts commit `83f44ae0f8`. Currently we store initial stacktrace entry twice for non-HW ot_regs, which means callers that fail perf_hw_regs(regs) condition in perf_callchain_kernel. It's easy to reproduce this bpftrace: # bpftrace -e 'tracepoint:sched:sched_process_exec { print(kstack()); }' Attaching 1 probe... bprm_execve+1767 bprm_execve+1767 do_execveat_common.isra.0+425 __x64_sys_execve+56 do_syscall_64+133 entry_SYSCALL_64_after_hwframe+118 When perf_callchain_kernel calls unwind_start with first_frame, AFAICS we do not skip regs->ip, but it's added as part of the unwind process. Hence reverting the extra perf_callchain_store for non-hw regs leg. I was not able to bisect this, so I'm not really sure why this was needed in v5.2 and why it's not working anymore, but I could see double entries as far as v5.10. I did the test for both ORC and framepointer unwind with and without the this fix and except for the initial entry the stacktraces are the same. Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00
Linus Torvalds	dc77806cf3	Merge tag 'rust-fixes-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull rust fixes from Miguel Ojeda: - Fix/workaround a couple Rust 1.91.0 build issues when sanitizers are enabled due to extra checking performed by the compiler and an upstream issue already fixed for Rust 1.93.0 - Fix future Rust 1.93.0 builds by supporting the stabilized name for the 'no-jump-tables' flag - Fix a couple private/broken intra-doc links uncovered by the future move of pin-init to 'syn' * tag 'rust-fixes-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: rust: kbuild: support `-Cjump-tables=n` for Rust 1.93.0 rust: kbuild: workaround `rustdoc` doctests modifier bug rust: kbuild: treat `build_error` and `rustdoc` as kernel objects rust: condvar: fix broken intra-doc link rust: devres: fix private intra-doc link	2025-11-05 11:15:36 -08:00
Linus Torvalds	284922f4c5	x86: uaccess: don't use runtime-const rewriting in modules The runtime-const infrastructure was never designed to handle the modular case, because the constant fixup is only done at boot time for core kernel code. But by the time I used it for the x86-64 user space limit handling in commit `86e6b1547b` ("x86: fix user address masking non-canonical speculation issue"), I had completely repressed that fact. And it all happens to work because the only code that currently actually gets inlined by modules is for the access_ok() limit check, where the default constant value works even when not fixed up. Because at least I had intentionally made it be something that is in the non-canonical address space region. But it's technically very wrong, and it does mean that at least in theory, the use of 'access_ok()' + '__get_user()' can trigger the same speculation issue with non-canonical addresses that the original commit was all about. The pattern is unusual enough that this probably doesn't matter in practice, but very wrong is still very wrong. Also, let's fix it before the nice optimized scoped user accessor helpers that Thomas Gleixner is working on cause this pseudo-constant to then be more widely used. This all came up due to an unrelated discussion with Mateusz Guzik about using the runtime const infrastructure for names_cachep accesses too. There the modular case was much more obviously broken, and Mateusz noted it in his 'v2' of the patch series. That then made me notice how broken 'access_ok()' had been in modules all along. Mea culpa, mea maxima culpa. Fix it by simply not using the runtime-const code in modules, and just using the USER_PTR_MAX variable value instead. This is not performance-critical like the core user accessor functions (get_user() and friends) are. Also make sure this doesn't get forgotten the next time somebody wants to do runtime constant optimizations by having the x86 runtime-const.h header file error out if included by modules. Fixes: `86e6b1547b` ("x86: fix user address masking non-canonical speculation issue") Acked-by: Borislav Petkov <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Triggered-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/all/20251030105242.801528-1-mjguzik@gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-11-05 10:24:36 +09:00
Miguel Ojeda	789521b471	rust: kbuild: support `-Cjump-tables=n` for Rust 1.93.0 Rust 1.93.0 (expected 2026-01-22) is stabilizing `-Zno-jump-tables` [1][2] as `-Cjump-tables=n` [3]. Without this change, one would eventually see: RUSTC L rust/core.o error: unknown unstable option: `no-jump-tables` Thus support the upcoming version. Link: https://github.com/rust-lang/rust/issues/116592 [1] Link: https://github.com/rust-lang/rust/pull/105812 [2] Link: https://github.com/rust-lang/rust/pull/145974 [3] Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Trevor Gross <tmgross@umich.edu> Acked-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/20251101094011.1024534-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2025-11-04 19:11:39 +01:00
Maxim Levitsky	fd92bd3b44	KVM: SVM: switch to raw spinlock for svm->ir_list_lock Use a raw spinlock for vcpu_svm.ir_list_lock as the lock can be taken during schedule() via kvm_sched_out() => __avic_vcpu_put(), and "normal" spinlocks are sleepable locks when PREEMPT_RT=y. This fixes the following lockdep warning: ============================= [ BUG: Invalid wait context ] 6.12.0-146.1640_2124176644.el10.x86_64+debug #1 Not tainted ----------------------------- qemu-kvm/38299 is trying to lock: ff11000239725600 (&svm->ir_list_lock){....}-{3:3}, at: __avic_vcpu_put+0xfd/0x300 [kvm_amd] other info that might help us debug this: context-{5:5} 2 locks held by qemu-kvm/38299: #0: ff11000239723ba8 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x240/0xe00 [kvm] #1: ff11000b906056d8 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2e/0x130 stack backtrace: CPU: 1 UID: 0 PID: 38299 Comm: qemu-kvm Kdump: loaded Not tainted 6.12.0-146.1640_2124176644.el10.x86_64+debug #1 PREEMPT(voluntary) Hardware name: AMD Corporation QUARTZ/QUARTZ, BIOS RQZ100AB 09/14/2023 Call Trace: <TASK> dump_stack_lvl+0x6f/0xb0 __lock_acquire+0x921/0xb80 lock_acquire.part.0+0xbe/0x270 _raw_spin_lock_irqsave+0x46/0x90 __avic_vcpu_put+0xfd/0x300 [kvm_amd] svm_vcpu_put+0xfa/0x130 [kvm_amd] kvm_arch_vcpu_put+0x48c/0x790 [kvm] kvm_sched_out+0x161/0x1c0 [kvm] prepare_task_switch+0x36b/0xf60 __schedule+0x4f7/0x1890 schedule+0xd4/0x260 xfer_to_guest_mode_handle_work+0x54/0xc0 vcpu_run+0x69a/0xa70 [kvm] kvm_arch_vcpu_ioctl_run+0xdc0/0x17e0 [kvm] kvm_vcpu_ioctl+0x39f/0xe00 [kvm] Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://patch.msgid.link/20251030194130.307900-1-mlevitsk@redhat.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-04 09:14:28 -08:00
Sean Christopherson	aaac099459	KVM: SVM: Make avic_ga_log_notifier() local to avic.c Make amd_iommu_register_ga_log_notifier() a local symbol now that it's defined and used purely within avic.c. No functional change intended. Fixes: `4bdec12aa8` ("KVM: SVM: Detect X2APIC virtualization (x2AVIC) support") Link: https://patch.msgid.link/20251016190643.80529-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-04 09:14:28 -08:00
Sean Christopherson	adc6ae9729	KVM: SVM: Unregister KVM's GALog notifier on kvm-amd.ko exit Unregister the GALog notifier (used to get notified of wake events for blocking vCPUs) on kvm-amd.ko exit so that a KVM or IOMMU driver bug that results in a spurious GALog event "only" results in a spurious IRQ, and doesn't trigger a use-after-free due to executing unloaded module code. Fixes: `5881f73757` ("svm: Introduce AMD IOMMU avic_ga_log_notifier") Reported-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Closes: https://lore.kernel.org/all/20250918130320.GA119526@k08j02272.eu95sqa Link: https://patch.msgid.link/20251016190643.80529-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-04 09:14:27 -08:00
Sean Christopherson	59a217ced3	KVM: SVM: Initialize per-CPU svm_data at the end of hardware setup Setup the per-CPU SVM data structures at the very end of hardware setup so that svm_hardware_unsetup() can be used in svm_hardware_setup() to unwind AVIC setup (for the GALog notifier). Alternatively, the error path could do an explicit, manual unwind, e.g. by adding a helper to free the per-CPU structures. But the per-CPU allocations have no interactions or dependencies, i.e. can comfortably live at the end, and so converting to a manual unwind would introduce churn and code without providing any immediate advantage. Link: https://patch.msgid.link/20251016190643.80529-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-04 09:14:26 -08:00
Chao Gao	cab4098be4	KVM: x86: Call out MSR_IA32_S_CET is not handled by XSAVES Update the comment above is_xstate_managed_msr() to note that MSR_IA32_S_CET isn't saved/restored by XSAVES/XRSTORS. MSR_IA32_S_CET isn't part of CET_U/S state as the SDM states: The register state used by Control-Flow Enforcement Technology (CET) comprises the two 64-bit MSRs (IA32_U_CET and IA32_PL3_SSP) that manage CET when CPL = 3 (CET_U state); and the three 64-bit MSRs (IA32_PL0_SSP–IA32_PL2_SSP) that manage CET when CPL < 3 (CET_S state). Opportunistically shift the snippet about the safety of loading certain MSRs to the function comment for kvm_access_xstate_msr(), which is where the MSRs are actually loaded into hardware. Fixes: `e44eb58334` ("KVM: x86: Load guest FPU state when access XSAVE-managed MSRs") Signed-off-by: Chao Gao <chao.gao@intel.com> Link: https://patch.msgid.link/20251028060142.29830-1-chao.gao@intel.com [sean: shift snippet about safety to kvm_access_xstate_msr()] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-04 09:14:26 -08:00
Sean Christopherson	9bc610b6a2	KVM: x86: Harden KVM against imbalanced load/put of guest FPU state Assert, via KVM_BUG_ON(), that guest FPU state isn't/is in use when loading/putting the FPU to help detect KVM bugs without needing an assist from KASAN. If an imbalanced load/put is detected, skip the redundant load/put to avoid clobbering guest state and/or crashing the host. Note, kvm_access_xstate_msr() already provides a similar assertion. Reviewed-by: Yao Yuan <yaoyuan@linux.alibaba.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://patch.msgid.link/20251030185802.3375059-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-04 09:14:21 -08:00
Sean Christopherson	8819a49f9f	KVM: x86: Unload "FPU" state on INIT if and only if its currently in-use Replace the hack added by commit `f958bd2314` ("KVM: x86: Fix potential put_fpu() w/o load_fpu() on MPX platform") with a more robust approach of unloading+reloading guest FPU state based on whether or not the vCPU's FPU is currently in-use, i.e. currently loaded. This fixes a bug on hosts that support CET but not MPX, where kvm_arch_vcpu_ioctl_get_mpstate() neglects to load FPU state (it only checks for MPX support) and leads to KVM attempting to put FPU state due to kvm_apic_accept_events() triggering INIT emulation. E.g. on a host with CET but not MPX, syzkaller+KASAN generates: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000004: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027] CPU: 211 UID: 0 PID: 20451 Comm: syz.9.26 Tainted: G S 6.18.0-smp-DEV #7 NONE Tainted: [S]=CPU_OUT_OF_SPEC Hardware name: Google Izumi/izumi, BIOS 0.20250729.1-0 07/29/2025 RIP: 0010:fpu_swap_kvm_fpstate+0x3ce/0x610 ../arch/x86/kernel/fpu/core.c:377 RSP: 0018:ff1100410c167cc0 EFLAGS: 00010202 RAX: 0000000000000004 RBX: 0000000000000020 RCX: 00000000000001aa RDX: 00000000000001ab RSI: ffffffff817bb960 RDI: 0000000022600000 RBP: dffffc0000000000 R08: ff110040d23c8007 R09: 1fe220081a479000 R10: dffffc0000000000 R11: ffe21c081a479001 R12: ff110040d23c8d98 R13: 00000000fffdc578 R14: 0000000000000000 R15: ff110040d23c8d90 FS: 00007f86dd1876c0(0000) GS:ff11007fc969b000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f86dd186fa8 CR3: 00000040d1dfa003 CR4: 0000000000f73ef0 PKRU: 80000000 Call Trace: <TASK> kvm_vcpu_reset+0x80d/0x12c0 ../arch/x86/kvm/x86.c:11818 kvm_apic_accept_events+0x1cb/0x500 ../arch/x86/kvm/lapic.c:3489 kvm_arch_vcpu_ioctl_get_mpstate+0xd0/0x4e0 ../arch/x86/kvm/x86.c:12145 kvm_vcpu_ioctl+0x5e2/0xed0 ../virt/kvm/kvm_main.c:4539 __se_sys_ioctl+0x11d/0x1b0 ../fs/ioctl.c:51 do_syscall_x64 ../arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x6e/0x940 ../arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f86de71d9c9 </TASK> with a very simple reproducer: r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000000), 0x80b00, 0x0) r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0) ioctl$KVM_CREATE_IRQCHIP(r1, 0xae60) r2 = ioctl$KVM_CREATE_VCPU(r1, 0xae41, 0x0) ioctl$KVM_SET_IRQCHIP(r1, 0x8208ae63, ...) ioctl$KVM_GET_MP_STATE(r2, 0x8004ae98, &(0x7f00000000c0)) Alternatively, the MPX hack in GET_MP_STATE could be extended to cover CET, but from a "don't break existing functionality" perspective, that isn't any less risky than peeking at the state of in_use, and it's far less robust for a long term solution (as evidenced by this bug). Reported-by: Alexander Potapenko <glider@google.com> Fixes: `69cc3e8865` ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER") Reviewed-by: Yao Yuan <yaoyuan@linux.alibaba.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Link: https://patch.msgid.link/20251030185802.3375059-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-11-04 09:14:11 -08:00
Mario Limonciello	f1fdffe0af	x86/CPU/AMD: Add missing terminator for zen5_rdseed_microcode Running x86_match_min_microcode_rev() on a Zen5 CPU trips up KASAN for an out of bounds access. Fixes: `607b9fb2ce` ("x86/CPU/AMD: Add RDSEED fix for Zen5") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251104161007.269885-1-mario.limonciello@amd.com	2025-11-04 17:16:14 +01:00
Helge Deller	fd9f30d103	parisc: Avoid crash due to unaligned access in unwinder Guenter Roeck reported this kernel crash on his emulated B160L machine: Starting network: udhcpc: started, v1.36.1 Backtrace: [<104320d4>] unwind_once+0x1c/0x5c [<10434a00>] walk_stackframe.isra.0+0x74/0xb8 [<10434a6c>] arch_stack_walk+0x28/0x38 [<104e5efc>] stack_trace_save+0x48/0x5c [<105d1bdc>] set_track_prepare+0x44/0x6c [<105d9c80>] ___slab_alloc+0xfc4/0x1024 [<105d9d38>] __slab_alloc.isra.0+0x58/0x90 [<105dc80c>] kmem_cache_alloc_noprof+0x2ac/0x4a0 [<105b8e54>] __anon_vma_prepare+0x60/0x280 [<105a823c>] __vmf_anon_prepare+0x68/0x94 [<105a8b34>] do_wp_page+0x8cc/0xf10 [<105aad88>] handle_mm_fault+0x6c0/0xf08 [<10425568>] do_page_fault+0x110/0x440 [<10427938>] handle_interruption+0x184/0x748 [<11178398>] schedule+0x4c/0x190 BUG: spinlock recursion on CPU#0, ifconfig/2420 lock: terminate_lock.2+0x0/0x1c, .magic: dead4ead, .owner: ifconfig/2420, .owner_cpu: 0 While creating the stack trace, the unwinder uses the stack pointer to guess the previous frame to read the previous stack pointer from memory. The crash happens, because the unwinder tries to read from unaligned memory and as such triggers the unalignment trap handler which then leads to the spinlock recursion and finally to a deadlock. Fix it by checking the alignment before accessing the memory. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Helge Deller <deller@gmx.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: stable@vger.kernel.org # v6.12+	2025-11-04 12:21:59 +01:00
Yazen Ghannam	0a4b61d9c2	x86/amd_node: Fix AMD root device caching Recent AMD node rework removed the "search and count" method of caching AMD root devices. This depended on the value from a Data Fabric register that was expected to hold the PCI bus of one of the root devices attached to that fabric. However, this expectation is incorrect. The register, when read from PCI config space, returns the bitwise-OR of the buses of all attached root devices. This behavior is benign on AMD reference design boards, since the bus numbers are aligned. This results in a bitwise-OR value matching one of the buses. For example, 0x00 \| 0x40 \| 0xA0 \| 0xE0 = 0xE0. This behavior breaks on boards where the bus numbers are not exactly aligned. For example, 0x00 \| 0x07 \| 0xE0 \| 0x15 = 0x1F. The examples above are for AMD node 0. The first root device on other nodes will not be 0x00. The first root device for other nodes will depend on the total number of root devices, the system topology, and the specific PCI bus number assignment. For example, a system with 2 AMD nodes could have this: Node 0 : 0x00 0x07 0x0e 0x15 Node 1 : 0x1c 0x23 0x2a 0x31 The bus numbering style in the reference boards is not a requirement. The numbering found in other boards is not incorrect. Therefore, the root device caching method needs to be adjusted. Go back to the "search and count" method used before the recent rework. Search for root devices using PCI class code rather than fixed PCI IDs. This keeps the goal of the rework (remove dependency on PCI IDs) while being able to support various board designs. Merge helper functions to reduce code duplication. [ bp: Reflow comment. ] Fixes: `40a5f6ffdf` ("x86/amd_nb: Simplify root device search") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/all/20251028-fix-amd-root-v2-1-843e38f8be2c@amd.com	2025-11-03 12:46:57 +01:00
Linus Torvalds	e3e0141d3d	Merge tag 'x86-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Limit AMD microcode Entrysign sha256 signature checking to known CPU generations - Disable AMD RDSEED32 on certain Zen5 CPUs that have a microcode version before when the microcode-based fix was issued for the AMD-SB-7055 erratum - Fix FPU AMD XFD state synchronization on signal delivery - Fix (work around) a SSE4a-disassembly related build failure on X86_NATIVE_CPU=y builds - Extend the AMD Zen6 model space with a new range of models - Fix <asm/intel-family.h> CPU model comments - Fix the CONFIG_CFI=y and CONFIG_LTO_CLANG_FULL=y build, which was unhappy due to missing kCFI type annotations of clear_page() variants * tag 'x86-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Ensure clear_page() variants always have __kcfi_typeid_ symbols x86/cpu: Add/fix core comments for {Panther,Nova} Lake x86/CPU/AMD: Extend Zen6 model range x86/build: Disable SSE4a x86/fpu: Ensure XFD state on signal delivery x86/CPU/AMD: Add RDSEED fix for Zen5 x86/microcode/AMD: Limit Entrysign signature checking to known generations	2025-11-01 10:20:07 -07:00
Linus Torvalds	f9bc8e0912	Merge tag 'perf-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fixes from Ingo Molnar: "Miscellaneous fixes and CPU model updates: - Fix an out-of-bounds access on non-hybrid platforms in the Intel PMU DS code, reported by KASAN - Add WildcatLake PMU and uncore support: it's identical to the PantherLake version" * tag 'perf-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Add uncore PMU support for Wildcat Lake perf/x86/intel: Add PMU support for WildcatLake perf/x86/intel: Fix KASAN global-out-of-bounds warning	2025-11-01 10:17:40 -07:00
Linus Torvalds	ba36dd5ee6	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Mark migrate_disable/enable() as always_inline to avoid issues with partial inlining (Yonghong Song) - Fix powerpc stack register definition in libbpf bpf_tracing.h (Andrii Nakryiko) - Reject negative head_room in __bpf_skb_change_head (Daniel Borkmann) - Conditionally include dynptr copy kfuncs (Malin Jonsson) - Sync pending IRQ work before freeing BPF ring buffer (Noorain Eqbal) - Do not audit capability check in x86 do_jit() (Ondrej Mosnacek) - Fix arm64 JIT of BPF_ST insn when it writes into arena memory (Puranjay Mohan) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf/arm64: Fix BPF_ST into arena memory bpf: Make migrate_disable always inline to avoid partial inlining bpf: Reject negative head_room in __bpf_skb_change_head bpf: Conditionally include dynptr copy kfuncs libbpf: Fix powerpc's stack register definition in bpf_tracing.h bpf: Do not audit capability check in do_jit() bpf: Sync pending IRQ work before freeing ring buffer	2025-10-31 18:22:26 -07:00
Nathan Chancellor	9b041a4b66	x86/mm: Ensure clear_page() variants always have __kcfi_typeid_ symbols When building with CONFIG_CFI=y and CONFIG_LTO_CLANG_FULL=y, there is a series of errors from the various versions of clear_page() not having __kcfi_typeid_ symbols. $ cat kernel/configs/repro.config CONFIG_CFI=y # CONFIG_LTO_NONE is not set CONFIG_LTO_CLANG_FULL=y $ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 clean defconfig repro.config bzImage ld.lld: error: undefined symbol: __kcfi_typeid_clear_page_rep >>> referenced by ld-temp.o >>> vmlinux.o:(__cfi_clear_page_rep) ld.lld: error: undefined symbol: __kcfi_typeid_clear_page_orig >>> referenced by ld-temp.o >>> vmlinux.o:(__cfi_clear_page_orig) ld.lld: error: undefined symbol: __kcfi_typeid_clear_page_erms >>> referenced by ld-temp.o >>> vmlinux.o:(__cfi_clear_page_erms) With full LTO, it is possible for LLVM to realize that these functions never have their address taken (as they are only used within an alternative, which will make them a direct call) across the whole kernel and either drop or skip generating their kCFI type identification symbols. clear_page_{rep,orig,erms}() are defined in clear_page_64.S with SYM_TYPED_FUNC_START as a result of `2981557cb0` ("x86,kcfi: Fix EXPORT_SYMBOL vs kCFI"), as exported functions are free to be called indirectly thus need kCFI type identifiers. Use KCFI_REFERENCE with these clear_page() functions to force LLVM to see these functions as address-taken and generate then keep the kCFI type identifiers. Fixes: `2981557cb0` ("x86,kcfi: Fix EXPORT_SYMBOL vs kCFI") Closes: https://github.com/ClangBuiltLinux/linux/issues/2128 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://patch.msgid.link/20251013-x86-fix-clear_page-cfi-full-lto-errors-v1-1-d69534c0be61@kernel.org	2025-10-31 22:47:24 +01:00
Linus Torvalds	b4f7f01ea1	Merge tag 's390-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Use correct locking in zPCI event code to avoid deadlock - Get rid of irqs_registered flag in zpci_dev structure and restore IRQ unconditionally for zPCI devices. This fixes sit uations where the flag was not correctly updated - Fix potential memory leak kernel page table dumper code - Disable (revert) ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP for s390 again. The optimized hugetlb vmemmap code modifies kernel page tables in a way which does not work on s390 and leads to reproducible kernel crashes due to stale TLB entries. This needs to be addressed with some larger changes. For now simply disable the feature - Update defconfigs * tag 's390-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: Disable ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP s390/mm: Fix memory leak in add_marker() when kvrealloc() fails s390/pci: Restore IRQ unconditionally for the zPCI device s390: Update defconfigs s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump	2025-10-31 12:50:35 -07:00
Puranjay Mohan	be708ed300	bpf/arm64: Fix BPF_ST into arena memory The arm64 JIT supports BPF_ST with BPF_PROBE_MEM32 (arena) by using the tmp2 register to hold the dst + arena_vm_base value and using tmp2 as the new dst register. But this is broken because in case is_lsi_offset() returns false the tmp2 will be clobbered by emit_a64_mov_i(1, tmp2, off, ctx); and hence the emitted store instruction will be of the form: strb w10, [x11, x11] Fix this by using the third temporary register to hold the dst + arena_vm_base. Fixes: `339af577ec` ("bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.") Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251030121715.55214-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 11:20:53 -07:00
Linus Torvalds	3ad81aa520	Merge tag 'v6.18-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Fix double free in aspeed - Fix req->nbytes clobbering in s390/phmac * tag 'v6.18-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: aspeed - fix double free caused by devm crypto: s390/phmac - Do not modify the req->nbytes value	2025-10-31 07:25:10 -07:00
Sebastian Ene	103e17aac0	KVM: arm64: Check the untrusted offset in FF-A memory share Verify the offset to prevent OOB access in the hypervisor FF-A buffer in case an untrusted large enough value [U32_MAX - sizeof(struct ffa_composite_mem_region) + 1, U32_MAX] is set from the host kernel. Signed-off-by: Sebastian Ene <sebastianene@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://patch.msgid.link/20251017075710.2605118-1-sebastianene@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-30 16:14:58 +00:00
Vincent Donnefort	f71f7afd0a	KVM: arm64: Check range args for pKVM mem transitions There's currently no verification for host issued ranges in most of the pKVM memory transitions. The end boundary might therefore be subject to overflow and later checks could be evaded. Close this loophole with an additional pfn_range_is_valid() check on a per public function basis. Once this check has passed, it is safe to convert pfn and nr_pages into a phys_addr_t and a size. host_unshare_guest transition is already protected via __check_host_shared_guest(), while assert_host_shared_guest() callers are already ignoring host checks. Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Link: https://patch.msgid.link/20251016164541.3771235-1-vdonnefort@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-30 16:14:37 +00:00
Sascha Bischoff	da888524c3	KVM: arm64: vgic-v3: Trap all if no in-kernel irqchip If there is no in-kernel irqchip for a GICv3 host set all of the trap bits to block all accesses. This fixes the no-vgic-v3 selftest again. Fixes: `3193287ddf` ("KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/all/23072856-6b8c-41e2-93d1-ea8a240a7079@sirena.org.uk Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Sebastian Ott <sebott@redhat.com> Tested-by: Mark Brown <broonie@kernel.org> Link: https://patch.msgid.link/20251021094358.1963807-1-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-30 16:11:21 +00:00
Heiko Carstens	64e2f60f35	s390: Disable ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP As reported by Luiz Capitulino enabling HVO on s390 leads to reproducible crashes. The problem is that kernel page tables are modified without flushing corresponding TLB entries. Even if it looks like the empty flush_tlb_all() implementation on s390 is the problem, it is actually a different problem: on s390 it is not allowed to replace an active/valid page table entry with another valid page table entry without the detour over an invalid entry. A direct replacement may lead to random crashes and/or data corruption. In order to invalidate an entry special instructions have to be used (e.g. ipte or idte). Alternatively there are also special instructions available which allow to replace a valid entry with a different valid entry (e.g. crdte or cspg). Given that the HVO code currently does not provide the hooks to allow for an implementation which is compliant with the s390 architecture requirements, disable ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP again, which is basically a revert of the original patch which enabled it. Reported-by: Luiz Capitulino <luizcap@redhat.com> Closes: https://lore.kernel.org/all/20251028153930.37107-1-luizcap@redhat.com/ Fixes: `00a34d5a99` ("s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP") Cc: stable@vger.kernel.org Tested-by: Luiz Capitulino <luizcap@redhat.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2025-10-30 16:59:28 +01:00
Tony Luck	89216c9051	x86/cpu: Add/fix core comments for {Panther,Nova} Lake The E-core in Panther Lake is Darkmont, not Crestmont. Nova Lake is built from Coyote Cove (P-core) and Arctic Wolf (E-core). Fixes: `43bb700cff` ("x86/cpu: Update Intel Family comments") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20251028172948.6721-1-tony.luck@intel.com	2025-10-30 11:34:02 +01:00
Borislav Petkov (AMD)	847ebc4476	x86/CPU/AMD: Extend Zen6 model range Add some more Zen6 models. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251029123056.19987-1-bp@kernel.org	2025-10-30 11:33:55 +01:00
Nathan Chancellor	1ed9e6b100	ARM: Select ARCH_USES_CFI_GENERIC_LLVM_PASS Prior to clang 22.0.0 [1], ARM did not have an architecture specific kCFI bundle lowering in the backend, which may cause issues. Select CONFIG_ARCH_USES_CFI_GENERIC_LLVM_PASS to enable use of __nocfi_generic. Link: `d130f40264` [1] Link: https://github.com/ClangBuiltLinux/linux/issues/2124 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20251025-idpf-fix-arm-kcfi-build-error-v1-2-ec57221153ae@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>	2025-10-29 20:04:55 -07:00
Nathan Chancellor	39c89ee6e9	compiler_types: Introduce __nocfi_generic There are two different ways that LLVM can expand kCFI operand bundles in LLVM IR: generically in the middle end or using an architecture specific sequence when lowering LLVM IR to machine code in the backend. The generic pass allows any architecture to take advantage of kCFI but the expansion of these bundles in the middle end can mess with optimizations that may turn indirect calls into direct calls when the call target is known at compile time, such as after inlining. Add __nocfi_generic, dependent on an architecture selecting CONFIG_ARCH_USES_CFI_GENERIC_LLVM_PASS, to disable kCFI bundle generation in functions where only the generic kCFI pass may cause problems. Link: https://github.com/ClangBuiltLinux/linux/issues/2124 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20251025-idpf-fix-arm-kcfi-build-error-v1-1-ec57221153ae@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>	2025-10-29 20:04:55 -07:00
Miaoqian Lin	07ad45e06b	s390/mm: Fix memory leak in add_marker() when kvrealloc() fails The function has a memory leak when kvrealloc() fails. The function directly assigns NULL to the markers pointer, losing the reference to the previously allocated memory. This causes kvfree() in pt_dump_init() to free NULL instead of the leaked memory. Fix by: 1. Using kvrealloc() uniformly for all allocations 2. Using a temporary variable to preserve the original pointer until allocation succeeds 3. Removing the error path that sets markers_cnt=0 to keep consistency between markers and markers_cnt Found via static analysis and this is similar to commit `42378a9ca5` ("bpf, verifier: Fix memory leak in array reallocation for stack state") Fixes: `d0e7915d2a` ("s390/mm/ptdump: Generate address marker array dynamically") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2025-10-29 14:17:50 +01:00
dongsheng	f4c12e5cef	perf/x86/intel/uncore: Add uncore PMU support for Wildcat Lake WildcatLake (WCL) is a variant of PantherLake (PTL) and shares the same uncore PMU features with PTL. Therefore, directly reuse Pantherlake's uncore PMU enabling code for WildcatLake. Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20250908061639.938105-2-dapeng1.mi@linux.intel.com	2025-10-29 11:31:44 +01:00
Dapeng Mi	b796a8feb7	perf/x86/intel: Add PMU support for WildcatLake WildcatLake is a variant of PantherLake and shares same PMU features, so directly reuse Pantherlake's code to enable PMU features for WildcatLake. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Zide Chen <zide.chen@intel.com> Link: https://patch.msgid.link/20250908061639.938105-1-dapeng1.mi@linux.intel.com	2025-10-29 11:31:44 +01:00
Dapeng Mi	0ba6502ce1	perf/x86/intel: Fix KASAN global-out-of-bounds warning When running "perf mem record" command on CWF, the below KASAN global-out-of-bounds warning is seen. ================================================================== BUG: KASAN: global-out-of-bounds in cmt_latency_data+0x176/0x1b0 Read of size 4 at addr ffffffffb721d000 by task dtlb/9850 Call Trace: kasan_report+0xb8/0xf0 cmt_latency_data+0x176/0x1b0 setup_arch_pebs_sample_data+0xf49/0x2560 intel_pmu_drain_arch_pebs+0x577/0xb00 handle_pmi_common+0x6c4/0xc80 The issue is caused by below code in __grt_latency_data(). The code tries to access x86_hybrid_pmu structure which doesn't exist on non-hybrid platform like CWF. WARN_ON_ONCE(hybrid_pmu(event->pmu)->pmu_type == hybrid_big) So add is_hybrid() check before calling this WARN_ON_ONCE to fix the global-out-of-bounds access issue. Fixes: `090262439f` ("perf/x86/intel: Rename model-specific pebs_latency_data functions") Reported-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Zide Chen <zide.chen@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251028064214.1451968-1-dapeng1.mi@linux.intel.com	2025-10-29 10:29:52 +01:00
Peter Zijlstra	0d6e9ec80c	x86/build: Disable SSE4a Leyvi Rose reported that his X86_NATIVE_CPU=y build is failing because our instruction decoder doesn't support SSE4a and the AMDGPU code seems to be generating those with his compiler of choice (CLANG+LTO). Now, our normal build flags disable SSE MMX SSE2 3DNOW AVX, but then CC_FLAGS_FPU re-enable SSE SSE2. Since nothing mentions SSE3 or SSE4, I'm assuming that -msse (or its negative) control all SSE variants -- but why then explicitly enumerate SSE2 ? Anyway, until the instruction decoder gets fixed, explicitly disallow SSE4a (an AMD specific SSE4 extension). Fixes: `ea1dcca1de` ("x86/kbuild/64: Add the CONFIG_X86_NATIVE_CPU option to locally optimize the kernel with '-march=native'") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Arisu Tachibana <arisu.tachibana@miraclelinux.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Cc: <stable@kernel.org>	2025-10-28 20:43:36 +01:00
Chang S. Bae	388eff894d	x86/fpu: Ensure XFD state on signal delivery Sean reported [1] the following splat when running KVM tests: WARNING: CPU: 232 PID: 15391 at xfd_validate_state+0x65/0x70 Call Trace: <TASK> fpu__clear_user_states+0x9c/0x100 arch_do_signal_or_restart+0x142/0x210 exit_to_user_mode_loop+0x55/0x100 do_syscall_64+0x205/0x2c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Chao further identified [2] a reproducible scenario involving signal delivery: a non-AMX task is preempted by an AMX-enabled task which modifies the XFD MSR. When the non-AMX task resumes and reloads XSTATE with init values, a warning is triggered due to a mismatch between fpstate::xfd and the CPU's current XFD state. fpu__clear_user_states() does not currently re-synchronize the XFD state after such preemption. Invoke xfd_update_state() which detects and corrects the mismatch if there is a dynamic feature. This also benefits the sigreturn path, as fpu__restore_sig() may call fpu__clear_user_states() when the sigframe is inaccessible. [ dhansen: minor changelog munging ] Closes: https://lore.kernel.org/lkml/aDCo_SczQOUaB2rS@google.com [1] Fixes: `672365477a` ("x86/fpu: Update XFD state where required") Reported-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Tested-by: Chao Gao <chao.gao@intel.com> Link: https://lore.kernel.org/all/aDWbctO%2FRfTGiCg3@intel.com [2] Cc:stable@vger.kernel.org Link: https://patch.msgid.link/20250610001700.4097-1-chang.seok.bae%40intel.com	2025-10-28 12:10:59 -07:00
Gregory Price	607b9fb2ce	x86/CPU/AMD: Add RDSEED fix for Zen5 There's an issue with RDSEED's 16-bit and 32-bit register output variants on Zen5 which return a random value of 0 "at a rate inconsistent with randomness while incorrectly signaling success (CF=1)". Search the web for AMD-SB-7055 for more detail. Add a fix glue which checks microcode revisions. [ bp: Add microcode revisions checking, rewrite. ] Cc: stable@vger.kernel.org Signed-off-by: Gregory Price <gourry@gourry.net> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20251018024010.4112396-1-gourry@gourry.net	2025-10-28 12:37:49 +01:00

1 2 3 4 5 ...

238443 Commits