linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-22 12:05:09 -04:00

Author	SHA1	Message	Date
Marek Vasut	67934f248e	arm64: dts: imx95: Describe Mali G310 GPU The instance of the GPU populated in i.MX95 is the G310, describe this GPU in the DT. Include dummy GPU voltage regulator and OPP tables. Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Link: https://patch.msgid.link/20251102160927.45157-2-marek.vasut@mailbox.org Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>	2025-11-03 14:25:22 +00:00
Raphael Gallais-Pou	a66e078c33	ARM: dts: sti: remove useless cells fields tvout node do not need the cells fields. Remove them. Signed-off-by: Raphael Gallais-Pou <rgallaispou@gmail.com> Acked-by: Patrice Chotard <patrice.chotard@foss.st.com> Acked-by: Alain Volmat <alain.volmat@foss.st.com> Link: https://patch.msgid.link/20250717-sti-rework-v1-4-46d516fb1ebb@gmail.com Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com>	2025-10-30 10:30:10 +01:00
Raphael Gallais-Pou	7891011fbc	ARM: dts: sti: extract display subsystem out of soc The display subsystem represent how IPs are interacting together and have nothing to do within the SoC node. Extract it from the SoC node and let IPs nodes in the Soc node. Several nodes did not use conventional name: * sti-display-subsystem -> display-subsystem * sti-controller -> display-controller * sti-tvout -> encoder * sti-hda -> analog * sti-hqvdp -> plane Signed-off-by: Raphael Gallais-Pou <rgallaispou@gmail.com> Acked-by: Patrice Chotard <patrice.chotard@foss.st.com> Acked-by: Alain Volmat <alain.volmat@foss.st.com> Link: https://patch.msgid.link/20250717-sti-rework-v1-3-46d516fb1ebb@gmail.com Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com>	2025-10-30 10:30:10 +01:00
Linus Torvalds	c7864eeaa4	Merge tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Reset the why-the-system-rebooted register on AMD to avoid stale bits remaining from previous boots - Add a missing barrier in the TLB flushing code to prevent erroneously not flushing a TLB generation - Make sure cpa_flush() does not overshoot when computing the end range of a flush region - Fix resctrl bandwidth counting on AMD systems when the amount of monitoring groups created exceeds the number the hardware can track * tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Prevent reset reasons from being retained across reboot x86/mm: Fix SMP ordering in switch_mm_irqs_off() x86/mm: Fix overflow in __cpa_addr() x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID	2025-10-19 04:41:27 -10:00
Linus Torvalds	02e5f74ef0	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "ARM: - Fix the handling of ZCR_EL2 in NV VMs - Pick the correct translation regime when doing a PTW on the back of a SEA - Prevent userspace from injecting an event into a vcpu that isn't initialised yet - Move timer save/restore to the sysreg handling code, fixing EL2 timer access in the process - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug - Fix trapping configuration when the host isn't GICv3 - Improve the detection of HCR_EL2.E2H being RES1 - Drop a spurious 'break' statement in the S1 PTW - Don't try to access SPE when owned by EL3 Documentation updates: - Document the failure modes of event injection - Document that a GICv3 guest can be created on a GICv5 host with FEAT_GCIE_LEGACY Selftest improvements: - Add a selftest for the effective value of HCR_EL2.AMO - Address build warning in the timer selftest when building with clang - Teach irqfd selftests about non-x86 architectures - Add missing sysregs to the set_id_regs selftest - Fix vcpu allocation in the vgic_lpi_stress selftest - Correctly enable interrupts in the vgic_lpi_stress selftest x86: - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the bug fixed by commit `3ccbf6f470` ("KVM: x86/mmu: Return -EAGAIN if userspace deletes/moves memslot during prefault") - Don't try to get PMU capabilities from perf when running a CPU with hybrid CPUs/PMUs, as perf will rightly WARN. guest_memfd: - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more generic KVM_CAP_GUEST_MEMFD_FLAGS - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set said flag to initialize memory as SHARED, irrespective of MMAP. The behavior merged in 6.18 is that enabling mmap() implicitly initializes memory as SHARED, which would result in an ABI collision for x86 CoCo VMs as their memory is currently always initialized PRIVATE. - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private memory, to enable testing such setups, i.e. to hopefully flush out any other lurking ABI issues before 6.18 is officially released. - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP, and host userspace accesses to mmap()'d private memory" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits) arm64: Revamp HCR_EL2.E2H RES1 detection KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available KVM: arm64: Compute per-vCPU FGTs at vcpu_load() KVM: arm64: selftests: Fix misleading comment about virtual timer encoding KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit KVM: arm64: Kill leftovers of ad-hoc timer userspace access KVM: arm64: Fix WFxT handling of nested virt KVM: arm64: Move CNTCT_EL0 userspace accessors to generic infrastructure KVM: arm64: Move CNT_CVAL_EL0 userspace accessors to generic infrastructure KVM: arm64: Move CNT_CTL_EL0 userspace accessors to generic infrastructure KVM: arm64: Add timer UAPI workaround to sysreg infrastructure KVM: arm64: Make timer_set_offset() generally accessible KVM: arm64: Replace timer context vcpu pointer with timer_id KVM: arm64: Introduce timer_context_to_vcpu() helper KVM: arm64: Hide CNTHV__EL2 from userspace for nVHE guests Documentation: KVM: Update GICv3 docs for GICv5 hosts KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress KVM: arm64: selftests: Allocate vcpus with correct size ...	2025-10-18 07:07:14 -10:00
Linus Torvalds	0e622c4b0e	Merge tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Madhavan Srinivasan: - Fix to handle NULL pointer dereference at irq domain teardown - Fix for handling extraction of struct xive_irq_data - Fix to skip parameter area allocation when fadump disabled Thanks to Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM), Sourabh Jain, and Venkat Rao Bagalkote, * tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/fadump: skip parameter area allocation when fadump is disabled powerpc, ocxl: Fix extraction of struct xive_irq_data powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown	2025-10-18 07:02:28 -10:00
Paolo Bonzini	4361f5aa8b	Merge tag 'kvm-x86-fixes-6.18-rc2' of https://github.com/kvm-x86/linux into HEAD KVM x86 fixes for 6.18: - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the bug fixed by commit `3ccbf6f470` ("KVM: x86/mmu: Return -EAGAIN if userspace deletes/moves memslot during prefault") - Don't try to get PMU capabbilities from perf when running a CPU with hybrid CPUs/PMUs, as perf will rightly WARN. - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more generic KVM_CAP_GUEST_MEMFD_FLAGS - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set said flag to initialize memory as SHARED, irrespective of MMAP. The behavior merged in 6.18 is that enabling mmap() implicitly initializes memory as SHARED, which would result in an ABI collision for x86 CoCo VMs as their memory is currently always initialized PRIVATE. - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private memory, to enable testing such setups, i.e. to hopefully flush out any other lurking ABI issues before 6.18 is officially released. - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP, and host userspace accesses to mmap()'d private memory.	2025-10-18 10:25:43 +02:00
Linus Torvalds	f406055cb1	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Explicitly encode the XZR register if the value passed to write_sysreg_s() is 0. The GIC CDEOI instruction is encoded as a system register write with XZR as the source register. However, clang does not honour the "Z" register constraint, leading to incorrect code generation - Ensure the interrupts (DAIF.IF) are unmasked when completing single-step of a suspended breakpoint before calling exit_to_user_mode(). With pseudo-NMIs, interrupts are (additionally) masked at the PMR_EL1 register, handled by local_irq_() tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: debug: always unmask interrupts in el0_softstp() arm64/sysreg: Fix GIC CDEOI instruction encoding	2025-10-17 13:04:21 -10:00
Linus Torvalds	fe69107ec7	Merge tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - Disable CFI with Rust for any platform other than x86 and ARM64 - Keep task mm_cpumasks up-to-date to avoid triggering M-mode firmware warnings if the kernel tries to send an IPI to an offline CPU - Improve kprobe address validation performance and avoid desyncs (following x86) - Avoid duplicate device probes by avoiding DT hardware probing when ACPI is enabled in early boot - Use the correct set of dependencies for CONFIG_ARCH_HAS_ELF_CORE_EFLAGS, avoiding an allnoconfig warning - Fix a few other minor issues * tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: kprobes: convert one final __ASSEMBLY__ to __ASSEMBLER__ riscv: Respect dependencies of ARCH_HAS_ELF_CORE_EFLAGS riscv: acpi: avoid errors caused by probing DT devices when ACPI is used riscv: kprobes: Fix probe address validation riscv: entry: fix typo in comment 'instruciton' -> 'instruction' RISC-V: clear hot-unplugged cores from all task mm_cpumasks to avoid rfence errors riscv: kgdb: Ensure that BUFMAX > NUMREGBYTES rust: cfi: only 64-bit arm and x86 support CFI_CLANG	2025-10-17 12:59:31 -10:00
Ada Couprie Diaz	ea0d55ae4b	arm64: debug: always unmask interrupts in el0_softstp() We intend that EL0 exception handlers unmask all DAIF exceptions before calling exit_to_user_mode(). When completing single-step of a suspended breakpoint, we do not call local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(), leaving all DAIF exceptions masked. When pseudo-NMIs are not in use this is benign. When pseudo-NMIs are in use, this is unsound. At this point interrupts are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag manipulation may not work correctly. For example, a subsequent local_irq_enable() within exit_to_user_mode_loop() will only unmask interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and anything depending on interrupts being unmasked (e.g. delivery of signals) will not work correctly. This was detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING. Move the call to `try_step_suspended_breakpoints()` outside of the check so that interrupts can be unmasked even if we don't call the step handler. Fixes: `0ac7584c08` ("arm64: debug: split single stepping exception entry") Cc: <stable@vger.kernel.org> # 6.17 Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> [catalin.marinas@arm.com: added Mark's rewritten commit log and some whitespace] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-10-17 18:08:05 +01:00
Lorenzo Pieralisi	e9ad390a48	arm64/sysreg: Fix GIC CDEOI instruction encoding The GIC CDEOI system instruction requires the Rt field to be set to 0b11111 otherwise the instruction behaviour becomes CONSTRAINED UNPREDICTABLE. Currenly, its usage is encoded as a system register write, with a constant 0 value: write_sysreg_s(0, GICV5_OP_GIC_CDEOI) While compiling with GCC, the 0 constant value, through these asm constraints and modifiers ('x' modifier and 'Z' constraint combo): asm volatile(__msr_s(r, "%x0") : : "rZ" (__val)); forces the compiler to issue the XZR register for the MSR operation (ie that corresponds to Rt == 0b11111) issuing the right instruction encoding. Unfortunately LLVM does not yet understand that modifier/constraint combo so it ends up issuing a different register from XZR for the MSR source, which in turns means that it encodes the GIC CDEOI instruction wrongly and the instruction behaviour becomes CONSTRAINED UNPREDICTABLE that we must prevent. Add a conditional to write_sysreg_s() macro that detects whether it is passed a constant 0 value and issues an MSR write with XZR as source register - explicitly doing what the asm modifier/constraint is meant to achieve through constraints/modifiers, fixing the LLVM compilation issue. Fixes: `7ec80fb3f0` ("irqchip/gic-v5: Add GICv5 PPI support") Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Cc: Sascha Bischoff <sascha.bischoff@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-10-17 17:55:01 +01:00
Rong Zhang	e6416c2dfe	x86/CPU/AMD: Prevent reset reasons from being retained across reboot The S5_RESET_STATUS register is parsed on boot and printed to kmsg. However, this could sometimes be misleading and lead to users wasting a lot of time on meaningless debugging for two reasons: * Some bits are never cleared by hardware. It's the software's responsibility to clear them as per the Processor Programming Reference (see [1]). * Some rare hardware-initiated platform resets do not update the register at all. In both cases, a previous reboot could leave its trace in the register, resulting in users seeing unrelated reboot reasons while debugging random reboots afterward. Write the read value back to the register in order to clear all reason bits since they are write-1-to-clear while the others must be preserved. [1]: https://bugzilla.kernel.org/show_bug.cgi?id=206537#attach_303991 [ bp: Massage commit message. ] Fixes: `ab81310287` ("x86/CPU/AMD: Print the reason for the last reset") Signed-off-by: Rong Zhang <i@rong.moe> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/all/20250913144245.23237-1-i@rong.moe/	2025-10-15 21:38:06 +02:00
Linus Torvalds	1f4a222b0e	Remove long-stale ext3 defconfig option Inspired by commit `c065b6046b` ("Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs") I looked around for any other left-over EXT3 config options, and found some old defconfig files still mentioned CONFIG_EXT3_DEFAULTS_TO_ORDERED. That config option was removed a decade ago in commit `c290ea01ab` ("fs: Remove ext3 filesystem driver"). It had a good run, but let's remove it for good. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-10-15 07:57:28 -07:00
Linus Torvalds	66f8e4df00	Merge tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: - Fix regression caused by removing CONFIG_EXT3_FS when testing some very old defconfigs - Avoid a BUG_ON when opening a file on a maliciously corrupted file system - Avoid mm warnings when freeing a very large orphan file metadata - Avoid a theoretical races between metadata writeback and checkpoints (it's very hard to hit in practice, since the race requires that the writeback take a very long time) * tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs ext4: free orphan info with kvfree ext4: detect invalid INLINE_DATA + EXTENTS flag combination ext4, doc: fix and improve directory hash tree description ext4: wait for ongoing I/O to complete before freeing blocks jbd2: ensure that all ongoing I/O complete before freeing blocks	2025-10-15 07:51:57 -07:00
Marc Zyngier	ca88ecdce5	arm64: Revamp HCR_EL2.E2H RES1 detection We currently have two ways to identify CPUs that only implement FEAT_VHE and not FEAT_E2H0: - either they advertise it via ID_AA64MMFR4_EL1.E2H0, - or the HCR_EL2.E2H bit is RAO/WI However, there is a third category of "cpus" that fall between these two cases: on CPUs that do not implement FEAT_FGT, it is IMPDEF whether an access to ID_AA64MMFR4_EL1 can trap to EL2 when the register value is zero. A consequence of this is that on systems such as Neoverse V2, a NV guest cannot reliably detect that it is in a VHE-only configuration (E2H is writable, and ID_AA64MMFR0_EL1 is 0), despite the hypervisor's best effort to repaint the id register. Replace the RAO/WI test by a sequence that makes use of the VHE register remnapping between EL1 and EL2 to detect this situation, and work out whether we get the VHE behaviour even after having set HCR_EL2.E2H to 0. This solves the NV problem, and provides a more reliable acid test for CPUs that do not completely follow the letter of the architecture while providing a RES1 behaviour for HCR_EL2.E2H. Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Tested-by: Jan Kotas <jank@cadence.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/15A85F2B-1A0C-4FA7-9FE4-EEC2203CC09E@global.cadence.com	2025-10-14 08:18:40 +01:00
Theodore Ts'o	c065b6046b	Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs Commit `d6ace46c82` ("ext4: remove obsolete EXT3 config options") removed the obsolete EXT3_CONFIG options, since it had been over a decade since fs/ext3 had been removed. Unfortunately, there were a number of defconfigs that still used CONFIG_EXT3_FS which the cleanup commit didn't fix up. This led to a large number of defconfig test builds to fail. Oops. Fixes: `d6ace46c82` ("ext4: remove obsolete EXT3 config options") Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2025-10-13 21:50:40 -04:00
Ingo Molnar	83b0177a6c	x86/mm: Fix SMP ordering in switch_mm_irqs_off() Stephen noted that it is possible to not have an smp_mb() between the loaded_mm store and the tlb_gen load in switch_mm(), meaning the ordering against flush_tlb_mm_range() goes out the window, and it becomes possible for switch_mm() to not observe a recent tlb_gen update and fail to flush the TLBs. [ dhansen: merge conflict fixed by Ingo ] Fixes: `209954cbc7` ("x86/mm/tlb: Update mm_cpumask lazily") Reported-by: Stephen Dolan <sdolan@janestreet.com> Closes: https://lore.kernel.org/all/CAHDw0oGd0B4=uuv8NGqbUQ_ZVmSheU2bN70e4QhFXWvuAZdt2w@mail.gmail.com/ Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>	2025-10-13 13:55:53 -07:00
Rik van Riel	f25785f9b0	x86/mm: Fix overflow in __cpa_addr() The change to have cpa_flush() call flush_kernel_pages() introduced a bug where __cpa_addr() can access an address one larger than the largest one in the cpa->pages array. KASAN reports the issue like this: BUG: KASAN: slab-out-of-bounds in __cpa_addr arch/x86/mm/pat/set_memory.c:309 [inline] BUG: KASAN: slab-out-of-bounds in __cpa_addr+0x1d3/0x220 arch/x86/mm/pat/set_memory.c:306 Read of size 8 at addr ffff88801f75e8f8 by task syz.0.17/5978 This bug could cause cpa_flush() to not properly flush memory, which somehow never showed any symptoms in my tests, possibly because cpa_flush() is called so rarely, but could potentially cause issues for other people. Fix the issue by directly calculating the flush end address from the start address. Fixes: `86e6815b31` ("x86/mm: Change cpa_flush() to call flush_kernel_range() directly") Reported-by: syzbot+afec6555eef563c66c97@syzkaller.appspotmail.com Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kiryl Shutsemau <kas@kernel.org> Link: https://lore.kernel.org/all/68e2ff90.050a0220.2c17c1.0038.GAE@google.com/	2025-10-13 13:55:48 -07:00
Babu Moger	15292f1b4c	x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID Users can create as many monitoring groups as the number of RMIDs supported by the hardware. However, on AMD systems, only a limited number of RMIDs are guaranteed to be actively tracked by the hardware. RMIDs that exceed this limit are placed in an "Unavailable" state. When a bandwidth counter is read for such an RMID, the hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable remains set on first read after tracking re-starts and is clear on all subsequent reads as long as the RMID is tracked. resctrl miscounts the bandwidth events after an RMID transitions from the "Unavailable" state back to being tracked. This happens because when the hardware starts counting again after resetting the counter to zero, resctrl in turn compares the new count against the counter value stored from the previous time the RMID was tracked. This results in resctrl computing an event value that is either undercounting (when new counter is more than stored counter) or a mistaken overflow (when new counter is less than stored counter). Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to zero whenever the RMID is in the "Unavailable" state to ensure accurate counting after the RMID resets to zero when it starts to be tracked again. Example scenario that results in mistaken overflow ================================================== 1. The resctrl filesystem is mounted, and a task is assigned to a monitoring group. $mount -t resctrl resctrl /sys/fs/resctrl $mkdir /sys/fs/resctrl/mon_groups/test1/ $echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_/mbm_total_bytes 21323 <- Total bytes on domain 0 "Unavailable" <- Total bytes on domain 1 Task is running on domain 0. Counter on domain 1 is "Unavailable". 2. The task runs on domain 0 for a while and then moves to domain 1. The counter starts incrementing on domain 1. $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_/mbm_total_bytes 7345357 <- Total bytes on domain 0 4545 <- Total bytes on domain 1 3. At some point, the RMID in domain 0 transitions to the "Unavailable" state because the task is no longer executing in that domain. $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_/mbm_total_bytes "Unavailable" <- Total bytes on domain 0 434341 <- Total bytes on domain 1 4. Since the task continues to migrate between domains, it may eventually return to domain 0. $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_/mbm_total_bytes 17592178699059 <- Overflow on domain 0 3232332 <- Total bytes on domain 1 In this case, the RMID on domain 0 transitions from "Unavailable" state to active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when the counter is read and begins tracking the RMID counting from 0. Subsequent reads succeed but return a value smaller than the previously saved MSR value (7345357). Consequently, the resctrl's overflow logic is triggered, it compares the previous value (7345357) with the new, smaller value and incorrectly interprets this as a counter overflow, adding a large delta. In reality, this is a false positive: the counter did not overflow but was simply reset when the RMID transitioned from "Unavailable" back to active state. Here is the text from APM [1] available from [2]. "In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the first QM_CTR read when it begins tracking an RMID that it was not previously tracking. The U bit will be zero for all subsequent reads from that RMID while it is still tracked by the hardware. Therefore, a QM_CTR read with the U bit set when that RMID is in use by a processor can be considered 0 when calculating the difference with a subsequent read." [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory Bandwidth (MBM). [ bp: Split commit message into smaller paragraph chunks for better consumption. ] Fixes: `4d05bf71f1` ("x86/resctrl: Introduce AMD QOS feature") Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org # needs adjustments for <= v6.17 Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]	2025-10-13 21:24:39 +02:00
Oliver Upton	e0b5a7967d	KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available Marc reports that the performance of running an L3 guest has regressed by 60% as a result of setting MDCR_EL2.TDA to hide bad architecture. That's of course terrible for the single user of recursive NV ;-) While there's nothing to be done on non-FGT systems, take advantage of the precise write trap of MDSCR_EL1 and leave the rest of the debug registers untrapped. Reported-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:44:37 +01:00
Oliver Upton	fb10ddf35c	KVM: arm64: Compute per-vCPU FGTs at vcpu_load() To date KVM has used the fine-grained traps for the sake of UNDEF enforcement (so-called FGUs), meaning the constituent parts could be computed on a per-VM basis and folded into the effective value when programmed. Prepare for traps changing based on the vCPU context by computing the whole mess of them at vcpu_load(). Aggressively inline all the helpers to preserve the build-time checks that were there before. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:44:37 +01:00
Marc Zyngier	386aac77da	KVM: arm64: Kill leftovers of ad-hoc timer userspace access Now that the whole timer infrastructure is handled as system register accesses, get rid of the now unused ad-hoc infrastructure. Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:41 +01:00
Marc Zyngier	892f7c38ba	KVM: arm64: Fix WFxT handling of nested virt The spec for WFxT indicates that the parameter to the WFxT instruction is relative to the reading of CNTVCT_EL0. This means that the implementation needs to take the execution context into account, as CNTVOFF_EL2 does not always affect readings of CNTVCT_EL0 (such as when HCR_EL2.E2H is 1 and that we're in host context). This also rids us of the last instance of KVM_REG_ARM_TIMER_CNT outside of the userspace interaction code. Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:41 +01:00
Marc Zyngier	c3be3a48fb	KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure Moving the counter registers is a bit more involved than for the control and comparator (there is no shadow data for the counter), but still pretty manageable. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:41 +01:00
Marc Zyngier	8af198980e	KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure As for the control registers, move the comparator registers to the common infrastructure. Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:41 +01:00
Marc Zyngier	09424d5d7d	KVM: arm64: Move CNT_CTL_EL0 userspace accessors to generic infrastructure Remove the handling of CNT_CTL_EL0 from guest.c, and move it to sys_regs.c, using a new TIMER_REG() definition to encapsulate it. Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:40 +01:00
Marc Zyngier	77a0c42eaf	KVM: arm64: Add timer UAPI workaround to sysreg infrastructure Amongst the numerous bugs that plague the KVM/arm64 UAPI, one of the most annoying thing is that the userspace view of the virtual timer has its CVAL and CNT encodings swapped. In order to reduce the amount of code that has to know about this, start by adding handling for this bug in the sys_reg code. Nothing is making use of it yet, as the code responsible for userspace interaction is catching the accesses early. Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:40 +01:00
Marc Zyngier	a92d552266	KVM: arm64: Make timer_set_offset() generally accessible Move the timer_set_offset() helper to arm_arch_timer.h, so that it is next to timer_get_offset(), and accessible by the rest of KVM. Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:40 +01:00
Marc Zyngier	8625a670af	KVM: arm64: Replace timer context vcpu pointer with timer_id Having to follow a pointer to a vcpu is pretty dumb, when the timers are are a fixed offset in the vcpu structure itself. Trade the vcpu pointer for a timer_id, which can then be used to compute the vcpu address as needed. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:40 +01:00
Marc Zyngier	aa68975c97	KVM: arm64: Introduce timer_context_to_vcpu() helper We currently have a vcpu pointer nested into each timer context. As we are about to remove this pointer, introduce a helper (aptly named timer_context_to_vcpu()) that returns this pointer, at least until we repaint the data structure. Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:40 +01:00
Marc Zyngier	4cab5c857d	KVM: arm64: Hide CNTHV__EL2 from userspace for nVHE guests Although we correctly UNDEF any CNTHV__EL2 access from the guest when E2H==0, we still expose these registers to userspace, which is a bad idea. Drop the ad-hoc UNDEF injection and switch to a .visibility() callback which will also hide the register from userspace. Fixes: `0e45981028` ("KVM: arm64: timer: Don't adjust the EL2 virtual timer offset") Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:42:40 +01:00
Sascha Bischoff	3193287ddf	KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests The ICH_HCR_EL2 traps are used when running on GICv3 hardware, or when running a GICv3-based guest using FEAT_GCIE_LEGACY on GICv5 hardware. When running a GICv2 guest on GICv3 hardware the traps are used to ensure that the guest never sees any part of GICv3 (only GICv2 is visible to the guest), and when running a GICv3 guest they are used to trap in specific scenarios. They are not applicable for a GICv2-native guest, and won't be applicable for a(n upcoming) GICv5 guest. The traps themselves are configured in the vGIC CPU IF state, which is stored as a union. Updating the wrong aperture of the union risks corrupting state, and therefore needs to be avoided at all costs. Bail early if we're not running a compatible guest (GICv2 on GICv3 hardware, GICv3 native, GICv3 on GICv5 hardware). Trap everything unconditionally if we're running a GICv2 guest on GICv3 hardware. Otherwise, conditionally set up GICv3-native trapping. Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:40:33 +01:00
Mukesh Ojha	c35dd83866	KVM: arm64: Guard PMSCR_EL1 initialization with SPE presence check Commit `efad60e460` ("KVM: arm64: Initialize PMSCR_EL1 when in VHE") does not perform sufficient check before initializing PMSCR_EL1 to 0 when running in VHE mode. On some platforms, this causes the system to hang during boot, as EL3 has not delegated access to the Profiling Buffer to the Non-secure world, nor does it reinject an UNDEF on sysreg trap. To avoid this issue, restrict the PMSCR_EL1 initialization to CPUs that support Statistical Profiling Extension (FEAT_SPE) and have the Profiling Buffer accessible in Non-secure EL1. This is determined via a new helper `cpu_has_spe()` which checks both PMSVer and PMBIDR_EL1.P. This ensures the initialization only affects CPUs where SPE is implemented and usable, preventing boot failures on platforms where SPE is not properly configured. Fixes: `efad60e460` ("KVM: arm64: Initialize PMSCR_EL1 when in VHE") Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:26:36 +01:00
Osama Abdelkader	05a02490fa	KVM: arm64: Remove unreachable break after return Remove an unnecessary 'break' statement that follows a 'return' in arch/arm64/kvm/at.c. The break is unreachable. Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:17:03 +01:00
Oliver Upton	0aa1b76fe1	KVM: arm64: Prevent access to vCPU events before init Another day, another syzkaller bug. KVM erroneously allows userspace to pend vCPU events for a vCPU that hasn't been initialized yet, leading to KVM interpreting a bunch of uninitialized garbage for routing / injecting the exception. In one case the injection code and the hyp disagree on whether the vCPU has a 32bit EL1 and put the vCPU into an illegal mode for AArch64, tripping the BUG() in exception_target_el() during the next injection: kernel BUG at arch/arm64/kvm/inject_fault.c:40! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP CPU: 3 UID: 0 PID: 318 Comm: repro Not tainted 6.17.0-rc4-00104-g10fd0285305d #6 PREEMPT Hardware name: linux,dummy-virt (DT) pstate: 21402009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : exception_target_el+0x88/0x8c lr : pend_serror_exception+0x18/0x13c sp : ffff800082f03a10 x29: ffff800082f03a10 x28: ffff0000cb132280 x27: 0000000000000000 x26: 0000000000000000 x25: ffff0000c2a99c20 x24: 0000000000000000 x23: 0000000000008000 x22: 0000000000000002 x21: 0000000000000004 x20: 0000000000008000 x19: ffff0000c2a99c20 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 00000000200000c0 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : ffff800082f03af8 x7 : 0000000000000000 x6 : 0000000000000000 x5 : ffff800080f621f0 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 000000000040009b x1 : 0000000000000003 x0 : ffff0000c2a99c20 Call trace: exception_target_el+0x88/0x8c (P) kvm_inject_serror_esr+0x40/0x3b4 __kvm_arm_vcpu_set_events+0xf0/0x100 kvm_arch_vcpu_ioctl+0x180/0x9d4 kvm_vcpu_ioctl+0x60c/0x9f4 __arm64_sys_ioctl+0xac/0x104 invoke_syscall+0x48/0x110 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x34/0xf0 el0t_64_sync_handler+0xa0/0xe4 el0t_64_sync+0x198/0x19c Code: f946bc01 b4fffe61 9101e020 17fffff2 (d4210000) Reject the ioctls outright as no sane VMM would call these before KVM_ARM_VCPU_INIT anyway. Even if it did the exception would've been thrown away by the eventual reset of the vCPU's state. Cc: stable@vger.kernel.org # 6.17 Fixes: `b7b27facc7` ("arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:17:03 +01:00
Oliver Upton	a46c09b382	KVM: arm64: Use the in-context stage-1 in __kvm_find_s1_desc_level() Running the external_aborts selftest at EL2 leads to an ugly splat due to the stage-1 MMU being disabled for the walked context, owing to the fact that __kvm_find_s1_desc_level() is hardcoded to the EL1&0 regime. Select the appropriate translation regime for the stage-1 walk based on the current vCPU context. Fixes: `b8e625167a` ("KVM: arm64: Add S1 IPA to page table level walker") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:17:03 +01:00
Marc Zyngier	9a1950f977	KVM: arm64: nv: Don't advance PC when pending an SVE exception Jan reports that running a nested guest on Neoverse-V2 leads to a WARN in the host due to simultaneously pending an exception and PC increment after an access to ZCR_EL2. Returning true from a sysreg accessor is an indication that the sysreg instruction has been retired. Of course this isn't the case when we've pended a synchronous SVE exception for the guest. Fix the return value and let the exception propagate to the guest as usual. Reported-by: Jan Kotas <jank@cadence.com> Closes: https://lore.kernel.org/kvmarm/865xd61tt5.wl-maz@kernel.org/ Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:17:03 +01:00
Oliver Upton	ed25dcfbc4	KVM: arm64: nv: Don't treat ZCR_EL2 as a 'mapped' register Unlike the other mapped EL2 sysregs ZCR_EL2 isn't guaranteed to be resident when a vCPU is loaded as it actually follows the SVE context. As such, the contents of ZCR_EL1 may belong to another guest if the vCPU has been preempted before reaching sysreg emulation. Unconditionally use the in-memory value of ZCR_EL2 and switch to the memory-only accessors. The in-memory value is guaranteed to be valid as fpsimd_lazy_switch_to_{guest,host}() will restore/save the register appropriately. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-10-13 14:17:02 +01:00
Sourabh Jain	0843ba4584	powerpc/fadump: skip parameter area allocation when fadump is disabled Fadump allocates memory to pass additional kernel command-line argument to the fadump kernel. However, this allocation is not needed when fadump is disabled. So avoid allocating memory for the additional parameter area in such cases. Fixes: `f4892c68ec` ("powerpc/fadump: allocate memory for additional parameters early") Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Fixes: `f4892c68ec` ("powerpc/fadump: allocate memory for additional parameters early") Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251008032934.262683-1-sourabhjain@linux.ibm.com	2025-10-13 09:41:31 +05:30
Nam Cao	2743cf75f7	powerpc, ocxl: Fix extraction of struct xive_irq_data Commit `cc0cc23bab` ("powerpc/xive: Untangle xive from child interrupt controller drivers") changed xive_irq_data to be stashed to chip_data instead of handler_data. However, multiple places are still attempting to read xive_irq_data from handler_data and get a NULL pointer deference bug. Update them to read xive_irq_data from chip_data. Non-XIVE files which touch xive_irq_data seem quite strange to me, especially the ocxl driver. I think there ought to be an alternative platform-independent solution, instead of touching XIVE's data directly. Therefore, I think this whole thing should be cleaned up. But perhaps I just misunderstand something. In any case, this cleanup would not be trivial; for now, just get things working again. Fixes: `cc0cc23bab` ("powerpc/xive: Untangle xive from child interrupt controller drivers") Reported-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Closes: https://lore.kernel.org/linuxppc-dev/68e48df8.170a0220.4b4b0.217d@mx.google.com/ Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251008081359.1382699-1-namcao@linutronix.de	2025-10-13 09:40:55 +05:30
Nam Cao	ef3e73a917	powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown pseries_msi_ops_teardown() reads pci_dev* from msi_alloc_info_t. However, pseries_msi_ops_prepare() does not populate this structure, thus it is all zeros. Consequently, pseries_msi_ops_teardown() triggers a NULL pointer dereference crash. struct pci_dev is available in struct irq_domain. Read it there instead. Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Closes: https://lore.kernel.org/linuxppc-dev/878d7651-433a-46fe-a28b-1b7e893fcbe0@linux.ibm.com/ Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251010120307.3281720-1-namcao@linutronix.de	2025-10-13 09:39:02 +05:30
Linus Torvalds	c04022dccb	Merge tag 'kbuild-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild fixes from Nathan Chancellor: - Fix UAPI types check in headers_check.pl - Only enable -Werror for hostprogs with CONFIG_WERROR / W=e - Ignore fsync() error when output of gen_init_cpio is a pipe - Several little build fixes for recent modules.builtin.modinfo series * tag 'kbuild-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: kbuild: Use '--strip-unneeded-symbol' for removing module device table symbols s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections kbuild: Add '.rel.*' strip pattern for vmlinux kbuild: Restore pattern to avoid stripping .rela.dyn from vmlinux gen_init_cpio: Ignore fsync() returning EINVAL on pipes scripts/Makefile.extrawarn: Respect CONFIG_WERROR / W=e for hostprogs kbuild: uapi: Strip comments before size type check	2025-10-11 15:47:12 -07:00
Linus Torvalds	9591fdb061	Merge tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more x86 updates from Borislav Petkov: - Remove a bunch of asm implementing condition flags testing in KVM's emulator in favor of int3_emulate_jcc() which is written in C - Replace KVM fastops with C-based stubs which avoids problems with the fastop infra related to latter not adhering to the C ABI due to their special calling convention and, more importantly, bypassing compiler control-flow integrity checking because they're written in asm - Remove wrongly used static branches and other ugliness accumulated over time in hyperv's hypercall implementation with a proper static function call to the correct hypervisor call variant - Add some fixes and modifications to allow running FRED-enabled kernels in KVM even on non-FRED hardware - Add kCFI improvements like validating indirect calls and prepare for enabling kCFI with GCC. Add cmdline params documentation and other code cleanups - Use the single-byte 0xd6 insn as the official #UD single-byte undefined opcode instruction as agreed upon by both x86 vendors - Other smaller cleanups and touchups all over the place * tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86,retpoline: Optimize patch_retpoline() x86,ibt: Use UDB instead of 0xEA x86/cfi: Remove __noinitretpoline and __noretpoline x86/cfi: Add "debug" option to "cfi=" bootparam x86/cfi: Standardize on common "CFI:" prefix for CFI reports x86/cfi: Document the "cfi=" bootparam options x86/traps: Clarify KCFI instruction layout compiler_types.h: Move __nocfi out of compiler-specific header objtool: Validate kCFI calls x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware x86/fred: Install system vector handlers even if FRED isn't fully enabled x86/hyperv: Use direct call to hypercall-page x86/hyperv: Clean up hv_do_hypercall() KVM: x86: Remove fastops KVM: x86: Convert em_salc() to C KVM: x86: Introduce EM_ASM_3WCL KVM: x86: Introduce EM_ASM_1SRC2 KVM: x86: Introduce EM_ASM_2CL KVM: x86: Introduce EM_ASM_2W ...	2025-10-11 11:19:16 -07:00
Linus Torvalds	2f0a750453	Merge tag 'x86_cleanups_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - Simplify inline asm flag output operands now that the minimum compiler version supports the =@ccCOND syntax - Remove a bunch of AS_* Kconfig symbols which detect assembler support for various instruction mnemonics now that the minimum assembler version supports them all - The usual cleanups all over the place * tag 'x86_cleanups_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Remove code depending on __GCC_ASM_FLAG_OUTPUTS__ x86/sgx: Use ENCLS mnemonic in <kernel/cpu/sgx/encls.h> x86/mtrr: Remove license boilerplate text with bad FSF address x86/asm: Use RDPKRU and WRPKRU mnemonics in <asm/special_insns.h> x86/idle: Use MONITORX and MWAITX mnemonics in <asm/mwait.h> x86/entry/fred: Push __KERNEL_CS directly x86/kconfig: Remove CONFIG_AS_AVX512 crypto: x86 - Remove CONFIG_AS_VPCLMULQDQ crypto: X86 - Remove CONFIG_AS_VAES crypto: x86 - Remove CONFIG_AS_GFNI x86/kconfig: Drop unused and needless config X86_64_SMP	2025-10-11 10:51:14 -07:00
Paul Walmsley	852947be66	riscv: kprobes: convert one final __ASSEMBLY__ to __ASSEMBLER__ Per the reasoning in commit `f811f58597` ("riscv: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headers"), convert one last remaining instance of __ASSEMBLY__ in the arch/riscv kprobes code. This entered the tree from patches that were sent before Thomas' changes; and when I reviewed the kprobes patches before queuing them, I missed this instance. Cc: Nam Cao <namcao@linutronix.dev> Cc: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/linux-riscv/16b74b63-f223-4f0b-b6e5-31cea5e620b4@redhat.com/ Link: https://lore.kernel.org/linux-riscv/20250606070952.498274-1-thuth@redhat.com/ Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-10 16:04:25 -06:00
Sean Christopherson	44c6cb9fe9	KVM: guest_memfd: Allow mmap() on guest_memfd for x86 VMs with private memory Allow mmap() on guest_memfd instances for x86 VMs with private memory as the need to track private vs. shared state in the guest_memfd instance is only pertinent to INIT_SHARED. Doing mmap() on private memory isn't terrible useful (yet!), but it's now possible, and will be desirable when guest_memfd gains support for other VMA-based syscalls, e.g. mbind() to set NUMA policy. Lift the restriction now, before MMAP support is officially released, so that KVM doesn't need to add another capability to enumerate support for mmap() on private memory. Fixes: `3d3a04fad2` ("KVM: Allow and advertise support for host mmap() on guest_memfd files") Reviewed-by: Ackerley Tng <ackerleytng@google.com> Tested-by: Ackerley Tng <ackerleytng@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20251003232606.4070510-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-10-10 14:25:25 -07:00
Dapeng Mi	034417c143	KVM: x86/pmu: Don't try to get perf capabilities for hybrid CPUs Explicitly zero kvm_host_pmu instead of attempting to get the perf PMU capabilities when running on a hybrid CPU to avoid running afoul of perf's sanity check. ------------[ cut here ]------------ WARNING: arch/x86/events/core.c:3089 at perf_get_x86_pmu_capability+0xd/0xc0, Call Trace: <TASK> kvm_x86_vendor_init+0x1b0/0x1a40 [kvm] vmx_init+0xdb/0x260 [kvm_intel] vt_init+0x12/0x9d0 [kvm_intel] do_one_initcall+0x60/0x3f0 do_init_module+0x97/0x2b0 load_module+0x2d08/0x2e30 init_module_from_file+0x96/0xe0 idempotent_init_module+0x117/0x330 __x64_sys_finit_module+0x73/0xe0 Always read the capabilities for non-hybrid CPUs, i.e. don't entirely revert to reading capabilities if and only if KVM wants to use a PMU, as it may be useful to have the host PMU capabilities available, e.g. if only or debug. Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Closes: https://lore.kernel.org/all/70b64347-2aca-4511-af78-a767d5fa8226@intel.com/ Fixes: `51f34b1e65` ("KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities") Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20251010005239.146953-1-dapeng1.mi@linux.intel.com [sean: rework changelog, call out hybrid CPUs in shortlog] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-10-10 14:25:12 -07:00
Linus Torvalds	917167ed12	Merge tag 'xtensa-20251010' of https://github.com/jcmvbkbc/linux-xtensa Pull Xtensa updates from Max Filippov: - minor cleanups * tag 'xtensa-20251010' of https://github.com/jcmvbkbc/linux-xtensa: xtensa: use HZ_PER_MHZ in platform_calibrate_ccount xtensa: simdisk: add input size check in proc_write_simdisk	2025-10-10 11:20:19 -07:00
Nathan Chancellor	9338d660b7	s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections When building s390 defconfig with binutils older than 2.32, there are several warnings during the final linking stage: s390-linux-ld: .tmp_vmlinux1: warning: allocated section `.got.plt' not in segment s390-linux-ld: .tmp_vmlinux2: warning: allocated section `.got.plt' not in segment s390-linux-ld: vmlinux.unstripped: warning: allocated section `.got.plt' not in segment s390-linux-objcopy: vmlinux: warning: allocated section `.got.plt' not in segment s390-linux-objcopy: st7afZyb: warning: allocated section `.got.plt' not in segment binutils commit afca762f598 ("S/390: Improve partial relro support for 64 bit") [1] in 2.32 changed where .got.plt is emitted, avoiding the warning. The :NONE in the .vmlinux.info output section description changes the segment for subsequent allocated sections. Move .vmlinux.info right above the discards section to place all other sections in the previously defined segment, .data. Fixes: `30226853d6` ("s390: vmlinux.lds.S: explicitly handle '.got' and '.plt' sections") Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=afca762f598d453c563f244cd3777715b1a0cb72 [1] Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Alexey Gladkov <legion@kernel.org> Acked-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/20251008-kbuild-fix-modinfo-regressions-v1-3-9fc776c5887c@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>	2025-10-10 10:22:08 -07:00
Linus Torvalds	8cc8ea228c	Merge tag 'parisc-for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: "Minor enhancements and fixes, specifically: - report emulation and alignment faults via perf - add initial kernel-side support for perf_events - small initialization fixes in the parisc firmware layer - adjust TC* constants and avoid referencing termio structs to avoid userspace build errors" * tag 'parisc-for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix iodc and device path return values on old machines parisc: Firmware: Fix returned path for PDC_MODULE_FIND on older machines parisc: Add initial kernel-side perf_event support parisc: Report software alignment faults via perf parisc: Report emulation faults via perf parisc: don't reference obsolete termio struct for TC* constants parisc: Remove spurious if statement from raw_copy_from_user()	2025-10-10 10:01:55 -07:00

1 2 3 4 5 ...

238301 Commits