linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-06-03 23:12:57 -04:00

Author	SHA1	Message	Date
Linus Torvalds	5862221fdd	Merge tag 'parisc-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Revert "parisc: led: fix reference leak on failed device registration" - Fix build failures introduced when allowing to build 32-/64-bit only VDSO - Switch to dynamic parisc root device to avoid upcoming warnings - Fix IRQ leak in LASI driver * tag 'parisc-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix IRQ leak in LASI driver parisc: Fix 64-bit kernel build when CONFIG_COMPAT=n parisc: Fix build failure for 32-bit kernel with PA2.0 instruction set parisc: drivers: switch to dynamic root device Revert "parisc: led: fix reference leak on failed device registration"	2026-05-06 12:51:07 -07:00
Linus Torvalds	adc1e5c620	Merge tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Fix issues in EFI graceful recovery on x86 introduced by changes to the kernel mode FPU APIs - I-cache coherency fixes for the LoongArch EFI stub - Locking fix for EFI pstore - Code tweak for efivarfs * tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efi: Restore IRQ state in EFI page fault handler x86/efi: Fix graceful fault handling after FPU softirq changes efi/libstub: Synchronize instruction cache after kernel relocation efi/loongarch: Implement efi_cache_sync_image() efi/libstub: Move efi_relocate_kernel() into its only remaining user efi: pstore: Drop efivar lock when efi_pstore_open() returns with an error efivarfs: use QSTR() in efivarfs_alloc_dentry	2026-05-06 07:27:30 -07:00
Linus Torvalds	e80948062d	Merge tag 'loongarch-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix some build and runtime issues after 32BIT Kconfig option enabled, improve the platform-specific PCI controller compatibility, drop custom __arch_vdso_hres_capable(), and fix a lot of KVM bugs" * tag 'loongarch-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Move unconditional delay into timer clear scenery LoongArch: KVM: Fix HW timer interrupt lost when inject interrupt by software LoongArch: KVM: Move AVEC interrupt injection into switch loop LoongArch: KVM: Use kvm_set_pte() in kvm_flush_pte() LoongArch: KVM: Fix missing EMULATE_FAIL in kvm_emu_mmio_read() LoongArch: KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS LoongArch: KVM: Fix "unreliable stack" for kvm_exc_entry LoongArch: KVM: Compile switch.S directly into the kernel LoongArch: vDSO: Drop custom __arch_vdso_hres_capable() LoongArch: Fix potential ADE in loongson_gpu_fixup_dma_hang() LoongArch: Use per-root-bridge PCIH flag to skip mem resource fixup LoongArch: Fix SYM_SIGFUNC_START definition for 32BIT LoongArch: Specify -m32/-m64 explicitly for 32BIT/64BIT LoongArch: Make CONFIG_64BIT as the default option	2026-05-05 19:44:46 -07:00
Ard Biesheuvel	2c340aab54	x86/efi: Restore IRQ state in EFI page fault handler The kernel's softirq API does not permit re-enabling softirqs while IRQs are disabled. The reason for this is that local_bh_enable() will not only re-enable delivery of softirqs over the back of IRQs, it will also handle any pending softirqs immediately, regardless of whether IRQs are enabled at that point. For this reason, commit `d021985504` ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs") disables softirqs only when IRQs are enabled, as it is not permitted otherwise, but also unnecessary, given that asynchronous softirq delivery never happens to begin with while IRQs are disabled. However, this does mean that entering a kernel mode FPU section with IRQs enabled and leaving it with IRQs disabled leads to problems, as identified by Sashiko [0]: the EFI page fault handler is called from page_fault_oops() with IRQs disabled, and thus ends the kernel mode FPU section with IRQs disabled as well, regardless of whether IRQs were enabled when it was started. This may result in schedule() being called with a non-zero preempt_count, causing a BUG(). So take care to re-enable IRQs when handling any EFI page faults if they were taken with IRQs enabled. [0] https://sashiko.dev/#/patchset/20260430074107.27051-1-ivan.hu%40canonical.com Cc: Eric Biggers <ebiggers@kernel.org> Cc: Ivan Hu <ivan.hu@canonical.com> Cc: x86@kernel.org Cc: <stable@vger.kernel.org> Fixes: `d021985504` ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs") Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2026-05-05 09:31:28 +02:00
Ivan Hu	088f65e206	x86/efi: Fix graceful fault handling after FPU softirq changes Since commit `d021985504` ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs"), kernel_fpu_begin() calls fpregs_lock() which uses local_bh_disable() instead of the previous preempt_disable(). This sets SOFTIRQ_OFFSET in preempt_count during the entire EFI runtime service call, causing in_interrupt() to return true in normal task context. The graceful page fault handler efi_crash_gracefully_on_page_fault() uses in_interrupt() to bail out for faults in real interrupt context. With SOFTIRQ_OFFSET now set, the handler always bails out, leaving EFI firmware page faults unhandled. This escalates to die() which also sees in_interrupt() as true and calls panic("Fatal exception in interrupt"), resulting in a hard system freeze. On systems with buggy firmware that triggers page faults during EFI runtime calls (e.g., accessing unmapped memory in GetTime()), this causes an unrecoverable hang instead of the expected graceful EFI_ABORTED recovery. Fix by replacing in_interrupt() with !in_task(). This preserves the original intent of bailing for interrupts or NMI faults, while no longer falsely triggering from the FPU code path's local_bh_disable(). Fixes: `d021985504` ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs") Cc: <stable@vger.kernel.org> Signed-off-by: Ivan Hu <ivan.hu@canonical.com> [ardb: Sashiko spotted that using 'in_hardirq() || in_nmi()' leaves a window where a softirq may be taken before fpregs_lock() is called, but after efi_rts_work.efi_rts_id has been assigned, and any page faults occurring in that window will then be misidentified as having been caused by the firmware. Instead, use !in_task(), which incorporates in_serving_softirq(). ] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2026-05-04 12:41:51 +02:00
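The difference between in_interrupt() and !in_task() described in 088f65e206 can be illustrated with a small userspace model of preempt_count. This is only a sketch: the bit widths are assumed to follow include/linux/preempt.h on non-RT kernels, the model_* helpers are hypothetical names that do not exist in the kernel, and local_bh_disable() is assumed to add SOFTIRQ_DISABLE_OFFSET (twice SOFTIRQ_OFFSET) to the count.

```c
#include <stdbool.h>

/*
 * Simplified userspace model of the kernel's preempt_count layout.
 * The bit widths are assumptions of this sketch, not authoritative.
 */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define HARDIRQ_BITS   4
#define NMI_BITS       4

#define SOFTIRQ_SHIFT  PREEMPT_BITS
#define HARDIRQ_SHIFT  (SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT      (HARDIRQ_SHIFT + HARDIRQ_BITS)

#define SOFTIRQ_MASK   (((1u << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK   (((1u << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)
#define NMI_MASK       (((1u << NMI_BITS) - 1) << NMI_SHIFT)

#define SOFTIRQ_OFFSET         (1u << SOFTIRQ_SHIFT)  /* a softirq is being served */
#define SOFTIRQ_DISABLE_OFFSET (2u * SOFTIRQ_OFFSET)  /* assumed local_bh_disable() increment */
#define HARDIRQ_OFFSET         (1u << HARDIRQ_SHIFT)

/* Model of in_interrupt(): any NMI, hardirq or softirq bits set. */
static inline bool model_in_interrupt(unsigned int pc)
{
    return (pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

/* Model of in_task(): not in NMI, not in hardirq, not *serving* a softirq. */
static inline bool model_in_task(unsigned int pc)
{
    return (pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)) == 0;
}
```

With softirqs merely disabled in task context (a count of SOFTIRQ_DISABLE_OFFSET), model_in_interrupt() is true while model_in_task() is also true, which is the distinction the fix relies on. A count of SOFTIRQ_OFFSET or HARDIRQ_OFFSET makes model_in_task() false.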
Bibo Mao	5a873d77ba	LoongArch: KVM: Move unconditional delay into timer clear scenery When a timer interrupt arrives in the guest kernel, the guest clears the timer interrupt and programs the timer with the next event. During this stage, the timer tick is -1 and the timer interrupt status is disabled in the ESTAT register. The KVM hypervisor needs to write zero to the timer tick register, wait for the timer interrupt to be injected by hardware, and then clear the timer interrupt. So there is a 2-cycle delay in the KVM hypervisor to emulate this scenario, and the delay is unnecessary if the timer interrupt does not need to be cleared. Move the 2-cycle delay into the timer-clear scenario, check the timer status in ESTAT after the delay, and set the maximum timer expiry value if the timer interrupt still has not arrived. Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:48 +08:00
Bibo Mao	2433f3f572	LoongArch: KVM: Fix HW timer interrupt lost when inject interrupt by software With passthrough HW timer, timer interrupt is injected by HW. When inject emulated CPU interrupt by software such SIP0/SIP1/IPI, HW timer interrupt may be lost. Here check whether there is timer tick value inversion before and after injecting emulated CPU interrupt by software, timer enabling by reading timer cfg register is skipped. If the timer tick value is detected with changing, then timer should be enabled. And inject a timer interrupt by software if there is. Cc: <stable@vger.kernel.org> Fixes: `f45ad5b8aa` ("LoongArch: KVM: Implement vcpu interrupt operations"). Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:48 +08:00
Bibo Mao	6debfff785	LoongArch: KVM: Move AVEC interrupt injection into switch loop When AVEC interrupt controller is emulated in user space, AVEC interrupt is injected by software like SIP0/SIP1/TI/IPI interrupts. Here also move the AVEC interrupt injection in switch loop. Cc: stable@vger.kernel.org Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:48 +08:00
Tao Cui	81e18777d6	LoongArch: KVM: Use kvm_set_pte() in kvm_flush_pte() kvm_flush_pte() is the only caller that directly assigns *pte instead of using the kvm_set_pte() wrapper. Use the wrapper for consistency with the rest of the file. No functional change intended. Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Tao Cui <cuitao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:38 +08:00
Tao Cui	f26faae96c	LoongArch: KVM: Fix missing EMULATE_FAIL in kvm_emu_mmio_read() In the ldptr (0x24...0x27) opcode decoding path, the default case only breaks out but without setting "ret" value to EMULATE_FAIL. This leaves run->mmio.len uninitialized (stale from a previous MMIO operation) while "ret" value remains EMULATE_DO_MMIO, causing the code to proceed with an incorrect MMIO length. Add "ret = EMULATE_FAIL" to match the other default branches in the same function (e.g. the 0x28...0x2e and 0x38 cases). Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Tao Cui <cuitao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:38 +08:00
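The pattern fixed in f26faae96c, a switch default that breaks without recording failure, can be sketched as follows. The enum values, the opcode constants and model_decode_ldptr() are hypothetical stand-ins for illustration and are not taken from the LoongArch KVM sources.

```c
#include <stdbool.h>

/* Hypothetical result codes standing in for the emulation results. */
enum model_emulate_result { MODEL_EMULATE_DO_MMIO, MODEL_EMULATE_FAIL };

/*
 * Decode a (hypothetical) load opcode and report the MMIO length.
 * With set_fail_in_default == false the default branch only breaks,
 * so the caller sees MODEL_EMULATE_DO_MMIO while *len keeps whatever
 * stale value it had before the call.
 */
static inline enum model_emulate_result
model_decode_ldptr(unsigned int op, unsigned int *len, bool set_fail_in_default)
{
    enum model_emulate_result ret = MODEL_EMULATE_DO_MMIO;

    switch (op) {
    case 0x24:          /* assumed: 4-byte load */
        *len = 4;
        break;
    case 0x26:          /* assumed: 8-byte load */
        *len = 8;
        break;
    default:
        if (set_fail_in_default)
            ret = MODEL_EMULATE_FAIL;   /* the behaviour the fix adds */
        break;
    }
    return ret;
}
```

Calling the helper with an unhandled opcode and set_fail_in_default set to false returns MODEL_EMULATE_DO_MMIO with the length untouched, which mirrors the stale run->mmio.len problem described above.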
Qiang Ma	b3e31a6650	LoongArch: KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS It doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Other architectures have already done this, such as commit `57a2e13ebd` ("KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS") Cc: stable@vger.kernel.org Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:37 +08:00
Xianglai Li	b323a441da	LoongArch: KVM: Fix "unreliable stack" for kvm_exc_entry Insert the appropriate UNWIND hint into the kvm_exc_entry assembly function to guide the generation of correct ORC table entries, thereby solving the timeout problem ("unreliable stack") while loading the livepatch-sample module on a physical machine running virtual machines with multiple vcpus. Cc: stable@vger.kernel.org Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:37 +08:00
Xianglai Li	5203012fa6	LoongArch: KVM: Compile switch.S directly into the kernel If we directly compile the switch.S file into the kernel, the address of the kvm_exc_entry function will definitely be within the DMW memory area. Therefore, we will no longer need to perform a copy relocation of kvm_exc_entry. So this patch compiles switch.S directly into the kernel, and then removes the copy relocation logic for the kvm_exc_entry function. Cc: stable@vger.kernel.org Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:37 +08:00
Thomas Weißschuh	7e2c41bc62	LoongArch: vDSO: Drop custom __arch_vdso_hres_capable() The custom definition is identical to the generic fallback one. So remove it. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:20 +08:00
Wentao Guan	8dfa2f8780	LoongArch: Fix potential ADE in loongson_gpu_fixup_dma_hang() The switch case in loongson_gpu_fixup_dma_hang() may not DC2 or DC3, and readl(crtc_reg) will access with random address, because the "device" is from "base+PCI_DEVICE_ID", "base" is from "pdev->devfn+1". This is wrong when my platform inserts a discrete GPU: lspci -tv -[0000:00]-+-00.0 Loongson Technology LLC Hyper Transport Bridge Controller ... +-06.0 Loongson Technology LLC LG100 GPU +-06.2 Loongson Technology LLC Device 7a37 ... Add a default switch case to fix the panic as below: Kernel ade access[#1]: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.136-loong64-desktop-hwe+ #4 pc 90000000017e5534 ra 90000000017e54c0 tp 90000001002f8000 sp 90000001002fb6c0 a0 80000efe00003100 a1 0000000000003100 a2 0000000000000000 a3 0000000000000002 a4 90000001002fb6b4 a5 900000087cdb58fd a6 90000000027af000 a7 0000000000000001 t0 00000000000085b9 t1 000000000000ffff t2 0000000000000000 t3 0000000000000000 t4 fffffffffffffffd t5 00000000fffb6d9c t6 0000000000083b00 t7 00000000000070c0 t8 900000087cdb4d94 u0 900000087cdb58fd s9 90000001002fb826 s0 90000000031c12c8 s1 7fffffffffffff00 s2 90000000031c12d0 s3 0000000000002710 s4 0000000000000000 s5 0000000000000000 s6 9000000100053000 s7 7fffffffffffff00 s8 90000000030d4000 ra: 90000000017e54c0 loongson_gpu_fixup_dma_hang+0x40/0x210 ERA: 90000000017e5534 loongson_gpu_fixup_dma_hang+0xb4/0x210 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000004 (PPLV0 +PIE -PWE) EUEN: 00000000 (-FPE -SXE -ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00480000 [ADEM] (IS= ECode=8 EsubCode=1) BADV: 7fffffffffffff00 PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV) Modules linked in: Process swapper/0 (pid: 1, threadinfo=(____ptrval____), task=(____ptrval____)) Stack : 0000000000000006 90000001002fb778 90000001002fb704 0000000000000007 0000000016a65700 90000000017e5690 000000000000ffff ffffffffffffffff 900000000209f7c0 9000000100053000 
900000000209f7a8 9000000000eebc08 0000000000000000 0000000000000000 0000000000000006 90000001002fb778 90000001000530b8 90000000027af000 0000000000000000 9000000100054000 9000000100053000 9000000000ebb70c 9000000100004c00 9000000004000001 90000001002fb7e4 bae765461f31cb12 0000000000000000 0000000000000000 0000000000000006 90000000027af000 0000000000000030 90000000027af000 900000087cd6f800 9000000100053000 0000000000000000 9000000000ebc560 7a2500147cdaf720 bae765461f31cb12 0000000000000001 0000000000000030 ... Call Trace: [<90000000017e5534>] loongson_gpu_fixup_dma_hang+0xb4/0x210 [<9000000000eebc08>] pci_fixup_device+0x108/0x280 [<9000000000ebb70c>] pci_setup_device+0x24c/0x690 [<9000000000ebc560>] pci_scan_single_device+0xe0/0x140 [<9000000000ebc684>] pci_scan_slot+0xc4/0x280 [<9000000000ebdd00>] pci_scan_child_bus_extend+0x60/0x3f0 [<9000000000f5bc94>] acpi_pci_root_create+0x2b4/0x420 [<90000000017e5e74>] pci_acpi_scan_root+0x2d4/0x440 [<9000000000f5b02c>] acpi_pci_root_add+0x21c/0x3a0 [<9000000000f4ee54>] acpi_bus_attach+0x1a4/0x3c0 [<90000000010e200c>] device_for_each_child+0x6c/0xe0 [<9000000000f4bbf4>] acpi_dev_for_each_child+0x44/0x70 [<9000000000f4ef40>] acpi_bus_attach+0x290/0x3c0 [<90000000010e200c>] device_for_each_child+0x6c/0xe0 [<9000000000f4bbf4>] acpi_dev_for_each_child+0x44/0x70 [<9000000000f4ef40>] acpi_bus_attach+0x290/0x3c0 [<9000000000f5211c>] acpi_bus_scan+0x6c/0x280 [<900000000189c028>] acpi_scan_init+0x194/0x310 [<900000000189bc6c>] acpi_init+0xcc/0x140 [<9000000000220cdc>] do_one_initcall+0x4c/0x310 [<90000000018618fc>] kernel_init_freeable+0x258/0x2d4 [<900000000184326c>] kernel_init+0x28/0x13c [<9000000000222008>] ret_from_kernel_thread+0xc/0xa4 Cc: stable@vger.kernel.org Fixes: `95db0c9f52` ("LoongArch: Workaround LS2K/LS7A GPU DMA hang bug") Link: https://gist.github.com/opsiff/ebf2dac51b4013d22462f2124c55f807 Link: https://gist.github.com/opsiff/a62f2a73db0492b3c49bf223a339b133 Signed-off-by: Wentao Guan <guanwentao@uniontech.com> 
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:20 +08:00
Huacai Chen	49f33840dc	LoongArch: Use per-root-bridge PCIH flag to skip mem resource fixup When firmware enables 64-bit PCI host bridge support, some root bridges already provide valid 64-bit mem resource windows through ACPI. In this case, the LoongArch-specific mem resource high-bits fixup in acpi_prepare_root_resources() should not be applied unconditionally. Otherwise, the kernel may override the native resource layout derived from firmware, and later BAR assignment can fail to place device BARs into the intended 64-bit address space correctly. Add a per-root-bridge ACPI flag, PCIH, and evaluate it from the current root bridge device scope. When PCIH is set, skip the mem resource high- bits fixup path and let the kernel use the firmware-provided resource description directly. When PCIH is absent or cleared, keep the existing behavior and continue filling the high address bits from the host bridge address. This makes the behavior per-root-bridge configurable and avoids breaking valid 64-bit BAR space allocation on bridges whose 64-bit windows have already been fully described by firmware. Cc: stable@vger.kernel.org Suggested-by: Chao Li <lichao@loongson.cn> Tested-by: Dongyan Qian <qiandongyan@loongson.cn> Signed-off-by: Dongyan Qian <qiandongyan@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:20 +08:00
Huacai Chen	98b8aebb14	LoongArch: Fix SYM_SIGFUNC_START definition for 32BIT The SYM_SIGFUNC_START definition should match sigcontext that the length of GPRs are 8 bytes for both 32BIT and 64BIT. So replace SZREG with 8 to fix it. Cc: stable@vger.kernel.org Fixes: `e4878c37f6` ("LoongArch: vDSO: Emit GNU_EH_FRAME correctly") Suggested-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:01 +08:00
Huacai Chen	5643c6b2c8	LoongArch: Specify -m32/-m64 explicitly for 32BIT/64BIT Clang/LLVM builds need -m32/-m64 to switch triple variants (i.e. the --target=xxx parameter). Otherwise we get build errors for CONFIG_32BIT. GCC does not support -m32/-m64 at present but may do so in the future, so use cc-option to specify them. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604232041.ESJDwVG4-lkp@intel.com/ Suggested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:01 +08:00
Huacai Chen	4808e5cc4f	LoongArch: Make CONFIG_64BIT as the default option CONFIG_64BIT is the mandatory option before v7.0, but in v7.1-rc1 both CONFIG_32BIT and CONFIG_64BIT are selectable and CONFIG_32BIT became the default option. This breaks existing configurations, so explicitly make CONFIG_64BIT as the default option to keep existing behavior. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-05-04 09:00:00 +08:00
Linus Torvalds	6d35786de2	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Three bug fixes for x86: - Check that nEPT/nNPT is enabled in slow flush hypercalls. If it is not, the hypercalls can be processed as usual even while running a nested guest - Fix shadow paging use-after-free due to page tables changing outside execution of the guest. A bug that is 16 years old and stems from an imprecision in the very first KVM series - Scan IRR whenever PID.ON is true, even if PIR is empty, which avoids a somewhat rare WARN" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix shadow paging use-after-free due to unexpected GFN KVM: x86: Fix misleading variable names and add more comments for PIR=>IRR flow KVM: x86: Do IRR scan in __kvm_apic_update_irr even if PIR is empty KVM: x86: check for nEPT/nNPT in slow flush hypercalls	2026-05-03 15:25:47 -07:00
Sean Christopherson	0cb2af2ea6	KVM: x86: Fix shadow paging use-after-free due to unexpected GFN The shadow MMU computes GFNs for direct shadow pages using sp->gfn plus the SPTE index. This assumption breaks for shadow paging if the guest page tables are modified between VM entries (similar to commit `aad885e774`, "KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE", 2026-03-27). The flow is as follows: - a PDE is installed for a 2MB mapping, and a page in that area is accessed. KVM creates a kvm_mmu_page consisting of 512 4KB pages; the kvm_mmu_page is marked by FNAME(fetch) as direct-mapped because the guest's mapping is a huge page (and thus contiguous). - the PDE mapping is changed from outside the guest. - the guest accesses another page in the same 2MB area. KVM installs a new leaf SPTE and rmap entry; the SPTE uses the "correct" GFN (i.e. based on the new mapping, as changed in the previous step) but that GFN is outside of the [sp->gfn, sp->gfn + 511] range; therefore the rmap entry cannot be found and removed when the kvm_mmu_page is zapped. - the memslot that covers the first 2MB mapping is deleted, and the kvm_mmu_page for the now-invalid GPA is zapped. However, rmap_remove() only looks at the [sp->gfn, sp->gfn + 511] range established in step 1, and fails to find the rmap entry that was recorded by step 3. - any operation that causes an rmap walk for the same page accessed by step 3 then walks a stale rmap and dereferences a freed kvm_mmu_page. This includes dirty logging or MMU notifier invalidations (e.g., from MADV_DONTNEED). The underlying issue is that KVM's walking of shadow PTEs assumes that if a SPTE is present when KVM wants to install a non-leaf SPTE, then the existing kvm_mmu_page must be for the correct gfn. Because the only way for the gfn to be wrong is if KVM messed up and failed to zap a SPTE... which shouldn't happen, but actually only happens in response to a guest write. 
That bug dates back literally forever, as even the first version of KVM assumes that the GFN matches and walks into the "wrong" shadow page. However, that was only an imprecision until `2032a93d66` ("KVM: MMU: Don't allocate gfns page for direct mmu pages") came along. Fix it by checking for a target gfn mismatch and zapping the existing SPTE. That way the old SP and rmap entries are gone, KVM installs the rmap in the right location, and everyone is happy. Fixes: `2032a93d66` ("KVM: MMU: Don't allocate gfns page for direct mmu pages") Fixes: `6aa8b732ca` ("kvm: userspace interface") Reported-by: Alexander Bulekov <bkov@amazon.com> Reported-by: Fred Griffoul <fgriffo@amazon.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://patch.msgid.link/20260503201029.106481-1-pbonzini@redhat.com/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-05-03 22:32:53 +02:00
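The GFN assumption behind 0cb2af2ea6 can be sketched in a few lines. This is a simplified model, assuming a direct shadow page covering 512 4KB pages; the model_* types and helpers are hypothetical and do not correspond to KVM's actual data structures.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_gfn_t;

/* Hypothetical stand-in for a direct-mapped kvm_mmu_page. */
struct model_direct_sp {
    model_gfn_t base_gfn;   /* plays the role of sp->gfn */
};

/* GFN that a direct shadow page assumes for the SPTE at @index. */
static inline model_gfn_t
model_assumed_gfn(const struct model_direct_sp *sp, unsigned int index)
{
    return sp->base_gfn + index;
}

/*
 * An rmap entry recorded under @recorded_gfn is only found at zap time
 * if it matches the GFN derived from the shadow page and the index.
 */
static inline bool
model_rmap_findable(const struct model_direct_sp *sp, unsigned int index,
                    model_gfn_t recorded_gfn)
{
    return model_assumed_gfn(sp, index) == recorded_gfn;
}

/* The check the fix is described as adding: zap on a target GFN mismatch. */
static inline bool
model_must_zap_existing(model_gfn_t existing_sp_gfn, model_gfn_t target_gfn)
{
    return existing_sp_gfn != target_gfn;
}
```

For a shadow page with base_gfn 0x1000, an rmap entry recorded at index 5 under GFN 0x8005 is not findable, which corresponds to the stale rmap entry left behind in the flow above.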
Sean Christopherson	0aec99f9bf	KVM: x86: Fix misleading variable names and add more comments for PIR=>IRR flow Rename kvm_apic_update_irr()'s "irr_updated" and vmx_sync_pir_to_irr()'s "got_posted_interrupt" to a more accurate "max_irr_is_from_pir", as neither "irr_updated" nor "got_posted_interrupt" is accurate. __kvm_apic_update_irr() and thus kvm_apic_update_irr() specifically return true if and only if the highest priority IRQ, i.e. max_irr, is a "new" pending IRQ from the PIR. I.e. it's possible for the IRR to be updated, i.e. for a posted IRQ to be "got", without the APIs returning true. Expand vmx_sync_pir_to_irr()'s comment to explain why it's necessary to set KVM_REQ_EVENT only if a "new" IRQ was found, and to explain why it's safe to do so only if a new IRQ is also the highest priority pending IRQ. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://patch.msgid.link/20260503201703.108231-3-pbonzini@redhat.com/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-05-03 22:32:41 +02:00
Paolo Bonzini	33fd0ccd25	KVM: x86: Do IRR scan in __kvm_apic_update_irr even if PIR is empty Fall back to apic_find_highest_vector() when PID.ON is set but PIR turns out to be empty, to correctly report the highest pending interrupt from the existing IRR. In a nested VM stress test, the following WARNING fires in vmx_check_nested_events() when kvm_cpu_has_interrupt() reports a pending interrupt but the subsequent kvm_apic_has_interrupt() (which invokes vmx_sync_pir_to_irr() again) returns -1: WARNING: CPU: 99 PID: 57767 at arch/x86/kvm/vmx/nested.c:4449 vmx_check_nested_events+0x6bf/0x6e0 [kvm_intel] Call Trace: kvm_check_and_inject_events vcpu_enter_guest.constprop.0 vcpu_run kvm_arch_vcpu_ioctl_run kvm_vcpu_ioctl __x64_sys_ioctl do_syscall_64 entry_SYSCALL_64_after_hwframe The root cause is a race between vmx_sync_pir_to_irr() on the target vCPU and __vmx_deliver_posted_interrupt() on a sender vCPU. The sender performs two individually-atomic operations that are not a single transaction: 1. pi_test_and_set_pir(vector) -- sets the PIR bit 2. pi_test_and_set_on() -- sets PID.ON The following interleaving triggers the bug: Sender vCPU (IPI): Target vCPU (1st sync_pir_to_irr): B1: set PIR[vector] A1: pi_clear_on() A2: pi_harvest_pir() -> sees B1 bit A3: xchg() -> consumes bit, PIR=0 (1st sync returns correct max_irr) B2: set PID.ON = 1 Target vCPU (2nd sync_pir_to_irr): C1: pi_test_on() -> TRUE (from B2) C2: pi_clear_on() -> ON=0 C3: pi_harvest_pir() -> PIR empty C4: *max_irr = -1, early return IRR NOT SCANNED The interrupt is not lost (it resides in the IRR from the first sync and is recovered on the next vcpu_enter_guest() iteration), but the incorrect max_irr causes a spurious WARNING and a wasted L2 VM-Enter/VM-Exit cycle. 
Fixes: `b41f8638b9` ("KVM: VMX: Isolate pure loads from atomic XCHG when processing PIR") Reported-by: Farrah Chen <farrah.chen@intel.com> Analyzed-by: Chenyi Qiang <chenyi.qiang@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/kvm/20260428070349.1633238-1-chenyi.qiang@intel.com/T/ Link: https://patch.msgid.link/20260503201703.108231-2-pbonzini@redhat.com/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-05-03 22:18:15 +02:00
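The interleaving described in 33fd0ccd25 can be replayed deterministically in a single-threaded model. This is a sketch under stated assumptions: vectors are limited to 0..63, the struct and model_* functions are hypothetical, and the flag scan_irr_when_pir_empty selects between the early return that the commit describes and the fallback IRR scan it adds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified posted-interrupt state for one target vCPU. */
struct model_vcpu {
    uint64_t pir;   /* posted-interrupt requests, one bit per vector */
    uint64_t irr;   /* interrupt request register */
    bool on;        /* outstanding-notification flag (PID.ON) */
};

/* Highest set bit, or -1 if none. */
static inline int model_highest_bit(uint64_t v)
{
    for (int i = 63; i >= 0; i--)
        if ((v >> i) & 1)
            return i;
    return -1;
}

/* Target side: move PIR into IRR and report the highest pending vector. */
static inline int model_sync_pir_to_irr(struct model_vcpu *v,
                                        bool scan_irr_when_pir_empty)
{
    uint64_t harvested;

    if (!v->on)
        return model_highest_bit(v->irr);

    v->on = false;          /* clear ON first */
    harvested = v->pir;     /* then harvest and consume PIR */
    v->pir = 0;

    if (!harvested && !scan_irr_when_pir_empty)
        return -1;          /* early return: IRR is not scanned */

    v->irr |= harvested;
    return model_highest_bit(v->irr);
}

/* Replay B1, first sync, B2, second sync; return the second sync's result. */
static inline int model_run_interleaving(bool scan_irr_when_pir_empty)
{
    struct model_vcpu v = { .pir = 0, .irr = 0, .on = true };

    v.pir |= 1ull << 40;                                   /* B1: set PIR[vector] */
    (void)model_sync_pir_to_irr(&v, scan_irr_when_pir_empty); /* A1..A3 */
    v.on = true;                                           /* B2: set PID.ON */
    return model_sync_pir_to_irr(&v, scan_irr_when_pir_empty); /* C1..C4 */
}
```

In this model the second sync returns -1 without the fallback scan even though vector 40 is pending in the IRR, and returns 40 once the IRR is scanned when PIR turns out to be empty.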
Paolo Bonzini	464af6fc2b	KVM: x86: check for nEPT/nNPT in slow flush hypercalls Checking is_guest_mode(vcpu) is incorrect, because translate_nested_gpa() is only valid if an L2 guest is running with nested EPT/NPT enabled. Instead use the same condition as translate_nested_gpa() itself. Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <seanjc@google.com> Fixes: `aee738236d` ("KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs", 2022-11-18) Link: https://patch.msgid.link/20260503200905.106077-1-pbonzini@redhat.com/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2026-05-03 22:17:30 +02:00
Linus Torvalds	f377d0025e	Merge tag 'sh-for-v7.1-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux Pull sh fix from John Paul Adrian Glaubitz: "The ZERO_PAGE consolidation in v7.1 introduced a regression on sh which made these systems unbootable. The problem was that on sh, the initial boot parameters were previously referenced as an array, and after `6215d9f447` ("arch, mm: consolidate empty_zero_page") they were referenced as a pointer, which caused wrong code generation and a boot hang. This changes the declaration back to being an array, which fixes the boot hang" * tag 'sh-for-v7.1-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux: sh: Fix fallout from ZERO_PAGE consolidation	2026-05-03 08:58:42 -07:00
Mike Rapoport (Microsoft)	b0aa5e4b08	sh: Fix fallout from ZERO_PAGE consolidation Consolidation of empty_zero_page declarations broke boot on sh. sh stores its initial boot parameters in a page reserved in arch/sh/kernel/head_32.S. Before commit `6215d9f447` ("arch, mm: consolidate empty_zero_page") this page was referenced in C code as an array and after that commit it is referenced as a pointer. This causes wrong code generation and boot hang. Declare boot_params_page as an array to fix the issue. Reported-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Tested-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Fixes: `6215d9f447` ("arch, mm: consolidate empty_zero_page") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Artur Rojek <contact@artur-rojek.eu> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2026-05-03 16:35:40 +02:00
Linus Torvalds	cd546f7ae2	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Avoid writing an uninitialised stack variable to POR_EL0 on sigreturn if the poe_context record is absent - Reserve one more page for the early 4K-page kernel mapping to cover the extra [_text, _stext) split introduced by the non-executable read-only mapping - Force the arch_local_irq_*() wrappers to be __always_inline so that noinstr entry and idle paths cannot call out-of-line, instrumentable copies - Fix potential sign extension in the arm64 SCS unwinder's DWARF advance_loc4 decoding - Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle states, restoring cpuidle registration on such systems - Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test rather than carrying a duplicate struct user_gcs definition (the original #ifdef NT_ARM_GCS was wrong to cover the structure definition as it would be masked out if the toolchain defined it) * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: signal: Preserve POR_EL0 if poe_context is missing arm64: Reserve an extra page for early kernel mapping kselftest/arm64: Include <asm/ptrace.h> for user_gcs definition ACPI: arm64: cpuidle: Tolerate platforms with no deep PSCI idle states arm64/irqflags: __always_inline the arch_local_irq_*() helpers arm64/scs: Fix potential sign extension issue of advance_loc4	2026-05-01 16:32:42 -07:00
Linus Torvalds	bb1d73f2cd	Merge tag 's390-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Reject zero-length writes from userspace that corrupt Debug Facility buffers - Replace one s390 PCI maintainer - Remove SCLP_OFB Kconfig option and enable the guarded code unconditionally - Replace incorrect use of phys_to_folio() to virt_to_folio() in do_secure_storage_access() * tag 's390-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: Fix phys_to_folio() usage in do_secure_storage_access() s390/sclp: Remove SCLP_OFB Kconfig option MAINTAINERS: Replace one of the maintainers for s390/pci s390/debug: Reject zero-length input in debug_input_flush_fn() s390/debug: Reject zero-length input before trimming a newline	2026-05-01 12:58:02 -07:00
Helge Deller	41ca998fbe	parisc: Fix 64-bit kernel build when CONFIG_COMPAT=n VDSO32_SYMBOL() is used in signal.c; defining its value as zero avoids linker issues when CONFIG_COMPAT=n. Signed-off-by: Helge Deller <deller@gmx.de>	2026-05-01 19:09:30 +02:00
Kevin Brodsky	030e8a40ff	arm64: signal: Preserve POR_EL0 if poe_context is missing Commit `2e8a1acea8` ("arm64: signal: Improve POR_EL0 handling to avoid uaccess failures") delayed the write to POR_EL0 in rt_sigreturn to avoid spurious uaccess failures. This change however relies on the poe_context frame record being present: on a system supporting POE, calling sigreturn without a poe_context record now results in writing arbitrary data from the kernel stack into POR_EL0. Fix this by adding a __valid_fields member to struct user_access_state, and zeroing the struct on allocation. restore_poe_context() then indicates that the por_el0 field is valid by setting the corresponding bit in __valid_fields, and restore_user_access_state() only touches POR_EL0 if there is a valid value to set it to. This is in line with how POR_EL0 was originally handled; all frame records are currently optional, except fpsimd_context. To ensure that __valid_fields is kept in sync, fields (currently just por_el0) are now accessed via accessors and prefixed with __ to discourage direct access. Fixes: `2e8a1acea8` ("arm64: signal: Improve POR_EL0 handling to avoid uaccess failures") Cc: <stable@vger.kernel.org> Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-05-01 17:44:25 +01:00
Zhaoyang Huang	4d8e74ad45	arm64: Reserve an extra page for early kernel mapping The final part of [data, end) segment may overflow into the next page of init_pg_end[1] which is the gap page before early_init_stack[2]: [1] crash_arm64_v9.0.1> vtop ffffffed00601000 VIRTUAL PHYSICAL ffffffed00601000 83401000 PAGE DIRECTORY: ffffffecffd62000 PGD: ffffffecffd62da0 => 10000000833fb003 PMD: ffffff80033fb018 => 10000000833fe003 PTE: ffffff80033fe008 => 68000083401f03 PAGE: 83401000 PTE PHYSICAL FLAGS 68000083401f03 83401000 (VALID|SHARED|AF|NG|PXN|UXN) PAGE PHYSICAL MAPPING INDEX CNT FLAGS fffffffec00d0040 83401000 0 0 1 4000 reserved [2] ffffffed002c8000 (r) __pi__data ffffffed0054e000 (d) __pi___bss_start ffffffed005f5000 (b) __pi_init_pg_dir ffffffed005fe000 (b) __pi_init_pg_end ffffffed005ff000 (B) early_init_stack ffffffed00608000 (b) __pi__end For 4K pages, the early kernel mapping may use 2MB block entries but the kernel segments are only 64KB aligned. Segment boundaries that fall within a 2MB block therefore require a PTE table so that different attributes can be applied on either side of the boundary. KERNEL_SEGMENT_COUNT still correctly counts the five permanent kernel VMAs registered by declare_kernel_vmas(). However, since commit `5973a62efa` ("arm64: map [_text, _stext) virtual address range non-executable+read-only"), the early mapper also maps [_text, _stext) separately from [_stext, _etext). This adds one more early-only split and can require one more page-table page than the existing EARLY_SEGMENT_EXTRA_PAGES allowance reserves. Increase the 4K-page early mapping allowance by one page to cover that additional split. Fixes: `5973a62efa` ("arm64: map [_text, _stext) virtual address range non-executable+read-only") Assisted-by: TRAE:GLM-5.1 Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> [catalin.marinas@arm.com: rewrote part of the commit log] [catalin.marinas@arm.com: expanded the code comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-05-01 16:20:35 +01:00
Helge Deller	e6a650acbd	parisc: Fix build failure for 32-bit kernel with PA2.0 instruction set The CONFIG_PA11 option can not be used as a reliable check if we build a 32-bit kernel which needs the 32-bit VDSO. Instead depend on CONFIG_64BIT and CONFIG_COMPAT only. Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by: Helge Deller <deller@gmx.de>	2026-04-30 09:10:07 +02:00
Johan Hovold	b8425ceefe	parisc: drivers: switch to dynamic root device Driver core expects devices to be dynamically allocated and will, for example, complain loudly if a device that lacks a release function is ever freed. Use root_device_register() to allocate and register the root device instead of open coding using a static device. While at it, drop the redundant additional reference taken at init. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>	2026-04-28 17:18:52 +02:00
Heiko Carstens	b95e0e7928	s390/mm: Fix phys_to_folio() usage in do_secure_storage_access() In case of a Secure-Storage-Access exception the effective aka virtual address which caused the exception is contained within the TEID. do_secure_storage_access() incorrectly uses phys_to_folio() instead of virt_to_folio() to translate the virtual address to the corresponding folio. Fix this by using virt_to_folio() instead of phys_to_folio(). Fixes: `084ea4d611` ("s390/mm: add (non)secure page access exceptions handlers") Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2026-04-28 14:45:03 +02:00
Vasily Gorbik	e14622a758	s390/debug: Reject zero-length input in debug_input_flush_fn() debug_input_flush_fn() always copies one byte from the userspace buffer with copy_from_user() regardless of the supplied write length. A zero-length write therefore reads one byte beyond the caller's buffer. If the stale byte happens to be '-' or a digit the debug log is silently flushed. With an unmapped buffer the call returns -EFAULT. Reject zero-length writes before copying from userspace. Cc: stable@vger.kernel.org # v5.10+ Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2026-04-28 14:45:02 +02:00
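The guard described in the commit above can be sketched in userspace C. This is a simplified model, not the kernel handler: the function name and return convention are hypothetical, the choice of -EINVAL is an assumption, and a plain pointer dereference stands in for copy_from_user(). It shows the length check happening before the first byte is read.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a write handler that inspects the first input
 * byte. Returns 1 if the byte requests a flush ('-' or a digit), 0 if it
 * does not, and -EINVAL for an empty write (error code is an assumption).
 */
static int flush_request_sketch(const char *buf, size_t len)
{
    if (len == 0)
        return -EINVAL; /* nothing was written: do not read buf[0] at all */

    char c = buf[0];    /* safe: at least one byte is present */
    return c == '-' || (c >= '0' && c <= '9');
}
```

With the check in place, a zero-length write can no longer cause a flush based on whatever byte happens to follow the caller's buffer.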
Pengpeng Hou	c366a7b5ed	s390/debug: Reject zero-length input before trimming a newline debug_get_user_string() duplicates the userspace buffer with memdup_user_nul() and then unconditionally looks at buffer[user_len - 1] to strip a trailing newline. A zero-length write reaches this helper unchanged, so the newline trim reads before the start of the allocated buffer. Reject empty writes before accessing the last input byte. Fixes: `66a464dbc8` ("[PATCH] s390: debug feature changes") Cc: stable@vger.kernel.org Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20260417073530.96002-1-pengpeng@iscas.ac.cn Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2026-04-28 14:45:02 +02:00
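The newline trim described in the commit above underflows because `user_len - 1` wraps around when `user_len` is zero. A minimal userspace sketch of the guarded trim follows. The helper name and its return convention are hypothetical, and the buffer is assumed to be NUL-terminated with room for `len + 1` bytes, as memdup_user_nul() would provide.

```c
#include <stddef.h>

/*
 * Hypothetical helper: strips one trailing '\n' from buf[0..len).
 * Returns the new length, or -1 for empty input, where buf[len - 1]
 * would index before the start of the buffer.
 */
static long trim_trailing_newline(char *buf, size_t len)
{
    if (len == 0)
        return -1;

    if (buf[len - 1] == '\n')
        buf[--len] = '\0'; /* overwrite the newline with a terminator */

    return (long)len;
}
```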
Breno Leitao	caecde119e	arm64/irqflags: __always_inline the arch_local_irq_*() helpers The arch_local_irq_*() wrappers in <asm/irqflags.h> dispatch between two underlying primitives: the __daif_* path on most systems, and the __pmr_* path on builds that use GIC PMR-based masking (Pseudo-NMI). The leaf primitives are already __always_inline, but the wrappers themselves are plain "static inline". That is unsafe for noinstr callers: nothing prevents the compiler from emitting an out-of-line copy of e.g. arch_local_irq_disable(), and an out-of-line copy can be instrumented (ftrace, kcov, sanitizers), which breaks the noinstr contract on the entry/idle paths that rely on these helpers. x86 hit and fixed exactly this class of bug in commit `7a745be1cc` ("x86/entry: __always_inline irqflags for noinstr"). Force-inline all of the arch_local_irq_*() wrappers so they cannot be emitted out-of-line: - arch_local_irq_enable() - arch_local_irq_disable() - arch_local_save_flags() - arch_irqs_disabled_flags() - arch_irqs_disabled() - arch_local_irq_save() - arch_local_irq_restore() The primary motivation is noinstr safety. There is a useful side effect for fleet-wide profiling: when the wrapper is emitted out-of-line, samples taken inside it during the post-WFI IRQ unmask in default_idle_call() are attributed to arch_local_irq_enable rather than default_idle_call(), and the FP-unwinder loses default_idle_call() from the chain. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Leonardo Bras <leo.bras@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-27 13:13:36 +01:00
Wentao Guan	4023b7424e	arm64/scs: Fix potential sign extension issue of advance_loc4 The expressions (*opcode++ << 24) and exp * code_alignment_factor may overflow signed int and become negative. Fix this by casting each byte to u64 before shifting. Also fix the misaligned break statement while we are here. Example of the result can be seen here: Link: https://godbolt.org/z/zhY8d3595 It may not be a real problem, but it could become an issue in the future. Fixes: `d499e9627d` ("arm64/scs: Fix handling of advance_loc4") Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-27 12:16:26 +01:00
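The effect of widening before shifting can be illustrated with a small userspace sketch. Both function names are hypothetical and a little-endian byte order is assumed for the four operand bytes. The second function does not reproduce the original expression, since shifting a promoted int into its sign bit is undefined in C. It only models the sign-extended result that the commit is guarding against, using a conversion through `int32_t` that relies on two's-complement behaviour.

```c
#include <stdint.h>

/* Fixed form: cast each byte to 64 bits before shifting, so bit 31 is just data. */
static uint64_t read_u32_le_widened(const uint8_t *p)
{
    return (uint64_t)p[0] |
           ((uint64_t)p[1] << 8) |
           ((uint64_t)p[2] << 16) |
           ((uint64_t)p[3] << 24);
}

/*
 * Model of the problem: the same bytes treated as a signed 32-bit value
 * and then widened, which copies bit 31 into the upper 32 bits.
 */
static uint64_t read_u32_le_sign_extended(const uint8_t *p)
{
    uint32_t raw = (uint32_t)p[0] |
                   ((uint32_t)p[1] << 8) |
                   ((uint32_t)p[2] << 16) |
                   ((uint32_t)p[3] << 24);

    return (uint64_t)(int64_t)(int32_t)raw;
}
```

The two results differ only when the top byte is 0x80 or larger, which is consistent with the commit's remark that the problem may not occur in practice.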
Paolo Bonzini	909eac682c	Merge tag 'kvmarm-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 7.1, take #1 - Allow tracing for non-pKVM, which was accidentally disabled when the series was merged - Rationalise the way the pKVM hypercall ranges are defined by using the same mechanism as already used for the vcpu_sysreg enum - Enforce that SMCCC function numbers relayed by the pKVM proxy are actually compliant with the specification - Fix a couple of feature to idreg mappings which resulted in the wrong sanitisation being applied - Fix the GICD_IIDR revision number field that could never be written correctly by userspace - Make kvm_vcpu_initialized() correctly use its parameter instead of relying on the surrounding context - Enforce correct ordering in __pkvm_init_vcpu(), plugging a potential pin leak at the same time - Move __pkvm_init_finalise() to a less dangerous spot, avoiding future problems - Restore functional userspace irqchip support after a four year breakage (last functional kernel was 5.18...). This is obviously ripe for garbage collection. - ... and the usual lot of spelling fixes	2026-04-27 04:24:34 -04:00
Linus Torvalds	129d6eb266	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux Pull ARM updates from Russell King: - fix a race condition handling PG_dcache_clean - further cleanups for the fault handling, allowing RT to be enabled - fixing nzones validation in adfs filesystem driver - fix for module unwinding * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9463/1: Allow to enable RT ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache() ARM: 9471/1: module: fix unwind section relocation out of range error fs/adfs: validate nzones in adfs_validate_bblk() ARM: provide individual is_translation_fault() and is_permission_fault() ARM: move FSR fault status definitions before fsr_fs() ARM: use BIT() and GENMASK() for fault status register fields ARM: move is_permission_fault() and is_translation_fault() to fault.h ARM: move vmalloc() lazy-page table population ARM: ensure interrupts are enabled in __do_user_fault()	2026-04-25 07:44:26 -07:00
Linus Torvalds	8f4e8687c8	Merge tag 'x86-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Prevent deadlock during shstk sigreturn (Rick Edgecombe) - Disable FRED when PTI is forced on (Dave Hansen) - Revert a CPA INVLPGB optimization that did not properly handle discontiguous virtual addresses (Dave Hansen) * tag 'x86-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Revert INVLPGB optimization for set_memory code x86/cpu: Disable FRED when PTI is forced on x86/shstk: Prevent deadlock during shstk sigreturn	2026-04-24 10:05:42 -07:00
Linus Torvalds	feff82eb5f	Merge tag 'riscv-for-linus-7.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: "There is one significant change outside arch/riscv in this pull request: the addition of a set of KUnit tests for strlen(), strnlen(), and strrchr(). Otherwise, the most notable changes are to add some RISC-V-specific string function implementations, to remove XIP kernel support, to add hardware error exception handling, and to optimize our runtime unaligned access speed testing. A few comments on the motivation for removing XIP support. It's been broken in the RISC-V kernel for months. The code is not easy to maintain. Furthermore, for XIP support to truly be useful for RISC-V, we think that compile-time feature switches would need to be added for many of the RISC-V ISA features and microarchitectural properties that are currently implemented with runtime patching. No one has stepped forward to take responsibility for that work, so many of us think it's best to remove it until clear use cases and champions emerge. Summary: - Add Kunit correctness testing and microbenchmarks for strlen(), strnlen(), and strrchr() - Add RISC-V-specific strnlen(), strchr(), strrchr() implementations - Add hardware error exception handling - Clean up and optimize our unaligned access probe code - Enable HAVE_IOREMAP_PROT to be able to use generic_access_phys() - Remove XIP kernel support - Warn when addresses outside the vmemmap range are passed to vmemmap_populate() - Update the ACPI FADT revision check to warn if it's not at least ACPI v6.6, which is when key RISC-V-specific tables were added to the specification - Increase COMMAND_LINE_SIZE to 2048 to match ARM64, x86, PowerPC, etc. 
- Make kaslr_offset() a static inline function, since there's no need for it to show up in the symbol table - Add KASLR offset and SATP to the VMCOREINFO ELF notes to improve kdump support - Add Makefile cleanup rule for vdso_cfi copied source files, and add a .gitignore for the build artifacts in that directory - Remove some redundant ifdefs that check Kconfig macros - Add missing SPDX license tag to the CFI selftest - Simplify UTS_MACHINE assignment in the RISC-V Makefile - Clarify some unclear comments and remove some superfluous comments - Fix various English typos across the RISC-V codebase" * tag 'riscv-for-linus-7.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits) riscv: Remove support for XIP kernel riscv: Reuse compare_unaligned_access() in check_vector_unaligned_access() riscv: Split out compare_unaligned_access() riscv: Reuse measure_cycles() in check_vector_unaligned_access() riscv: Split out measure_cycles() for reuse riscv: Clean up & optimize unaligned scalar access probe riscv: lib: add strrchr() implementation riscv: lib: add strchr() implementation riscv: lib: add strnlen() implementation lib/string_kunit: extend benchmarks to strnlen() and chr searches lib/string_kunit: add performance benchmark for strlen() lib/string_kunit: add correctness test for strrchr() lib/string_kunit: add correctness test for strnlen() lib/string_kunit: add correctness test for strlen() riscv: vdso_cfi: Add .gitignore for build artifacts riscv: vdso_cfi: Add clean rule for copied sources riscv: enable HAVE_IOREMAP_PROT riscv: mm: WARN_ON() for bad addresses in vmemmap_populate() riscv: acpi: update FADT revision check to 6.6 riscv: add hardware error trap handler support ...	2026-04-24 10:00:37 -07:00
Linus Torvalds	ff57d59200	Merge tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Adjust build infrastructure for 32BIT/64BIT - Add HIGHMEM (PKMAP and FIX_KMAP) support - Show and handle CPU vulnerabilites correctly - Batch the icache maintenance for jump_label - Add more atomic instructions support for BPF JIT - Add more features (e.g. fsession) support for BPF trampoline - Some bug fixes and other small changes * tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (21 commits) selftests/bpf: Enable CAN_USE_LOAD_ACQ_STORE_REL for LoongArch LoongArch: BPF: Add fsession support for trampolines LoongArch: BPF: Introduce emit_store_stack_imm64() helper LoongArch: BPF: Support up to 12 function arguments for trampoline LoongArch: BPF: Support small struct arguments for trampoline LoongArch: BPF: Open code and remove invoke_bpf_mod_ret() LoongArch: BPF: Support load-acquire and store-release instructions LoongArch: BPF: Support 8 and 16 bit read-modify-write instructions LoongArch: BPF: Add the default case in emit_atomic() and rename it LoongArch: Define instruction formats for AM{SWAP/ADD}.{B/H} and DBAR LoongArch: Batch the icache maintenance for jump_label LoongArch: Add flush_icache_all()/local_flush_icache_all() LoongArch: Add spectre boundry for syscall dispatch table LoongArch: Show CPU vulnerabilites correctly LoongArch: Make arch_irq_work_has_interrupt() true only if IPI HW exist LoongArch: Use get_random_canary() for stack canary init LoongArch: Improve the logging of disabling KASLR LoongArch: Align FPU register state to 32 bytes LoongArch: Handle CONFIG_32BIT in syscall_get_arch() LoongArch: Add HIGHMEM (PKMAP and FIX_KMAP) support ...	2026-04-24 09:54:45 -07:00
Linus Torvalds	64edfa6506	Merge tag 'net-deletions' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking deletions from Jakub Kicinski: "Delete some obsolete networking code Old code like amateur radio and NFC have long been a burden to core networking developers. syzbot loves to find bugs in BKL-era code, and noobs try to fix them. If we want to have a fighting chance of surviving the LLM-pocalypse this code needs to find a dedicated owner or get deleted. We've talked about these deletions multiple times in the past and every time someone wanted the code to stay. It is never very clear to me how many of those people actually use the code vs are just nostalgic to see it go. Amateur radio did have occasional users (or so I think) but most users switched to user space implementations since it's all super slow stuff. Nobody stepped up to maintain the kernel code. We were lucky enough to find someone who wants to help with NFC so we're giving that a chance. Let's try to put the rest of this code behind us" * tag 'net-deletions' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: drivers: net: 8390: wd80x3: Remove this driver drivers: net: 8390: ultra: Remove this driver drivers: net: 8390: AX88190: Remove this driver drivers: net: fujitsu: fmvj18x: Remove this driver drivers: net: smsc: smc91c92: Remove this driver drivers: net: smsc: smc9194: Remove this driver drivers: net: amd: nmclan: Remove this driver drivers: net: amd: lance: Remove this driver drivers: net: 3com: 3c589: Remove this driver drivers: net: 3com: 3c574: Remove this driver drivers: net: 3com: 3c515: Remove this driver drivers: net: 3com: 3c509: Remove this driver net: packetengines: remove obsolete yellowfin driver and vendor dir net: packetengines: remove obsolete hamachi driver net: remove unused ATM protocols and legacy ATM device drivers net: remove ax25 and amateur radio (hamradio) subsystem net: remove ISDN subsystem and Bluetooth CMTP caif: remove CAIF NETWORK LAYER	2026-04-24 09:41:58 -07:00
Sebastian Andrzej Siewior	c6e61c06d6	ARM: 9463/1: Allow to enable RT All known issues have been addressed. Allow RT to be selected. Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	2026-04-24 15:14:59 +01:00
Russell King (Oracle)	6f685f12fd	Merge branches 'adfs', 'arm-fault-handling', 'fixes' and 'misc'	2026-04-24 15:14:44 +01:00
Brian Ruley	75f9a484e8	ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache() This bug was already discovered and fixed for arm64 in commit `588a513d34` ("arm64: Fix race condition on PG_dcache_clean in __sync_icache_dcache()"). Verified with added instrumentation to track dcache flushes in a ring buffer, as shown by the (distilled) output: kernel: SIGILL at b6b80ac0 cpu 1 pid 32663 linux_pte=8eff659f hw_pte=8eff6e7e young=1 exec=1 kernel: dcache flush START cpu0 pfn=8eff6 ts=48629557020154 kernel: dcache flush SKIPPED cpu1 pfn=8eff6 ts=48629557020154 kernel: dcache flush FINISH cpu0 pfn=8eff6 ts=48629557036154 audisp-syslog: comm="journalctl" exe="/usr/bin/journalctl" sig=4 [...] Discussions in the mailing list mentioned that arch/arm is also affected but the fix was never applied to it [1][2]. Apply the change now, since the race condition can cause sporadic SIGILL's and SEGV's especially while under high memory pressure. Link: https://lore.kernel.org/all/adzMOdySgMIePcue@willie-the-truck [1] Link: https://lore.kernel.org/all/20210514095001.13236-1-catalin.marinas@arm.com [2] Signed-off-by: Brian Ruley <brian.ruley@gehealthcare.com> Reviewed-by: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Fixes: `6012191aa9` ("ARM: 6380/1: Introduce __sync_icache_dcache() for VIPT caches") Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	2026-04-24 15:12:52 +01:00
Dave Hansen	a39a701482	x86/mm: Revert INVLPGB optimization for set_memory code tl;dr: Revert an INVLPGB optimization that did not properly handle discontiguous virtual addresses. Full story: I got a report from some graphics (i915) folks that bisected a regression in their test suite to `86e6815b31` ("x86/mm: Change cpa_flush() to call flush_kernel_range() directly"). There was a bit of flip-flopping on the exact bisect, but the code here does seem wrong to me. The i915 folks were calling set_pages_array_wc(), so using the CPA_PAGES_ARRAY mode. Basically, the 'struct cpa_data' can wrap up all kinds of page table changes. Some of these are virtually contiguous, but some are very much not which is one reason why there are ->vaddr and ->pages arrays. `86e6815b31` made the mistake of assuming that the virtual addresses in the cpa_data are always contiguous. It got things right when neither CPA_ARRAY/CPA_PAGES_ARRAY is used, but theoretically wrong when either of those is used. In the i915 case, it probably failed to flush some WB TLB entries and install WC ones, leaving some data in the caches and not flushing it out to where the device could see it. That eventually caused graphics problems. Revert the INVLPGB optimization. It can be reintroduced later, but it will need to be a bit careful about the array modes. Fixes: `86e6815b31` ("x86/mm: Change cpa_flush() to call flush_kernel_range()") Reported-by: Cui, Ling <ling.cui@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20260421151909.6B3281C6@davehans-spike.ostc.intel.com	2026-04-24 15:42:48 +02:00
Marc Zyngier	4ce98bf086	KVM: arm64: Wake-up from WFI when irqchip is in userspace It appears that there is nothing in the wake-up path that evaluates whether the in-kernel interrupts are pending unless we have a vgic. This means that the userspace irqchip support has been broken for about four years, and nobody noticed. It was also broken before as we wouldn't wake up on a PMU interrupt, but hey, who cares... It is probably time to remove the feature altogether, because it was a terrible idea 10 years ago, and it still is. Fixes: `b57de4ffd7` ("KVM: arm64: Simplify kvm_cpu_has_pending_timer()") Link: https://patch.msgid.link/20260423163607.486345-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org	2026-04-24 12:03:57 +01:00
Quentin Perret	5bb0aed57b	KVM: arm64: Fix initialisation order in __pkvm_init_finalise() fix_host_ownership() walks the hypervisor's stage-1 page-table to adjust the host's stage-2 accordingly. Any such adjustment that requires cache maintenance operations depends on the per-CPU hyp fixmap being present. However, fix_host_ownership() is currently called before fix_hyp_pgtable_refcnt() and hyp_create_fixmap(), so the fixmap does not yet exist when it runs. This is benign today because the host stage-2 starts empty and no CMOs are needed, but it becomes a latent crash as soon as fix_host_ownership() is extended to operate on a non-empty page-table. Reorder the calls so that fix_hyp_pgtable_refcnt() and hyp_create_fixmap() complete before fix_host_ownership() is invoked. Fixes: `0d16d12eb2` ("KVM: arm64: Fix-up hyp stage-1 refcounts for all pages mapped at EL2") Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260424084908.370776-7-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org	2026-04-24 12:03:57 +01:00


244117 Commits