linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 20:45:27 -04:00

Author	SHA1	Message	Date
Thorsten Blum	5228ed2c62	riscv: KGDB: Replace deprecated strcpy in kgdb_arch_handle_qxfer_pkt strcpy() is deprecated because it can cause a buffer overflow when the sizes of the source and the destination are not known at compile time. Use strscpy() instead. Link: https://github.com/KSPP/linux/issues/88 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20251011004750.461954-1-thorsten.blum@linux.dev Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-27 23:30:01 -06:00
Ben Dooks	44aa25c000	riscv: asm: use .insn for making custom instructions The assembler has .insn for building custom instructions now, so change the .4byte to .insn. This ensures the output is marked as an instruction and not as data which may confuse both debuggers and anything else that relies on this sort of marking. Add an ASM_INSN_I() wrapper in asm.h to allow the selecting of how this is output so older assemblers are still good. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Link: https://lore.kernel.org/r/20251024171640.65232-1-ben.dooks@codethink.co.uk Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-27 18:58:37 -06:00
Vivian Wang	5fada16057	riscv: tests: Make RISCV_KPROBES_KUNIT tristate This disallows KUNIT=m and RISCV_KPROBES_KUNIT=y, which produces these relocs_check.sh warnings when RELOCATABLE=y: WARNING: 3 bad relocations ffffffff81e24118 R_RISCV_64 kunit_unary_assert_format ffffffff81e24a60 R_RISCV_64 kunit_binary_assert_format ffffffff81e269d0 R_RISCV_JUMP_SLOT __kunit_do_failed_assertion This fixes allmodconfig build. Reported-by: Inochi Amaoto <inochiama@gmail.com> Fixes: `f2fab61282` ("riscv: Add kprobes KUnit test") Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Tested-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20251020-riscv-kunit-kconfig-fix-6-18-v1-2-d773b5d5ce48@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-27 18:49:34 -06:00
Vivian Wang	2176603255	riscv: tests: Rename kprobes_test_riscv to kprobes_riscv According to Documentation/dev-tools/kunit/style.rst a KUnit test suite normally should not have "test" in the name. Rename it to follow the style guide. Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Tested-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20251020-riscv-kunit-kconfig-fix-6-18-v1-1-d773b5d5ce48@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-27 18:48:14 -06:00
Miaoqian Lin	c42458fcf5	riscv: Fix memory leak in module_frob_arch_sections() The current code directly overwrites the scratch pointer with the return value of kvrealloc(). If kvrealloc() fails and returns NULL, the original buffer becomes unreachable, causing a memory leak. Fix this by using a temporary variable to store kvrealloc()'s return value and only update the scratch pointer on success. Found via static anlaysis and this is similar to commit `42378a9ca5` ("bpf, verifier: Fix memory leak in array reallocation for stack state") Fixes: `be17c0df67` ("riscv: module: Optimize PLT/GOT entry counting") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20251026091912.39727-1-linmq006@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-27 11:40:44 -06:00
Josephine Pfeiffer	a74f038fa5	riscv: ptdump: use seq_puts() in pt_dump_seq_puts() macro The pt_dump_seq_puts() macro incorrectly uses seq_printf() instead of seq_puts(). This is both a performance issue and conceptually wrong, as the macro name suggests plain string output (puts) but the implementation uses formatted output (printf). The macro is used in ptdump.c:301 to output a newline character. Using seq_printf() adds unnecessary overhead for format string parsing when outputting this constant string. This bug was introduced in commit `59c4da8640` ("riscv: Add support to dump the kernel page tables") in 2020, which copied the implementation pattern from other architectures that had the same bug. Fixes: `59c4da8640` ("riscv: Add support to dump the kernel page tables") Signed-off-by: Josephine Pfeiffer <hi@josie.lol> Link: https://lore.kernel.org/r/20251018170451.3355496-1-hi@josie.lol Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-27 11:40:43 -06:00
Chunyan Zhang	060ea84a48	riscv: stacktrace: Disable KASAN checks for non-current tasks Unwinding the stack of a task other than current, KASAN would report "BUG: KASAN: out-of-bounds in walk_stackframe+0x41c/0x460" There is a same issue on x86 and has been resolved by the commit `84936118bd` ("x86/unwind: Disable KASAN checks for non-current tasks") The solution could be applied to RISC-V too. This patch also can solve the issue: https://seclists.org/oss-sec/2025/q4/23 Fixes: `5d8544e2d0` ("RISC-V: Generic library routines and assembly") Co-developed-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> Link: https://lore.kernel.org/r/20251022072608.743484-1-zhangchunyan@iscas.ac.cn [pjw@kernel.org: clean up checkpatch issues] Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-27 11:40:43 -06:00
Borislav Petkov (AMD)	8a9fb5129e	x86/microcode/AMD: Limit Entrysign signature checking to known generations Limit Entrysign sha256 signature checking to CPUs in the range Zen1-Zen5. X86_BUG cannot be used here because the loading on the BSP happens way too early, before the cpufeatures machinery has been set up. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/all/20251023124629.5385-1-bp@kernel.org	2025-10-27 17:07:17 +01:00
Linus Torvalds	dbfc6422a3	Merge tag 'x86_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Remove dead code leftovers after a recent mitigations cleanup which fail a Clang build - Make sure a Retbleed mitigation message is printed only when necessary - Correct the last Zen1 microcode revision for which Entrysign sha256 check is needed - Fix a NULL ptr deref when mounting the resctrl fs on a system which supports assignable counters but where L3 total and local bandwidth monitoring has been disabled at boot * tag 'x86_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bugs: Remove dead code which might prevent from building x86/bugs: Qualify RETBLEED_INTEL_MSG x86/microcode: Fix Entrysign revision check for Zen1/Naples x86,fs/resctrl: Fix NULL pointer dereference with events force-disabled in mbm_event mode	2025-10-26 09:57:18 -07:00
Linus Torvalds	9bb956508c	Merge tag 'riscv-for-linus-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - Close a race during boot between userspace vDSO usage and some late-initialized vDSO data - Improve performance on systems with non-CPU-cache-coherent DMA-capable peripherals by enabling write combining on pgprot_dmacoherent() allocations - Add human-readable detail for RISC-V IPI tracing - Provide more information to zsmalloc on 64-bit RISC-V to improve allocation - Silence useless boot messages about CPUs that have been disabled in DT - Resolve some compiler and smatch warnings and remove a redundant macro * tag 'riscv-for-linus-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: hwprobe: avoid uninitialized variable use in hwprobe_arch_id() riscv: cpufeature: avoid uninitialized variable in has_thead_homogeneous_vlenb() riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot riscv: add a forward declaration for cpuinfo_op RISC-V: Don't print details of CPUs disabled in DT riscv: Remove the PER_CPU_OFFSET_SHIFT macro riscv: mm: Define MAX_POSSIBLE_PHYSMEM_BITS for zsmalloc riscv: Register IPI IRQs with unique names ACPI: RIMT: Fix unused function warnings when CONFIG_IOMMU_API is disabled RISC-V: Define pgprot_dmacoherent() for non-coherent devices	2025-10-25 09:35:26 -07:00
Linus Torvalds	31009296f8	Merge tag 'pci-v6.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Add DWC custom pci_ops for the root bus instead of overwriting the DBI base address, which broke drivers that rely on the DBI address for iATU programming; fixes an FU740 probe regression (Krishna Chaitanya Chundru) - Revert qcom ECAM enablement, which is rendered unnecessary by the DWC custom pci_ops (Krishna Chaitanya Chundru) - Fix longstanding MIPS Malta resource registration issues to avoid exposing them when the next commit fixes the boot failure (Maciej W. Rozycki) - Use pcibios_align_resource() on MIPS Malta to fix boot failure caused by using the generic pci_enable_resources() (Ilpo Järvinen) - Enable only ASPM L0s and L1, not L1 PM Substates, for devicetree platforms because we lack information required to configure L1 Substates; fixes regressions on powerpc and rockchip. A qcom regression (L1 Substates no longer enabled) remains and will be addressed next (Bjorn Helgaas) * tag 'pci-v6.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/ASPM: Enable only L0s and L1 for devicetree platforms MIPS: Malta: Use pcibios_align_resource() to block io range MIPS: Malta: Fix PCI southbridge legacy resource reservations MIPS: Malta: Fix keyboard resource preventing i8042 driver from registering Revert "PCI: qcom: Prepare for the DWC ECAM enablement" PCI: dwc: Use custom pci_ops for root bus DBI vs ECAM config access	2025-10-24 16:43:08 -07:00
Linus Torvalds	9b9b6e71ee	Merge tag 'soc-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "The main change this time is an update to the MAINTAINERS file, listing Krzysztof Kozlowski, Alexandre Belloni, and Linus Walleij as additional maintainers for the SoC tree, in order to go back to a group maintainership. Drew Fustini joins as an additional reviewer for the SoC tree. Thanks to all of you for volunteering to help out. On the actual bugfixes, we have a few correctness changes for firmware drivers (qtee, arm-ffa, scmi) and two devicetree fixes for Raspberry Pi" * tag 'soc-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: soc: officially expand maintainership team firmware: arm_scmi: Fix premature SCMI_XFER_FLAG_IS_RAW clearing in raw mode firmware: arm_scmi: Skip RAW initialization on failure include: trace: Fix inflight count helper on failed initialization firmware: arm_scmi: Account for failed debug initialization ARM: dts: broadcom: rpi: Switch to V3D firmware clock arm64: dts: broadcom: bcm2712: Define VGIC interrupt firmware: arm_ffa: Add support for IMPDEF value in the memory access descriptor tee: QCOMTEE should depend on ARCH_QCOM tee: qcom: return -EFAULT instead of -EINVAL if copy_from_user() fails tee: qcom: prevent potential off by one read	2025-10-24 11:15:17 -07:00
Andy Shevchenko	84dfce65a7	x86/bugs: Remove dead code which might prevent from building Clang, in particular, is not happy about dead code: arch/x86/kernel/cpu/bugs.c:1830:20: error: unused function 'match_option' [-Werror,-Wunused-function] 1830 \| static inline bool match_option(const char arg, int arglen, const char opt) \| ^~~~~~~~~~~~ 1 error generated. Remove a leftover from the previous cleanup. Fixes: `02ac6cc8c5` ("x86/bugs: Simplify SSB cmdline parsing") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://patch.msgid.link/20251024125959.1526277-1-andriy.shevchenko%40linux.intel.com	2025-10-24 09:42:00 -07:00
Fangyu Yu	8c5fa3764f	RISC-V: KVM: Remove automatic I/O mapping for VM_PFNMAP As of commit `aac6db75a9` ("vfio/pci: Use unmap_mapping_range()"), vm_pgoff may no longer guaranteed to hold the PFN for VM_PFNMAP regions. Using vma->vm_pgoff to derive the HPA here may therefore produce incorrect mappings. Instead, I/O mappings for such regions can be established on-demand during g-stage page faults, making the upfront ioremap in this path is unnecessary. Fixes: `aac6db75a9` ("vfio/pci: Use unmap_mapping_range()") Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com> Tested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251021142131.78796-1-fangyu.yu@linux.alibaba.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-10-24 21:24:36 +05:30
Farhan Ali	b45873c3f0	s390/pci: Restore IRQ unconditionally for the zPCI device Commit `c1e18c17bd` ("s390/pci: add zpci_set_irq()/zpci_clear_irq()"), introduced the zpci_set_irq() and zpci_clear_irq(), to be used while resetting a zPCI device. Commit `da995d538d` ("s390/pci: implement reset_slot for hotplug slot"), mentions zpci_clear_irq() being called in the path for zpci_hot_reset_device(). But that is not the case anymore and these functions are not called outside of this file. Instead zpci_hot_reset_device() relies on zpci_disable_device() also clearing the IRQs, but misses to reset the zdev->irqs_registered flag. However after a CLP disable/enable reset, the device's IRQ are unregistered, but the flag zdev->irq_registered does not get cleared. It creates an inconsistent state and so arch_restore_msi_irqs() doesn't correctly restore the device's IRQ. This becomes a problem when a PCI driver tries to restore the state of the device through pci_restore_state(). Restore IRQ unconditionally for the device and remove the irq_registered flag as its redundant. Fixes: `c1e18c17bd` ("s390/pci: add zpci_set_irq()/zpci_clear_irq()") Cc: stable@vger.kernnel.org Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2025-10-24 15:25:43 +02:00
Heiko Carstens	840bc67cf0	s390: Update defconfigs Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2025-10-24 15:25:43 +02:00
Arnd Bergmann	e58794dbf7	Merge tag 'arm-soc/for-6.18/devicetree-arm64-fixes' of https://github.com/Broadcom/stblinux into arm/fixes This pull request contains Broadcom ARM64-based SoCs Device Tree fixes for 6.18, please pull the following: - Peter describes the VGIC interrupt line such that KVM can be used on Raspberry Pi 5 systems. * tag 'arm-soc/for-6.18/devicetree-arm64-fixes' of https://github.com/Broadcom/stblinux: arm64: dts: broadcom: bcm2712: Define VGIC interrupt Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2025-10-23 22:30:41 +02:00
Linus Torvalds	266ee584e5	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Do not make a clean PTE dirty in pte_mkwrite() The Arm architecture, for backwards compatibility reasons (ARMv8.0 before in-hardware dirty bit management - DBM), uses the PTE_RDONLY bit to mean !dirty while the PTE_WRITE bit means DBM enabled. The arm64 pte_mkwrite() simply clears the PTE_RDONLY bit and this inadvertently makes the PTE pte_hw_dirty(). Most places making a PTE writable also invoke pte_mkdirty() but do_swap_page() does not and we end up with dirty, freshly swapped in, writeable pages. - Do not warn if the destination page is already MTE-tagged in copy_highpage() In the majority of the cases, a destination page copied into is freshly allocated without the PG_mte_tagged flag set. However, the folio migration may be restarted if __folio_migrate_mapping() failed, triggering the benign WARN_ON_ONCE(). * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mte: Do not warn if the page is already tagged in copy_highpage() arm64, mm: avoid always making PTE dirty in pte_mkwrite()	2025-10-23 09:26:47 -10:00
Catalin Marinas	b98c94eed4	arm64: mte: Do not warn if the page is already tagged in copy_highpage() The arm64 copy_highpage() assumes that the destination page is newly allocated and not MTE-tagged (PG_mte_tagged unset) and warns accordingly. However, following commit `060913999d` ("mm: migrate: support poisoned recover from migrate folio"), folio_mc_copy() is called before __folio_migrate_mapping(). If the latter fails (-EAGAIN), the copy will be done again to the same destination page. Since copy_highpage() already set the PG_mte_tagged flag, this second copy will warn. Replace the WARN_ON_ONCE(page already tagged) in the arm64 copy_highpage() with a comment. Reported-by: syzbot+d1974fc28545a3e6218b@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/68dda1ae.a00a0220.102ee.0065.GAE@google.com Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: stable@vger.kernel.org # 6.12.x Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-10-23 17:34:58 +01:00
Gerd Bayer	0fd20f65df	s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump Do not block PCI config accesses through pci_cfg_access_lock() when executing the s390 variant of PCI error recovery: Acquire just device_lock() instead of pci_dev_lock() as powerpc's EEH and generig PCI AER processing do. During error recovery testing a pair of tasks was reported to be hung: mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working INFO: task kmcheck:72 blocked for more than 122 seconds. Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000 Call Trace: [<000000065256f030>] __schedule+0x2a0/0x590 [<000000065256f356>] schedule+0x36/0xe0 [<000000065256f572>] schedule_preempt_disabled+0x22/0x30 [<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8 [<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core] [<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core] [<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398 [<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0 INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds. Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000 Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core] Call Trace: [<000000065256f030>] __schedule+0x2a0/0x590 [<000000065256f356>] schedule+0x36/0xe0 [<0000000652172e28>] pci_wait_cfg+0x80/0xe8 [<0000000652172f94>] pci_cfg_access_lock+0x74/0x88 [<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core] [<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core] [<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core] [<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168 [<0000000652513212>] devlink_health_report+0x19a/0x230 [<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core] No kernel log of the exact same error with an upstream kernel is available - but the very same deadlock situation can be constructed there, too: - task: kmcheck mlx5_unload_one() tries to acquire devlink lock while the PCI error recovery code has set pdev->block_cfg_access by way of pci_cfg_access_lock() - task: kworker mlx5_crdump_collect() tries to set block_cfg_access through pci_cfg_access_lock() while devlink_health_report() had acquired the devlink lock. A similar deadlock situation can be reproduced by requesting a crdump with > devlink health dump show pci/<BDF> reporter fw_fatal while PCI error recovery is executed on the same <BDF> physical function by mlx5_core's pci_error_handlers. On s390 this can be injected with > zpcictl --reset-fw <BDF> Tests with this patch failed to reproduce that second deadlock situation, the devlink command is rejected with "kernel answers: Permission denied" - and we get a kernel log message of: mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5 because the config read of VSC_SEMAPHORE is rejected by the underlying hardware. Two prior attempts to address this issue have been discussed and ultimately rejected [see link], with the primary argument that s390's implementation of PCI error recovery is imposing restrictions that neither powerpc's EEH nor PCI AER handling need. Tests show that PCI error recovery on s390 is running to completion even without blocking access to PCI config space. Link: https://lore.kernel.org/all/20251007144826.2825134-1-gbayer@linux.ibm.com/ Cc: stable@vger.kernel.org Fixes: `4cdf2f4e24` ("s390/pci: implement minimal PCI error recovery") Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2025-10-23 16:11:16 +02:00
Harald Freudenberger	3ac2939bc4	crypto: s390/phmac - Do not modify the req->nbytes value The phmac implementation used the req->nbytes field on combined operations (finup, digest) to track the state: with req->nbytes > 0 the update needs to be processed, while req->nbytes == 0 means to do the final operation. For this purpose the req->nbytes field was set to 0 after successful update operation. However, aead uses the req->nbytes field after a successful hash operation to determine the amount of data to en/decrypt. So an implementation must not modify the nbytes field. Fixed by a slight rework on the phmac implementation. There is now a new field async_op in the request context which tracks the (asynch) operation to process. So the 'state' via req->nbytes is not needed any more and now this field is untouched and may be evaluated even after a request is processed by the phmac implementation. Fixes: `cbbc675506` ("crypto: s390 - New s390 specific protected key hash phmac") Reported-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Tested-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2025-10-23 12:53:23 +08:00
Linus Torvalds	0f3ad9c610	Merge tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "17 hotfixes. 12 are cc:stable and 14 are for MM. There's a two-patch DAMON series from SeongJae Park which addresses a missed check and possible memory leak. Apart from that it's all singletons - please see the changelogs for details" * tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: csky: abiv2: adapt to new folio flags field mm/damon/core: use damos_commit_quota_goal() for new goal commit mm/damon/core: fix potential memory leak by cleaning ops_filter in damon_destroy_scheme hugetlbfs: move lock assertions after early returns in huge_pmd_unshare() vmw_balloon: indicate success when effectively deflating during migration mm/damon/core: fix list_add_tail() call on damon_call() mm/mremap: correctly account old mapping after MREMAP_DONTUNMAP remap mm: prevent poison consumption when splitting THP ocfs2: clear extent cache after moving/defragmenting extents mm: don't spin in add_stack_record when gfp flags don't allow dma-debug: don't report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC mm/damon/sysfs: dealloc commit test ctx always mm/damon/sysfs: catch commit test ctx alloc failure hung_task: fix warnings caused by unaligned lock pointers	2025-10-22 14:57:35 -10:00
Ilpo Järvinen	f294a5fd34	MIPS: Malta: Use pcibios_align_resource() to block io range According to Maciej W. Rozycki <macro@orcam.me.uk>, the mips_pcibios_init() for malta adjusts root bus IO resource start address to prevent interfering with PIIX4 I/O cycle decoding. Adjusting lower bound leaves PIIX4 IO resources outside of the root bus resource and assign_fixed_resource_on_bus() does not link the resources into the resource tree. Prior to commit `ae81aad5c2` ("MIPS: PCI: Use pci_enable_resources()") the arch specific pcibios_enable_resources() did not check if the resources were assigned which diverges from what PCI core checks, effectively hiding the PIIX4 IO resources were not properly within the resource tree. After starting to use pcibios_enable_resources() from PCI core, enabling PIIX4 fails: ata_piix 0000:00:0a.1: BAR 0 [io 0x01f0-0x01f7]: not claimed; can't enable device ata_piix 0000:00:0a.1: probe with driver ata_piix failed with error -22 MIPS PCI code already has support for enforcing lower bounds using PCIBIOS_MIN_IO in pcibios_align_resource() without altering the IO window start address itself. Make malta PCI code too to use PCIBIOS_MIN_IO. Fixes: `ae81aad5c2` ("MIPS: PCI: Use pci_enable_resources()") Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/linux-pci/9085ab12-1559-4462-9b18-f03dcb9a4088@roeck-us.net/ Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/linux-pci/alpine.DEB.2.21.2510132229120.39634@angie.orcam.me.uk/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Link: https://patch.msgid.link/20251017110903.1973-1-ilpo.jarvinen@linux.intel.com	2025-10-22 11:06:31 -05:00
Maciej W. Rozycki	1d5d166361	MIPS: Malta: Fix PCI southbridge legacy resource reservations Covering the PCI southbridge legacy port I/O range with a northbridge resource reservation prevents MIPS Malta platform code from claiming its standard legacy resources. This is because request_resource() calls cause a clash with the previous reservation and consequently fail. Change to using insert_resource() so as to prevent the clash, switching the legacy reservations from: 00000000-00ffffff : MSC PCI I/O 00000020-00000021 : pic1 00000070-00000077 : rtc0 000000a0-000000a1 : pic2 [...] to: 00000000-00ffffff : MSC PCI I/O 00000000-0000001f : dma1 00000020-00000021 : pic1 00000040-0000005f : timer 00000060-0000006f : keyboard 00000070-00000077 : rtc0 00000080-0000008f : dma page reg 000000a0-000000a1 : pic2 000000c0-000000df : dma2 [...] Fixes: `ae81aad5c2` ("MIPS: PCI: Use pci_enable_resources()") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: stable@vger.kernel.org # v6.18+ Link: https://patch.msgid.link/alpine.DEB.2.21.2510212001250.8377@angie.orcam.me.uk	2025-10-22 11:06:24 -05:00
Maciej W. Rozycki	bf5570590a	MIPS: Malta: Fix keyboard resource preventing i8042 driver from registering MIPS Malta platform code registers the PCI southbridge legacy port I/O PS/2 keyboard range as a standard resource marked as busy. It prevents the i8042 driver from registering as it fails to claim the resource in a call to i8042_platform_init(). Consequently PS/2 keyboard and mouse devices cannot be used with this platform. Fix the issue by removing the busy marker from the standard reservation, making the driver register successfully: serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 and the resource show up as expected among the legacy devices: 00000000-00ffffff : MSC PCI I/O 00000000-0000001f : dma1 00000020-00000021 : pic1 00000040-0000005f : timer 00000060-0000006f : keyboard 00000060-0000006f : i8042 00000070-00000077 : rtc0 00000080-0000008f : dma page reg 000000a0-000000a1 : pic2 000000c0-000000df : dma2 [...] If the i8042 driver has not been configured, then the standard resource will remain there preventing any conflicting dynamic assignment of this PCI port I/O address range. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/alpine.DEB.2.21.2510211919240.8377@angie.orcam.me.uk	2025-10-22 11:06:07 -05:00
Ondrej Mosnacek	881a9c9cb7	bpf: Do not audit capability check in do_jit() The failure of this check only results in a security mitigation being applied, slightly affecting performance of the compiled BPF program. It doesn't result in a failed syscall, an thus auditing a failed LSM permission check for it is unwanted. For example with SELinux, it causes a denial to be reported for confined processes running as root, which tends to be flagged as a problem to be fixed in the policy. Yet dontauditing or allowing CAP_SYS_ADMIN to the domain may not be desirable, as it would allow/silence also other checks - either going against the principle of least privilege or making debugging potentially harder. Fix it by changing it from capable() to ns_capable_noaudit(), which instructs the LSMs to not audit the resulting denials. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2369326 Fixes: `d4e89d212d` ("x86/bpf: Call branch history clearing sequence on exit") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20251021122758.2659513-1-omosnace@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-21 18:22:47 -07:00
Thomas Weißschuh	9aa12167ef	csky: abiv2: adapt to new folio flags field Recent changes require the raw folio flags to be accessed via ".f". The merge commit introducing this change adapted most architecture code but forgot the csky abiv2. [rppt@kernel.org: add fix for arch/csky/abiv2/cacheflush.c] Link: https://lkml.kernel.org/r/aPCE238oxAB9QcZa@kernel.org Fixes: `53fbef56e0` ("mm: introduce memdesc_flags_t") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Guo Ren <guoren@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-10-21 15:46:18 -07:00
Huang Ying	143937ca51	arm64, mm: avoid always making PTE dirty in pte_mkwrite() Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may mark some pages that are never written dirty wrongly. For example, do_swap_page() may map the exclusive pages with writable and clean PTEs if the VMA is writable and the page fault is for read access. However, current pte_mkwrite_novma() implementation always dirties the PTE. This may cause unnecessary disk writing if the pages are never written before being reclaimed. So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the PTE_DIRTY bit is set to make it possible to make the PTE writable and clean. The current behavior was introduced in commit `73e86cb03c` ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits are set. To test the performance impact of the patch, on an arm64 server machine, run 16 redis-server processes on socket 1 and 16 memtier_benchmark processes on socket 0 with mostly get transactions (that is, redis-server will mostly read memory only). The memory footprint of redis-server is larger than the available memory, so swap out/in will be triggered. Test results show that the patch can avoid most swapping out because the pages are mostly clean. And the benchmark throughput improves ~23.9% in the test. Fixes: `73e86cb03c` ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Signed-off-by: Huang Ying <ying.huang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-10-21 15:00:25 +01:00
David Kaplan	204ced4108	x86/bugs: Qualify RETBLEED_INTEL_MSG When retbleed mitigation is disabled, the kernel already prints an info message that the system is vulnerable. Recent code restructuring also inadvertently led to RETBLEED_INTEL_MSG being printed as an error, which is unnecessary as retbleed mitigation was already explicitly disabled (by config option, cmdline, etc.). Qualify this print statement so the warning is not printed unless an actual retbleed mitigation was selected and is being disabled due to incompatibility with spectre_v2. Fixes: `e3b78a7ad5` ("x86/bugs: Restructure retbleed mitigation") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220624 Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251003171936.155391-1-david.kaplan@amd.com	2025-10-21 12:32:28 +02:00
Andrew Cooper	876f0d43af	x86/microcode: Fix Entrysign revision check for Zen1/Naples ... to match AMD's statement here: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7033.html Fixes: `50cef76d5c` ("x86/microcode/AMD: Load only SHA256-checksummed patches") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://patch.msgid.link/20251020144124.2930784-1-andrew.cooper3@citrix.com	2025-10-21 12:16:51 +02:00
Sean Christopherson	9d7dfb95da	KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL Add VMX exit handlers for SEAMCALL and TDCALL to inject a #UD if a non-TD guest attempts to execute SEAMCALL or TDCALL. Neither SEAMCALL nor TDCALL is gated by any software enablement other than VMXON, and so will generate a VM-Exit instead of e.g. a native #UD when executed from the guest kernel. Note! No unprivileged DoS of the L1 kernel is possible as TDCALL and SEAMCALL #GP at CPL > 0, and the CPL check is performed prior to the VMX non-root (VM-Exit) check, i.e. userspace can't crash the VM. And for a nested guest, KVM forwards unknown exits to L1, i.e. an L2 kernel can crash itself, but not L1. Note #2! The Intel® Trust Domain CPU Architectural Extensions spec's pseudocode shows the CPL > 0 check for SEAMCALL coming _after_ the VM-Exit, but that appears to be a documentation bug (likely because the CPL > 0 check was incorrectly bundled with other lower-priority #GP checks). Testing on SPR and EMR shows that the CPL > 0 check is performed before the VMX non-root check, i.e. SEAMCALL #GPs when executed in usermode. Note #3! The aforementioned Trust Domain spec uses confusing pseudocode that says that SEAMCALL will #UD if executed "inSEAM", but "inSEAM" specifically means in SEAM Root Mode, i.e. in the TDX-Module. The long- form description explicitly states that SEAMCALL generates an exit when executed in "SEAM VMX non-root operation". But that's a moot point as the TDX-Module injects #UD if the guest attempts to execute SEAMCALL, as documented in the "Unconditionally Blocked Instructions" section of the TDX-Module base specification. Cc: stable@vger.kernel.org Cc: Kai Huang <kai.huang@intel.com> Cc: Xiaoyao Li <xiaoyao.li@intel.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20251016182148.69085-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-10-20 09:37:04 -07:00
Babu Moger	19de7113bf	x86,fs/resctrl: Fix NULL pointer dereference with events force-disabled in mbm_event mode The following NULL pointer dereference is encountered on mount of resctrl fs after booting a system that supports assignable counters with the "rdt=!mbmtotal,!mbmlocal" kernel parameters: BUG: kernel NULL pointer dereference, address: 0000000000000008 RIP: 0010:mbm_cntr_get Call Trace: rdtgroup_assign_cntr_event rdtgroup_assign_cntrs rdt_get_tree Specifying the kernel parameter "rdt=!mbmtotal,!mbmlocal" effectively disables the legacy X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features and the MBM events they represent. This results in the per-domain MBM event related data structures to not be allocated during early initialization. resctrl fs initialization follows by implicitly enabling both MBM total and local events on a system that supports assignable counters (mbm_event mode), but this enabling occurs after the per-domain data structures have been created. After booting, resctrl fs assumes that an enabled event can access all its state. This results in NULL pointer dereference when resctrl attempts to access the un-allocated structures of an enabled event. Remove the late MBM event enabling from resctrl fs. This leaves a problem where the X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features may be disabled while assignable counter (mbm_event) mode is enabled without any events to support. Switching between the "default" and "mbm_event" mode without any events is not practical. Create a dependency between the X86_FEATURE_{CQM_MBM_TOTAL,CQM_MBM_LOCAL} and X86_FEATURE_ABMC (assignable counter) hardware features. An x86 system that supports assignable counters now requires support of X86_FEATURE_CQM_MBM_TOTAL or X86_FEATURE_CQM_MBM_LOCAL. This ensures all needed MBM related data structures are created before use and that it is only possible to switch between "default" and "mbm_event" mode when the same events are available in both modes. This dependency does not exist in the hardware but this usage of these feature settings work for known systems. [ bp: Massage commit message. ] Fixes: `13390861b4` ("x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details") Co-developed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://patch.msgid.link/a62e6ac063d0693475615edd213d5be5e55443e6.1760560934.git.babu.moger@amd.com	2025-10-20 18:06:31 +02:00
Linus Torvalds	c7864eeaa4	Merge tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Reset the why-the-system-rebooted register on AMD to avoid stale bits remaining from previous boots - Add a missing barrier in the TLB flushing code to prevent erroneously not flushing a TLB generation - Make sure cpa_flush() does not overshoot when computing the end range of a flush region - Fix resctrl bandwidth counting on AMD systems when the amount of monitoring groups created exceeds the number the hardware can track * tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Prevent reset reasons from being retained across reboot x86/mm: Fix SMP ordering in switch_mm_irqs_off() x86/mm: Fix overflow in __cpa_addr() x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID	2025-10-19 04:41:27 -10:00
Linus Torvalds	02e5f74ef0	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "ARM: - Fix the handling of ZCR_EL2 in NV VMs - Pick the correct translation regime when doing a PTW on the back of a SEA - Prevent userspace from injecting an event into a vcpu that isn't initialised yet - Move timer save/restore to the sysreg handling code, fixing EL2 timer access in the process - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug - Fix trapping configuration when the host isn't GICv3 - Improve the detection of HCR_EL2.E2H being RES1 - Drop a spurious 'break' statement in the S1 PTW - Don't try to access SPE when owned by EL3 Documentation updates: - Document the failure modes of event injection - Document that a GICv3 guest can be created on a GICv5 host with FEAT_GCIE_LEGACY Selftest improvements: - Add a selftest for the effective value of HCR_EL2.AMO - Address build warning in the timer selftest when building with clang - Teach irqfd selftests about non-x86 architectures - Add missing sysregs to the set_id_regs selftest - Fix vcpu allocation in the vgic_lpi_stress selftest - Correctly enable interrupts in the vgic_lpi_stress selftest x86: - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the bug fixed by commit `3ccbf6f470` ("KVM: x86/mmu: Return -EAGAIN if userspace deletes/moves memslot during prefault") - Don't try to get PMU capabilities from perf when running a CPU with hybrid CPUs/PMUs, as perf will rightly WARN. guest_memfd: - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more generic KVM_CAP_GUEST_MEMFD_FLAGS - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set said flag to initialize memory as SHARED, irrespective of MMAP. The behavior merged in 6.18 is that enabling mmap() implicitly initializes memory as SHARED, which would result in an ABI collision for x86 CoCo VMs as their memory is currently always initialized PRIVATE. - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private memory, to enable testing such setups, i.e. to hopefully flush out any other lurking ABI issues before 6.18 is officially released. - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP, and host userspace accesses to mmap()'d private memory" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits) arm64: Revamp HCR_EL2.E2H RES1 detection KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available KVM: arm64: Compute per-vCPU FGTs at vcpu_load() KVM: arm64: selftests: Fix misleading comment about virtual timer encoding KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit KVM: arm64: Kill leftovers of ad-hoc timer userspace access KVM: arm64: Fix WFxT handling of nested virt KVM: arm64: Move CNTCT_EL0 userspace accessors to generic infrastructure KVM: arm64: Move CNT_CVAL_EL0 userspace accessors to generic infrastructure KVM: arm64: Move CNT_CTL_EL0 userspace accessors to generic infrastructure KVM: arm64: Add timer UAPI workaround to sysreg infrastructure KVM: arm64: Make timer_set_offset() generally accessible KVM: arm64: Replace timer context vcpu pointer with timer_id KVM: arm64: Introduce timer_context_to_vcpu() helper KVM: arm64: Hide CNTHV__EL2 from userspace for nVHE guests Documentation: KVM: Update GICv3 docs for GICv5 hosts KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress KVM: arm64: selftests: Allocate vcpus with correct size ...	2025-10-18 07:07:14 -10:00
Linus Torvalds	0e622c4b0e	Merge tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Madhavan Srinivasan: - Fix to handle NULL pointer dereference at irq domain teardown - Fix for handling extraction of struct xive_irq_data - Fix to skip parameter area allocation when fadump disabled Thanks to Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM), Sourabh Jain, and Venkat Rao Bagalkote, * tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/fadump: skip parameter area allocation when fadump is disabled powerpc, ocxl: Fix extraction of struct xive_irq_data powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown	2025-10-18 07:02:28 -10:00
Paul Walmsley	b7776a802f	riscv: hwprobe: avoid uninitialized variable use in hwprobe_arch_id() Resolve this smatch warning: arch/riscv/kernel/sys_hwprobe.c:50 hwprobe_arch_id() error: uninitialized symbol 'cpu_id'. This could happen if hwprobe_arch_id() was called with a key ID of something other than MVENDORID, MIMPID, and MARCHID. This does not happen in the current codebase. The only caller of hwprobe_arch_id() is a function that only passes one of those three key IDs. For the sake of reducing static analyzer warning noise, and in the unlikely event that hwprobe_arch_id() is someday called with some other key ID, validate hwprobe_arch_id()'s input to ensure that 'cpu_id' is always initialized before use. Fixes: `ea3de9ce8a` ("RISC-V: Add a syscall for HW probing") Cc: Evan Green <evan@rivosinc.com> Signed-off-by: Paul Walmsley <pjw@kernel.org> Link: https://lore.kernel.org/r/cf5a13ec-19d0-9862-059b-943f36107bf3@kernel.org	2025-10-18 09:36:36 -06:00
Paul Walmsley	2dc99ea272	riscv: cpufeature: avoid uninitialized variable in has_thead_homogeneous_vlenb() In has_thead_homogeneous_vlenb(), smatch detected that the vlenb variable could be used while uninitialized. It appears that this could happen if no CPUs described in DT have the "thead,vlenb" property. Fix by initializing vlenb to 0, which will keep thead_vlenb_of set to 0 (as it was statically initialized). This in turn will cause riscv_v_setup_vsize() to fall back to CSR probing - the desired result if thead,vlenb isn't provided in the DT data. While here, fix a nearby comment typo. Cc: stable@vger.kernel.org Cc: Charlie Jenkins <charlie@rivosinc.com> Fixes: `377be47f90` ("riscv: vector: Use vlenb from DT for thead") Signed-off-by: Paul Walmsley <pjw@kernel.org> Link: https://lore.kernel.org/r/22674afb-2fe8-2a83-1818-4c37bd554579@kernel.org	2025-10-18 09:36:11 -06:00
Paolo Bonzini	4361f5aa8b	Merge tag 'kvm-x86-fixes-6.18-rc2' of https://github.com/kvm-x86/linux into HEAD KVM x86 fixes for 6.18: - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the bug fixed by commit `3ccbf6f470` ("KVM: x86/mmu: Return -EAGAIN if userspace deletes/moves memslot during prefault") - Don't try to get PMU capabbilities from perf when running a CPU with hybrid CPUs/PMUs, as perf will rightly WARN. - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more generic KVM_CAP_GUEST_MEMFD_FLAGS - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set said flag to initialize memory as SHARED, irrespective of MMAP. The behavior merged in 6.18 is that enabling mmap() implicitly initializes memory as SHARED, which would result in an ABI collision for x86 CoCo VMs as their memory is currently always initialized PRIVATE. - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private memory, to enable testing such setups, i.e. to hopefully flush out any other lurking ABI issues before 6.18 is officially released. - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP, and host userspace accesses to mmap()'d private memory.	2025-10-18 10:25:43 +02:00
Jingwei Wang	5d15d2ad36	riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot The hwprobe vDSO data for some keys, like MISALIGNED_VECTOR_PERF, is determined by an asynchronous kthread. This can create a race condition where the kthread finishes after the vDSO data has already been populated, causing userspace to read stale values. To fix this race, a new 'ready' flag is added to the vDSO data, initialized to 'false' during arch_initcall_sync. This flag is checked by both the vDSO's user-space code and the riscv_hwprobe syscall. The syscall serves as a one-time gate, using a completion to wait for any pending probes before populating the data and setting the flag to 'true', thus ensuring userspace reads fresh values on its first request. Reported-by: Tsukasa OI <research_trasio@irq.a4lg.com> Closes: https://lore.kernel.org/linux-riscv/760d637b-b13b-4518-b6bf-883d55d44e7f@irq.a4lg.com/ Fixes: `e7c9d66e31` ("RISC-V: Report vector unaligned access speed hwprobe") Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Olof Johansson <olof@lixom.net> Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Co-developed-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Jingwei Wang <wangjingwei@iscas.ac.cn> Link: https://lore.kernel.org/r/20250811142035.105820-1-wangjingwei@iscas.ac.cn [pjw@kernel.org: fix checkpatch issues] Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-17 22:23:11 -06:00
Paul Walmsley	492c513ec6	riscv: add a forward declaration for cpuinfo_op Add a forward declaration for cpuinfo_op to resolve a sparse warning. Link: https://lore.kernel.org/r/b831f349-5d0c-f7ac-8362-acb20bc6221a@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-17 22:10:01 -06:00
Anup Patel	d2721bb165	RISC-V: Don't print details of CPUs disabled in DT Early boot stages may disable CPU DT nodes for unavailable CPUs based on SKU, pinstraps, eFuse, etc. Currently, the riscv_early_of_processor_hartid() prints details of a CPU if it is disabled in DT which has no value and gives a false impression to the users that there some issue with the CPU. Fixes: `e3d794d555` ("riscv: treat cpu devicetree nodes without status as enabled") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20251014163009.182381-1-apatel@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-17 22:02:21 -06:00
Samuel Holland	768e054de0	riscv: Remove the PER_CPU_OFFSET_SHIFT macro __per_cpu_offset is an array of unsigned long, so we can reuse the existing RISCV_LGPTR macro. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20251015225604.3860409-1-samuel.holland@sifive.com Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-17 22:00:29 -06:00
Samuel Holland	5898fc01ff	riscv: mm: Define MAX_POSSIBLE_PHYSMEM_BITS for zsmalloc This definition is used by zsmalloc to optimize memory allocation. On riscv64, it is the same as MAX_PHYSMEM_BITS from asm/sparsemem.h, but that definition depends on CONFIG_SPARSEMEM. The correct definition is already provided for riscv32. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20251015233327.3885003-1-samuel.holland@sifive.com Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-17 22:00:29 -06:00
Samuel Holland	223bfc4d40	riscv: Register IPI IRQs with unique names This allows different IPIs to be distinguished in tracing output. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20251016003244.3910332-1-samuel.holland@sifive.com Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-17 22:00:29 -06:00
Anup Patel	ca525d53f9	RISC-V: Define pgprot_dmacoherent() for non-coherent devices The pgprot_dmacoherent() is used when allocating memory for non-coherent devices and by default pgprot_dmacoherent() is same as pgprot_noncached() unless architecture overrides it. Currently, there is no pgprot_dmacoherent() definition for RISC-V hence non-coherent device memory is being mapped as IO thereby making CPU access to such memory slow. Define pgprot_dmacoherent() to be same as pgprot_writecombine() for RISC-V so that CPU access non-coherent device memory as NOCACHE which is better than accessing it as IO. Fixes: `ff689fd21c` ("riscv: add RISC-V Svpbmt extension support") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Tested-by: Han Gao <rabenda.cn@gmail.com> Tested-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Link: https://lore.kernel.org/r/20250820152316.1012757-1-apatel@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org>	2025-10-17 21:30:05 -06:00
Linus Torvalds	f406055cb1	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Explicitly encode the XZR register if the value passed to write_sysreg_s() is 0. The GIC CDEOI instruction is encoded as a system register write with XZR as the source register. However, clang does not honour the "Z" register constraint, leading to incorrect code generation - Ensure the interrupts (DAIF.IF) are unmasked when completing single-step of a suspended breakpoint before calling exit_to_user_mode(). With pseudo-NMIs, interrupts are (additionally) masked at the PMR_EL1 register, handled by local_irq_() tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: debug: always unmask interrupts in el0_softstp() arm64/sysreg: Fix GIC CDEOI instruction encoding	2025-10-17 13:04:21 -10:00
Linus Torvalds	fe69107ec7	Merge tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - Disable CFI with Rust for any platform other than x86 and ARM64 - Keep task mm_cpumasks up-to-date to avoid triggering M-mode firmware warnings if the kernel tries to send an IPI to an offline CPU - Improve kprobe address validation performance and avoid desyncs (following x86) - Avoid duplicate device probes by avoiding DT hardware probing when ACPI is enabled in early boot - Use the correct set of dependencies for CONFIG_ARCH_HAS_ELF_CORE_EFLAGS, avoiding an allnoconfig warning - Fix a few other minor issues * tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: kprobes: convert one final __ASSEMBLY__ to __ASSEMBLER__ riscv: Respect dependencies of ARCH_HAS_ELF_CORE_EFLAGS riscv: acpi: avoid errors caused by probing DT devices when ACPI is used riscv: kprobes: Fix probe address validation riscv: entry: fix typo in comment 'instruciton' -> 'instruction' RISC-V: clear hot-unplugged cores from all task mm_cpumasks to avoid rfence errors riscv: kgdb: Ensure that BUFMAX > NUMREGBYTES rust: cfi: only 64-bit arm and x86 support CFI_CLANG	2025-10-17 12:59:31 -10:00
Ada Couprie Diaz	ea0d55ae4b	arm64: debug: always unmask interrupts in el0_softstp() We intend that EL0 exception handlers unmask all DAIF exceptions before calling exit_to_user_mode(). When completing single-step of a suspended breakpoint, we do not call local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(), leaving all DAIF exceptions masked. When pseudo-NMIs are not in use this is benign. When pseudo-NMIs are in use, this is unsound. At this point interrupts are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag manipulation may not work correctly. For example, a subsequent local_irq_enable() within exit_to_user_mode_loop() will only unmask interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and anything depending on interrupts being unmasked (e.g. delivery of signals) will not work correctly. This was detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING. Move the call to `try_step_suspended_breakpoints()` outside of the check so that interrupts can be unmasked even if we don't call the step handler. Fixes: `0ac7584c08` ("arm64: debug: split single stepping exception entry") Cc: <stable@vger.kernel.org> # 6.17 Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> [catalin.marinas@arm.com: added Mark's rewritten commit log and some whitespace] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-10-17 18:08:05 +01:00
Lorenzo Pieralisi	e9ad390a48	arm64/sysreg: Fix GIC CDEOI instruction encoding The GIC CDEOI system instruction requires the Rt field to be set to 0b11111 otherwise the instruction behaviour becomes CONSTRAINED UNPREDICTABLE. Currenly, its usage is encoded as a system register write, with a constant 0 value: write_sysreg_s(0, GICV5_OP_GIC_CDEOI) While compiling with GCC, the 0 constant value, through these asm constraints and modifiers ('x' modifier and 'Z' constraint combo): asm volatile(__msr_s(r, "%x0") : : "rZ" (__val)); forces the compiler to issue the XZR register for the MSR operation (ie that corresponds to Rt == 0b11111) issuing the right instruction encoding. Unfortunately LLVM does not yet understand that modifier/constraint combo so it ends up issuing a different register from XZR for the MSR source, which in turns means that it encodes the GIC CDEOI instruction wrongly and the instruction behaviour becomes CONSTRAINED UNPREDICTABLE that we must prevent. Add a conditional to write_sysreg_s() macro that detects whether it is passed a constant 0 value and issues an MSR write with XZR as source register - explicitly doing what the asm modifier/constraint is meant to achieve through constraints/modifiers, fixing the LLVM compilation issue. Fixes: `7ec80fb3f0` ("irqchip/gic-v5: Add GICv5 PPI support") Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Cc: Sascha Bischoff <sascha.bischoff@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-10-17 17:55:01 +01:00
Fangyu Yu	873f10cf8e	RISC-V: KVM: Read HGEIP CSR on the correct cpu When executing kvm_riscv_vcpu_aia_has_interrupts, the vCPU may have migrated and the IMSIC VS-file have not been updated yet, currently the HGEIP CSR should be read from the imsic->vsfile_cpu ( the pCPU before migration ) via on_each_cpu_mask, but this will trigger an IPI call and repeated IPI within a period of time is expensive in a many-core systems. Just let the vCPU execute and update the correct IMSIC VS-file via kvm_riscv_vcpu_aia_imsic_update may be a simple solution. Fixes: `4cec89db80` ("RISC-V: KVM: Move HGEI[E\|P] CSR access to IMSIC virtualization") Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251016012659.82998-1-fangyu.yu@linux.alibaba.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-10-17 13:10:01 +05:30

1 2 3 4 5 ...

238443 Commits