Commit Graph

238443 Commits

Author SHA1 Message Date
Samuel Holland
ea138a6077 RISC-V: KVM: Fix check for local interrupts on riscv32
To set all 64 bits in the mask on a 32-bit system, the constant must
have type `unsigned long long`.

Fixes: 6b1e8ba4ba ("RISC-V: KVM: Use bitmap for irqs_pending and irqs_pending_mask")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20251016001714.3889380-1-samuel.holland@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-10-16 17:17:29 +05:30
Rong Zhang
e6416c2dfe x86/CPU/AMD: Prevent reset reasons from being retained across reboot
The S5_RESET_STATUS register is parsed on boot and printed to kmsg.
However, this could sometimes be misleading and lead to users wasting a
lot of time on meaningless debugging for two reasons:

* Some bits are never cleared by hardware. It's the software's
responsibility to clear them as per the Processor Programming Reference
(see [1]).

* Some rare hardware-initiated platform resets do not update the
register at all.

In both cases, a previous reboot could leave its trace in the register,
resulting in users seeing unrelated reboot reasons while debugging random
reboots afterward.

Write the read value back to the register in order to clear all reason bits
since they are write-1-to-clear while the others must be preserved.

  [1]: https://bugzilla.kernel.org/show_bug.cgi?id=206537#attach_303991

  [ bp: Massage commit message. ]

Fixes: ab81310287 ("x86/CPU/AMD: Print the reason for the last reset")
Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/all/20250913144245.23237-1-i@rong.moe/
2025-10-15 21:38:06 +02:00
Linus Torvalds
1f4a222b0e Remove long-stale ext3 defconfig option
Inspired by commit c065b6046b ("Use CONFIG_EXT4_FS instead of
CONFIG_EXT3_FS in all of the defconfigs") I looked around for any other
left-over EXT3 config options, and found some old defconfig files still
mentioned CONFIG_EXT3_DEFAULTS_TO_ORDERED.

That config option was removed a decade ago in commit c290ea01ab ("fs:
Remove ext3 filesystem driver").  It had a good run, but let's remove it
for good.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-10-15 07:57:28 -07:00
Linus Torvalds
66f8e4df00 Merge tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 bug fixes from Ted Ts'o:

 - Fix regression caused by removing CONFIG_EXT3_FS when testing some
   very old defconfigs

 - Avoid a BUG_ON when opening a file on a maliciously corrupted file
   system

 - Avoid mm warnings when freeing a very large orphan file metadata

 - Avoid a theoretical races between metadata writeback and checkpoints
   (it's very hard to hit in practice, since the race requires that the
   writeback take a very long time)

* tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs
  ext4: free orphan info with kvfree
  ext4: detect invalid INLINE_DATA + EXTENTS flag combination
  ext4, doc: fix and improve directory hash tree description
  ext4: wait for ongoing I/O to complete before freeing blocks
  jbd2: ensure that all ongoing I/O complete before freeing blocks
2025-10-15 07:51:57 -07:00
Marc Zyngier
ca88ecdce5 arm64: Revamp HCR_EL2.E2H RES1 detection
We currently have two ways to identify CPUs that only implement FEAT_VHE
and not FEAT_E2H0:

- either they advertise it via ID_AA64MMFR4_EL1.E2H0,
- or the HCR_EL2.E2H bit is RAO/WI

However, there is a third category of "cpus" that fall between these
two cases: on CPUs that do not implement FEAT_FGT, it is IMPDEF whether
an access to ID_AA64MMFR4_EL1 can trap to EL2 when the register value
is zero.

A consequence of this is that on systems such as Neoverse V2, a NV
guest cannot reliably detect that it is in a VHE-only configuration
(E2H is writable, and ID_AA64MMFR0_EL1 is 0), despite the hypervisor's
best effort to repaint the id register.

Replace the RAO/WI test by a sequence that makes use of the VHE
register remnapping between EL1 and EL2 to detect this situation,
and work out whether we get the VHE behaviour even after having
set HCR_EL2.E2H to 0.

This solves the NV problem, and provides a more reliable acid test
for CPUs that do not completely follow the letter of the architecture
while providing a RES1 behaviour for HCR_EL2.E2H.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Jan Kotas <jank@cadence.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/15A85F2B-1A0C-4FA7-9FE4-EEC2203CC09E@global.cadence.com
2025-10-14 08:18:40 +01:00
Theodore Ts'o
c065b6046b Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs
Commit d6ace46c82 ("ext4: remove obsolete EXT3 config options")
removed the obsolete EXT3_CONFIG options, since it had been over a
decade since fs/ext3 had been removed.  Unfortunately, there were a
number of defconfigs that still used CONFIG_EXT3_FS which the cleanup
commit didn't fix up.  This led to a large number of defconfig test
builds to fail.  Oops.

Fixes: d6ace46c82 ("ext4: remove obsolete EXT3 config options")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-10-13 21:50:40 -04:00
Ingo Molnar
83b0177a6c x86/mm: Fix SMP ordering in switch_mm_irqs_off()
Stephen noted that it is possible to not have an smp_mb() between
the loaded_mm store and the tlb_gen load in switch_mm(), meaning the
ordering against flush_tlb_mm_range() goes out the window, and it
becomes possible for switch_mm() to not observe a recent tlb_gen
update and fail to flush the TLBs.

[ dhansen: merge conflict fixed by Ingo ]

Fixes: 209954cbc7 ("x86/mm/tlb: Update mm_cpumask lazily")
Reported-by: Stephen Dolan <sdolan@janestreet.com>
Closes: https://lore.kernel.org/all/CAHDw0oGd0B4=uuv8NGqbUQ_ZVmSheU2bN70e4QhFXWvuAZdt2w@mail.gmail.com/
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-10-13 13:55:53 -07:00
Rik van Riel
f25785f9b0 x86/mm: Fix overflow in __cpa_addr()
The change to have cpa_flush() call flush_kernel_pages() introduced
a bug where __cpa_addr() can access an address one larger than the
largest one in the cpa->pages array.

KASAN reports the issue like this:

BUG: KASAN: slab-out-of-bounds in __cpa_addr arch/x86/mm/pat/set_memory.c:309 [inline]
BUG: KASAN: slab-out-of-bounds in __cpa_addr+0x1d3/0x220 arch/x86/mm/pat/set_memory.c:306
Read of size 8 at addr ffff88801f75e8f8 by task syz.0.17/5978

This bug could cause cpa_flush() to not properly flush memory,
which somehow never showed any symptoms in my tests, possibly
because cpa_flush() is called so rarely, but could potentially
cause issues for other people.

Fix the issue by directly calculating the flush end address
from the start address.

Fixes: 86e6815b31 ("x86/mm: Change cpa_flush() to call flush_kernel_range() directly")
Reported-by: syzbot+afec6555eef563c66c97@syzkaller.appspotmail.com
Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
Link: https://lore.kernel.org/all/68e2ff90.050a0220.2c17c1.0038.GAE@google.com/
2025-10-13 13:55:48 -07:00
Babu Moger
15292f1b4c x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.

When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.

resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.

This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).

Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.

Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
   monitoring group.

   $mount -t resctrl resctrl /sys/fs/resctrl
   $mkdir /sys/fs/resctrl/mon_groups/test1/
   $echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks

   $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
   21323            <- Total bytes on domain 0
   "Unavailable"    <- Total bytes on domain 1

   Task is running on domain 0. Counter on domain 1 is "Unavailable".

2. The task runs on domain 0 for a while and then moves to domain 1. The
   counter starts incrementing on domain 1.

   $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
   7345357          <- Total bytes on domain 0
   4545             <- Total bytes on domain 1

3. At some point, the RMID in domain 0 transitions to the "Unavailable"
   state because the task is no longer executing in that domain.

   $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
   "Unavailable"    <- Total bytes on domain 0
   434341           <- Total bytes on domain 1

4.  Since the task continues to migrate between domains, it may eventually
    return to domain 0.

    $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
    17592178699059  <- Overflow on domain 0
    3232332         <- Total bytes on domain 1

In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.

Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.

In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.

Here is the text from APM [1] available from [2].

"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."

[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
    Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
    Bandwidth (MBM).

  [ bp: Split commit message into smaller paragraph chunks for better
    consumption. ]

Fixes: 4d05bf71f1 ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
2025-10-13 21:24:39 +02:00
Stefan Wahren
4adc20ba95 ARM: dts: broadcom: rpi: Switch to V3D firmware clock
Until commit 919d6924ae ("clk: bcm: rpi: Turn firmware clock on/off
when preparing/unpreparing") the clk-raspberrypi driver wasn't able
to change the state of the V3D clock. Only the clk-bcm2835 was able
to do this before. After this commit both drivers were able to work
against each other, which could result in a system freeze. One step
to avoid this conflict is to switch all V3D consumer to the firmware
clock.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/linux-arm-kernel/727aa0c8-2981-4662-adf3-69cac2da956d@samsung.com/
Fixes: 919d6924ae ("clk: bcm: rpi: Turn firmware clock on/off when preparing/unpreparing")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Co-developed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20251005113816.6721-1-wahrenst@gmx.net
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
2025-10-13 10:31:25 -07:00
Peter Robinson
aa960b5976 arm64: dts: broadcom: bcm2712: Define VGIC interrupt
Define the interrupt in the GICv2 for vGIC so KVM
can be used, it was missed from the original upstream
DTB for some reason.

Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Cc: Andrea della Porta <andrea.porta@suse.com>
Cc: Phil Elwell <phil@raspberrypi.com>
Fixes: faa3381267 ("arm64: dts: broadcom: Add minimal support for Raspberry Pi 5")
Link: https://lore.kernel.org/r/20250924085612.1039247-1-pbrobinson@gmail.com
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
2025-10-13 10:31:04 -07:00
Oliver Upton
e0b5a7967d KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
Marc reports that the performance of running an L3 guest has regressed
by 60% as a result of setting MDCR_EL2.TDA to hide bad architecture.
That's of course terrible for the single user of recursive NV ;-)

While there's nothing to be done on non-FGT systems, take advantage of
the precise write trap of MDSCR_EL1 and leave the rest of the debug
registers untrapped.

Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:44:37 +01:00
Oliver Upton
fb10ddf35c KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
To date KVM has used the fine-grained traps for the sake of UNDEF
enforcement (so-called FGUs), meaning the constituent parts could be
computed on a per-VM basis and folded into the effective value when
programmed.

Prepare for traps changing based on the vCPU context by computing the
whole mess of them at vcpu_load(). Aggressively inline all the helpers
to preserve the build-time checks that were there before.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:44:37 +01:00
Marc Zyngier
386aac77da KVM: arm64: Kill leftovers of ad-hoc timer userspace access
Now that the whole timer infrastructure is handled as system register
accesses, get rid of the now unused ad-hoc infrastructure.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:41 +01:00
Marc Zyngier
892f7c38ba KVM: arm64: Fix WFxT handling of nested virt
The spec for WFxT indicates that the parameter to the WFxT instruction
is relative to the reading of CNTVCT_EL0. This means that the implementation
needs to take the execution context into account, as CNTVOFF_EL2
does not always affect readings of CNTVCT_EL0 (such as when HCR_EL2.E2H
is 1 and that we're in host context).

This also rids us of the last instance of KVM_REG_ARM_TIMER_CNT
outside of the userspace interaction code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:41 +01:00
Marc Zyngier
c3be3a48fb KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
Moving the counter registers is a bit more involved than for the control
and comparator (there is no shadow data for the counter), but still
pretty manageable.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:41 +01:00
Marc Zyngier
8af198980e KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
As for the control registers, move the comparator registers to
the common infrastructure.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:41 +01:00
Marc Zyngier
09424d5d7d KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
Remove the handling of CNT*_CTL_EL0 from guest.c, and move it to
sys_regs.c, using a new TIMER_REG() definition to encapsulate it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:40 +01:00
Marc Zyngier
77a0c42eaf KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
Amongst the numerous bugs that plague the KVM/arm64 UAPI, one of
the most annoying thing is that the userspace view of the virtual
timer has its CVAL and CNT encodings swapped.

In order to reduce the amount of code that has to know about this,
start by adding handling for this bug in the sys_reg code.

Nothing is making use of it yet, as the code responsible for userspace
interaction is catching the accesses early.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:40 +01:00
Marc Zyngier
a92d552266 KVM: arm64: Make timer_set_offset() generally accessible
Move the timer_set_offset() helper to arm_arch_timer.h, so that it
is next to timer_get_offset(), and accessible by the rest of KVM.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:40 +01:00
Marc Zyngier
8625a670af KVM: arm64: Replace timer context vcpu pointer with timer_id
Having to follow a pointer to a vcpu is pretty dumb, when the timers
are are a fixed offset in the vcpu structure itself.

Trade the vcpu pointer for a timer_id, which can then be used to
compute the vcpu address as needed.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:40 +01:00
Marc Zyngier
aa68975c97 KVM: arm64: Introduce timer_context_to_vcpu() helper
We currently have a vcpu pointer nested into each timer context.

As we are about to remove this pointer, introduce a helper (aptly
named timer_context_to_vcpu()) that returns this pointer, at least
until we repaint the data structure.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:40 +01:00
Marc Zyngier
4cab5c857d KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
Although we correctly UNDEF any CNTHV_*_EL2 access from the guest
when E2H==0, we still expose these registers to userspace, which
is a bad idea.

Drop the ad-hoc UNDEF injection and switch to a .visibility()
callback which will also hide the register from userspace.

Fixes: 0e45981028 ("KVM: arm64: timer: Don't adjust the EL2 virtual timer offset")
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:40 +01:00
Sascha Bischoff
3193287ddf KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
The ICH_HCR_EL2 traps are used when running on GICv3 hardware, or when
running a GICv3-based guest using FEAT_GCIE_LEGACY on GICv5
hardware. When running a GICv2 guest on GICv3 hardware the traps are
used to ensure that the guest never sees any part of GICv3 (only GICv2
is visible to the guest), and when running a GICv3 guest they are used
to trap in specific scenarios. They are not applicable for a
GICv2-native guest, and won't be applicable for a(n upcoming) GICv5
guest.

The traps themselves are configured in the vGIC CPU IF state, which is
stored as a union. Updating the wrong aperture of the union risks
corrupting state, and therefore needs to be avoided at all costs.

Bail early if we're not running a compatible guest (GICv2 on GICv3
hardware, GICv3 native, GICv3 on GICv5 hardware). Trap everything
unconditionally if we're running a GICv2 guest on GICv3
hardware. Otherwise, conditionally set up GICv3-native trapping.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:40:33 +01:00
Mukesh Ojha
c35dd83866 KVM: arm64: Guard PMSCR_EL1 initialization with SPE presence check
Commit efad60e460 ("KVM: arm64: Initialize PMSCR_EL1 when in VHE")
does not perform sufficient check before initializing PMSCR_EL1 to 0
when running in VHE mode. On some platforms, this causes the system to
hang during boot, as EL3 has not delegated access to the Profiling
Buffer to the Non-secure world, nor does it reinject an UNDEF on sysreg
trap.

To avoid this issue, restrict the PMSCR_EL1 initialization to CPUs that
support Statistical Profiling Extension (FEAT_SPE) and have the
Profiling Buffer accessible in Non-secure EL1. This is determined via a
new helper `cpu_has_spe()` which checks both PMSVer and PMBIDR_EL1.P.

This ensures the initialization only affects CPUs where SPE is
implemented and usable, preventing boot failures on platforms where SPE
is not properly configured.

Fixes: efad60e460 ("KVM: arm64: Initialize PMSCR_EL1 when in VHE")
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:26:36 +01:00
Osama Abdelkader
05a02490fa KVM: arm64: Remove unreachable break after return
Remove an unnecessary 'break' statement that follows a 'return'
in arch/arm64/kvm/at.c. The break is unreachable.

Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:17:03 +01:00
Oliver Upton
0aa1b76fe1 KVM: arm64: Prevent access to vCPU events before init
Another day, another syzkaller bug. KVM erroneously allows userspace to
pend vCPU events for a vCPU that hasn't been initialized yet, leading to
KVM interpreting a bunch of uninitialized garbage for routing /
injecting the exception.

In one case the injection code and the hyp disagree on whether the vCPU
has a 32bit EL1 and put the vCPU into an illegal mode for AArch64,
tripping the BUG() in exception_target_el() during the next injection:

  kernel BUG at arch/arm64/kvm/inject_fault.c:40!
  Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
  CPU: 3 UID: 0 PID: 318 Comm: repro Not tainted 6.17.0-rc4-00104-g10fd0285305d #6 PREEMPT
  Hardware name: linux,dummy-virt (DT)
  pstate: 21402009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
  pc : exception_target_el+0x88/0x8c
  lr : pend_serror_exception+0x18/0x13c
  sp : ffff800082f03a10
  x29: ffff800082f03a10 x28: ffff0000cb132280 x27: 0000000000000000
  x26: 0000000000000000 x25: ffff0000c2a99c20 x24: 0000000000000000
  x23: 0000000000008000 x22: 0000000000000002 x21: 0000000000000004
  x20: 0000000000008000 x19: ffff0000c2a99c20 x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000 x15: 00000000200000c0
  x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
  x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
  x8 : ffff800082f03af8 x7 : 0000000000000000 x6 : 0000000000000000
  x5 : ffff800080f621f0 x4 : 0000000000000000 x3 : 0000000000000000
  x2 : 000000000040009b x1 : 0000000000000003 x0 : ffff0000c2a99c20
  Call trace:
   exception_target_el+0x88/0x8c (P)
   kvm_inject_serror_esr+0x40/0x3b4
   __kvm_arm_vcpu_set_events+0xf0/0x100
   kvm_arch_vcpu_ioctl+0x180/0x9d4
   kvm_vcpu_ioctl+0x60c/0x9f4
   __arm64_sys_ioctl+0xac/0x104
   invoke_syscall+0x48/0x110
   el0_svc_common.constprop.0+0x40/0xe0
   do_el0_svc+0x1c/0x28
   el0_svc+0x34/0xf0
   el0t_64_sync_handler+0xa0/0xe4
   el0t_64_sync+0x198/0x19c
  Code: f946bc01 b4fffe61 9101e020 17fffff2 (d4210000)

Reject the ioctls outright as no sane VMM would call these before
KVM_ARM_VCPU_INIT anyway. Even if it did the exception would've been
thrown away by the eventual reset of the vCPU's state.

Cc: stable@vger.kernel.org # 6.17
Fixes: b7b27facc7 ("arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:17:03 +01:00
Oliver Upton
a46c09b382 KVM: arm64: Use the in-context stage-1 in __kvm_find_s1_desc_level()
Running the external_aborts selftest at EL2 leads to an ugly splat due
to the stage-1 MMU being disabled for the walked context, owing to the
fact that __kvm_find_s1_desc_level() is hardcoded to the EL1&0 regime.

Select the appropriate translation regime for the stage-1 walk based on
the current vCPU context.

Fixes: b8e625167a ("KVM: arm64: Add S1 IPA to page table level walker")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:17:03 +01:00
Marc Zyngier
9a1950f977 KVM: arm64: nv: Don't advance PC when pending an SVE exception
Jan reports that running a nested guest on Neoverse-V2 leads to a WARN
in the host due to simultaneously pending an exception and PC increment
after an access to ZCR_EL2.

Returning true from a sysreg accessor is an indication that the sysreg
instruction has been retired. Of course this isn't the case when we've
pended a synchronous SVE exception for the guest. Fix the return value
and let the exception propagate to the guest as usual.

Reported-by: Jan Kotas <jank@cadence.com>
Closes: https://lore.kernel.org/kvmarm/865xd61tt5.wl-maz@kernel.org/
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:17:03 +01:00
Oliver Upton
ed25dcfbc4 KVM: arm64: nv: Don't treat ZCR_EL2 as a 'mapped' register
Unlike the other mapped EL2 sysregs ZCR_EL2 isn't guaranteed to be
resident when a vCPU is loaded as it actually follows the SVE
context. As such, the contents of ZCR_EL1 may belong to another guest if
the vCPU has been preempted before reaching sysreg emulation.

Unconditionally use the in-memory value of ZCR_EL2 and switch to the
memory-only accessors. The in-memory value is guaranteed to be valid as
fpsimd_lazy_switch_to_{guest,host}() will restore/save the register
appropriately.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:17:02 +01:00
Sourabh Jain
0843ba4584 powerpc/fadump: skip parameter area allocation when fadump is disabled
Fadump allocates memory to pass additional kernel command-line argument
to the fadump kernel. However, this allocation is not needed when fadump
is disabled. So avoid allocating memory for the additional parameter
area in such cases.

Fixes: f4892c68ec ("powerpc/fadump: allocate memory for additional parameters early")
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Fixes: f4892c68ec ("powerpc/fadump: allocate memory for additional  parameters early")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251008032934.262683-1-sourabhjain@linux.ibm.com
2025-10-13 09:41:31 +05:30
Nam Cao
2743cf75f7 powerpc, ocxl: Fix extraction of struct xive_irq_data
Commit cc0cc23bab ("powerpc/xive: Untangle xive from child interrupt
controller drivers") changed xive_irq_data to be stashed to chip_data
instead of handler_data. However, multiple places are still attempting to
read xive_irq_data from handler_data and get a NULL pointer deference bug.

Update them to read xive_irq_data from chip_data.

Non-XIVE files which touch xive_irq_data seem quite strange to me,
especially the ocxl driver. I think there ought to be an alternative
platform-independent solution, instead of touching XIVE's data directly.
Therefore, I think this whole thing should be cleaned up. But perhaps I
just misunderstand something. In any case, this cleanup would not be
trivial; for now, just get things working again.

Fixes: cc0cc23bab ("powerpc/xive: Untangle xive from child interrupt controller drivers")
Reported-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Closes: https://lore.kernel.org/linuxppc-dev/68e48df8.170a0220.4b4b0.217d@mx.google.com/
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>  # ocxl
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251008081359.1382699-1-namcao@linutronix.de
2025-10-13 09:40:55 +05:30
Nam Cao
ef3e73a917 powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown
pseries_msi_ops_teardown() reads pci_dev* from msi_alloc_info_t. However,
pseries_msi_ops_prepare() does not populate this structure, thus it is all
zeros. Consequently, pseries_msi_ops_teardown() triggers a NULL pointer
dereference crash.

struct pci_dev is available in struct irq_domain. Read it there instead.

Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/linuxppc-dev/878d7651-433a-46fe-a28b-1b7e893fcbe0@linux.ibm.com/
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251010120307.3281720-1-namcao@linutronix.de
2025-10-13 09:39:02 +05:30
Linus Torvalds
c04022dccb Merge tag 'kbuild-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:

 - Fix UAPI types check in headers_check.pl

 - Only enable -Werror for hostprogs with CONFIG_WERROR / W=e

 - Ignore fsync() error when output of gen_init_cpio is a pipe

 - Several little build fixes for recent modules.builtin.modinfo series

* tag 'kbuild-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
  kbuild: Use '--strip-unneeded-symbol' for removing module device table symbols
  s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections
  kbuild: Add '.rel.*' strip pattern for vmlinux
  kbuild: Restore pattern to avoid stripping .rela.dyn from vmlinux
  gen_init_cpio: Ignore fsync() returning EINVAL on pipes
  scripts/Makefile.extrawarn: Respect CONFIG_WERROR / W=e for hostprogs
  kbuild: uapi: Strip comments before size type check
2025-10-11 15:47:12 -07:00
Linus Torvalds
9591fdb061 Merge tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more x86 updates from Borislav Petkov:

 - Remove a bunch of asm implementing condition flags testing in KVM's
   emulator in favor of int3_emulate_jcc() which is written in C

 - Replace KVM fastops with C-based stubs which avoids problems with the
   fastop infra related to latter not adhering to the C ABI due to their
   special calling convention and, more importantly, bypassing compiler
   control-flow integrity checking because they're written in asm

 - Remove wrongly used static branches and other ugliness accumulated
   over time in hyperv's hypercall implementation with a proper static
   function call to the correct hypervisor call variant

 - Add some fixes and modifications to allow running FRED-enabled
   kernels in KVM even on non-FRED hardware

 - Add kCFI improvements like validating indirect calls and prepare for
   enabling kCFI with GCC. Add cmdline params documentation and other
   code cleanups

 - Use the single-byte 0xd6 insn as the official #UD single-byte
   undefined opcode instruction as agreed upon by both x86 vendors

 - Other smaller cleanups and touchups all over the place

* tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86,retpoline: Optimize patch_retpoline()
  x86,ibt: Use UDB instead of 0xEA
  x86/cfi: Remove __noinitretpoline and __noretpoline
  x86/cfi: Add "debug" option to "cfi=" bootparam
  x86/cfi: Standardize on common "CFI:" prefix for CFI reports
  x86/cfi: Document the "cfi=" bootparam options
  x86/traps: Clarify KCFI instruction layout
  compiler_types.h: Move __nocfi out of compiler-specific header
  objtool: Validate kCFI calls
  x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y
  x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware
  x86/fred: Install system vector handlers even if FRED isn't fully enabled
  x86/hyperv: Use direct call to hypercall-page
  x86/hyperv: Clean up hv_do_hypercall()
  KVM: x86: Remove fastops
  KVM: x86: Convert em_salc() to C
  KVM: x86: Introduce EM_ASM_3WCL
  KVM: x86: Introduce EM_ASM_1SRC2
  KVM: x86: Introduce EM_ASM_2CL
  KVM: x86: Introduce EM_ASM_2W
  ...
2025-10-11 11:19:16 -07:00
Linus Torvalds
2f0a750453 Merge tag 'x86_cleanups_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:

 - Simplify inline asm flag output operands now that the minimum
   compiler version supports the =@ccCOND syntax

 - Remove a bunch of AS_* Kconfig symbols which detect assembler support
   for various instruction mnemonics now that the minimum assembler
   version supports them all

 - The usual cleanups all over the place

* tag 'x86_cleanups_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Remove code depending on __GCC_ASM_FLAG_OUTPUTS__
  x86/sgx: Use ENCLS mnemonic in <kernel/cpu/sgx/encls.h>
  x86/mtrr: Remove license boilerplate text with bad FSF address
  x86/asm: Use RDPKRU and WRPKRU mnemonics in <asm/special_insns.h>
  x86/idle: Use MONITORX and MWAITX mnemonics in <asm/mwait.h>
  x86/entry/fred: Push __KERNEL_CS directly
  x86/kconfig: Remove CONFIG_AS_AVX512
  crypto: x86 - Remove CONFIG_AS_VPCLMULQDQ
  crypto: X86 - Remove CONFIG_AS_VAES
  crypto: x86 - Remove CONFIG_AS_GFNI
  x86/kconfig: Drop unused and needless config X86_64_SMP
2025-10-11 10:51:14 -07:00
Paul Walmsley
852947be66 riscv: kprobes: convert one final __ASSEMBLY__ to __ASSEMBLER__
Per the reasoning in commit f811f58597 ("riscv: Replace __ASSEMBLY__
with __ASSEMBLER__ in non-uapi headers"), convert one last remaining
instance of __ASSEMBLY__ in the arch/riscv kprobes code.  This entered
the tree from patches that were sent before Thomas' changes; and when
I reviewed the kprobes patches before queuing them, I missed this
instance.

Cc: Nam Cao <namcao@linutronix.dev>
Cc: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/linux-riscv/16b74b63-f223-4f0b-b6e5-31cea5e620b4@redhat.com/
Link: https://lore.kernel.org/linux-riscv/20250606070952.498274-1-thuth@redhat.com/
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-10 16:04:25 -06:00
Sean Christopherson
44c6cb9fe9 KVM: guest_memfd: Allow mmap() on guest_memfd for x86 VMs with private memory
Allow mmap() on guest_memfd instances for x86 VMs with private memory as
the need to track private vs. shared state in the guest_memfd instance is
only pertinent to INIT_SHARED.  Doing mmap() on private memory isn't
terrible useful (yet!), but it's now possible, and will be desirable when
guest_memfd gains support for other VMA-based syscalls, e.g. mbind() to
set NUMA policy.

Lift the restriction now, before MMAP support is officially released, so
that KVM doesn't need to add another capability to enumerate support for
mmap() on private memory.

Fixes: 3d3a04fad2 ("KVM: Allow and advertise support for host mmap() on guest_memfd files")
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Tested-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20251003232606.4070510-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-10-10 14:25:25 -07:00
Dapeng Mi
034417c143 KVM: x86/pmu: Don't try to get perf capabilities for hybrid CPUs
Explicitly zero kvm_host_pmu instead of attempting to get the perf PMU
capabilities when running on a hybrid CPU to avoid running afoul of perf's
sanity check.

  ------------[ cut here ]------------
  WARNING: arch/x86/events/core.c:3089 at perf_get_x86_pmu_capability+0xd/0xc0,
  Call Trace:
   <TASK>
   kvm_x86_vendor_init+0x1b0/0x1a40 [kvm]
   vmx_init+0xdb/0x260 [kvm_intel]
   vt_init+0x12/0x9d0 [kvm_intel]
   do_one_initcall+0x60/0x3f0
   do_init_module+0x97/0x2b0
   load_module+0x2d08/0x2e30
   init_module_from_file+0x96/0xe0
   idempotent_init_module+0x117/0x330
   __x64_sys_finit_module+0x73/0xe0

Always read the capabilities for non-hybrid CPUs, i.e. don't entirely
revert to reading capabilities if and only if KVM wants to use a PMU, as
it may be useful to have the host PMU capabilities available, e.g. if only
or debug.

Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Closes: https://lore.kernel.org/all/70b64347-2aca-4511-af78-a767d5fa8226@intel.com/
Fixes: 51f34b1e65 ("KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20251010005239.146953-1-dapeng1.mi@linux.intel.com
[sean: rework changelog, call out hybrid CPUs in shortlog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-10-10 14:25:12 -07:00
Linus Torvalds
917167ed12 Merge tag 'xtensa-20251010' of https://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa updates from Max Filippov:

 - minor cleanups

* tag 'xtensa-20251010' of https://github.com/jcmvbkbc/linux-xtensa:
  xtensa: use HZ_PER_MHZ in platform_calibrate_ccount
  xtensa: simdisk: add input size check in proc_write_simdisk
2025-10-10 11:20:19 -07:00
Nathan Chancellor
9338d660b7 s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections
When building s390 defconfig with binutils older than 2.32, there are
several warnings during the final linking stage:

  s390-linux-ld: .tmp_vmlinux1: warning: allocated section `.got.plt' not in segment
  s390-linux-ld: .tmp_vmlinux2: warning: allocated section `.got.plt' not in segment
  s390-linux-ld: vmlinux.unstripped: warning: allocated section `.got.plt' not in segment
  s390-linux-objcopy: vmlinux: warning: allocated section `.got.plt' not in segment
  s390-linux-objcopy: st7afZyb: warning: allocated section `.got.plt' not in segment

binutils commit afca762f598 ("S/390: Improve partial relro support for
64 bit") [1] in 2.32 changed where .got.plt is emitted, avoiding the
warning.

The :NONE in the .vmlinux.info output section description changes the
segment for subsequent allocated sections. Move .vmlinux.info right
above the discards section to place all other sections in the previously
defined segment, .data.

Fixes: 30226853d6 ("s390: vmlinux.lds.S: explicitly handle '.got' and '.plt' sections")
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=afca762f598d453c563f244cd3777715b1a0cb72 [1]
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Alexey Gladkov <legion@kernel.org>
Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251008-kbuild-fix-modinfo-regressions-v1-3-9fc776c5887c@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-10-10 10:22:08 -07:00
Linus Torvalds
8cc8ea228c Merge tag 'parisc-for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "Minor enhancements and fixes, specifically:

   - report emulation and alignment faults via perf

   - add initial kernel-side support for perf_events

   - small initialization fixes in the parisc firmware layer

   - adjust TC* constants and avoid referencing termio structs to avoid
     userspace build errors"

* tag 'parisc-for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix iodc and device path return values on old machines
  parisc: Firmware: Fix returned path for PDC_MODULE_FIND on older machines
  parisc: Add initial kernel-side perf_event support
  parisc: Report software alignment faults via perf
  parisc: Report emulation faults via perf
  parisc: don't reference obsolete termio struct for TC* constants
  parisc: Remove spurious if statement from raw_copy_from_user()
2025-10-10 10:01:55 -07:00
Thomas Weißschuh
7882d2c45c riscv: Respect dependencies of ARCH_HAS_ELF_CORE_EFLAGS
This kconfig symbol has dependencies and is only selectable if those
dependencies are also enabled.

Respect the dependencies.

Fixes the following warning when configuring an 'allnoconfig':

WARNING: unmet direct dependencies detected for ARCH_HAS_ELF_CORE_EFLAGS
  Depends on [n]: BINFMT_ELF [=n] && ELF_CORE [=y]
  Selected by [y]:
  - RISCV [=y]

Fixes: 8c94db0ae9 ("binfmt_elf: preserve original ELF e_flags for core dumps")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20251009-riscv-elf-core-eflags-v1-1-e9b45ab6b36d@linutronix.de
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:39:05 -06:00
Han Gao
69a8b62a7a riscv: acpi: avoid errors caused by probing DT devices when ACPI is used
Similar to the ARM64 commit 3505f30fb6a9s ("ARM64 / ACPI: If we chose
to boot from acpi then disable FDT"), let's not do DT hardware probing
if ACPI is enabled in early boot.  This avoids errors caused by
repeated driver probing.

Signed-off-by: Han Gao <rabenda.cn@gmail.com>
Link: https://lore.kernel.org/r/20250910112401.552987-1-rabenda.cn@gmail.com
[pjw@kernel.org: cleaned up patch description and subject]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:36:45 -06:00
Fabian Vogt
9e68bd803f riscv: kprobes: Fix probe address validation
When adding a kprobe such as "p:probe/tcp_sendmsg _text+15392192",
arch_check_kprobe would start iterating all instructions starting from
_text until the probed address. Not only is this very inefficient, but
literal values in there (e.g. left by function patching) are
misinterpreted in a way that causes a desync.

Fix this by doing it like x86: start the iteration at the closest
preceding symbol instead of the given starting point.

Fixes: 87f48c7ccc ("riscv: kprobe: Fixup kernel panic when probing an illegal position")
Signed-off-by: Fabian Vogt <fvogt@suse.de>
Signed-off-by: Marvin Friedrich <marvin.friedrich@suse.com>
Acked-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/6191817.lOV4Wx5bFT@fvogt-thinkpad
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:36:45 -06:00
Florian Schmaus
c199745d3a riscv: entry: fix typo in comment 'instruciton' -> 'instruction'
Fix a typo in a comment in the RISC-V entry.S.

Signed-off-by: Florian Schmaus <flo@geekplace.eu>
Link: https://lore.kernel.org/r/20251006093742.53925-1-flo@geekplace.eu
[pjw@kernel.org: wrote a basic patch description]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:36:45 -06:00
Danil Skrebenkov
ae9e9f3d67 RISC-V: clear hot-unplugged cores from all task mm_cpumasks to avoid rfence errors
openSBI v1.7 adds harts checks for ipi operations. Especially it
adds comparison between hmask passed as an argument from linux
and mask of online harts (from openSBI side). If they don't
fit each other the error occurs.

When cpu is offline, cpu_online_mask is explicitly cleared in
__cpu_disable. However, there is no explicit clearing of
mm_cpumask. mm_cpumask is used for rfence operations that
call openSBI RFENCE extension which uses ipi to remote harts.
If hart is offline there may be error if mask of linux is not
as mask of online harts in openSBI.

this patch adds explicit clearing of mm_cpumask for offline hart.

Signed-off-by: Danil Skrebenkov <danil.skrebenkov@cloudbear.ru>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250919132849.31676-1-danil.skrebenkov@cloudbear.ru
[pjw@kernel.org: rewrote subject line for clarity]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:36:45 -06:00
Miquel Sabaté Solà
781380d2cd riscv: kgdb: Ensure that BUFMAX > NUMREGBYTES
The current value of BUFMAX is similar as in other architectures, but as
per documentation on KGDB (see
'Documentation/process/debugging/kgdb.rst'), BUFMAX has to be larger
than NUMREGBYTES.

Some NUMREGBYTES architectures (e.g. powerpc or hexagon) actually define
BUFMAX in relation to NUMREGBYTES, and thus this condition is always
guaranteed. Since 2048 is a value that is generally accepted on all
architectures, and that is larger than the current value of NUMREGBYTES,
we can keep this value in arch/riscv, but we can at least add an
'static_assert' as an extra measure just in case NUMREGBYTES changes in
the future for some unforseen reason.

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Link: https://lore.kernel.org/r/20250915143252.154955-1-mikisabate@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:36:45 -06:00
Conor Dooley
812258ff41 rust: cfi: only 64-bit arm and x86 support CFI_CLANG
The kernel uses the standard rustc targets for non-x86 targets, and out
of those only 64-bit arm's target has kcfi support enabled. For x86, the
custom 64-bit target enables kcfi.

The HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC config option that allows
CFI_CLANG to be used in combination with RUST does not check whether the
rustc target supports kcfi. This breaks the build on riscv (and
presumably 32-bit arm) when CFI_CLANG and RUST are enabled at the same
time.

Ordinarily, a rustc-option check would be used to detect target support
but unfortunately rustc-option filters out the target for reasons given
in commit 46e24a545c ("rust: kasan/kbuild: fix missing flags on first
build"). As a result, if the host supports kcfi but the target does not,
e.g. when building for riscv on x86_64, the build would remain broken.

Instead, make HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC depend on the only
two architectures where the target used supports it to fix the build.

CC: stable@vger.kernel.org
Fixes: ca627e6365 ("rust: cfi: add support for CFI_CLANG with Rust")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250908-distill-lint-1ae78bcf777c@spud
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:36:45 -06:00
Helge Deller
f4edb5c52c parisc: Fix iodc and device path return values on old machines
Older machines may not fully initialize the return values when asking for IODC
and device path data when building the inventory.  Work around possible
firmware leaks by proper initialization of the variables.

Signed-off-by: Helge Deller <deller@gmx.de>
2025-10-09 23:45:04 +02:00