Commit Graph

228309 Commits

Author SHA1 Message Date
Jiaxun Yang
341cf992d3 LoongArch: Disable FIX_EARLYCON_MEM when ARCH_IOREMAP is enabled
When ARCH_IOREMAP is enabled, we are using always accessible DMW for
ioremap(). It makes no sense to create a dedicated mapping for earlycon
given that we can access the region via DMW.

Disable FIX_EARLYCON_MEM when ARCH_IOREMAP is selected. This can ease
debugging for early mapping issues.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-01-25 18:51:33 +08:00
Masahiro Yamada
c91ddab579 LoongArch: Migrate to the generic rule for built-in DTB
Commit 654102df2a ("kbuild: add generic support for built-in boot
DTBs") introduced generic support for built-in DTBs.

Select GENERIC_BUILTIN_DTB when built-in DTB support is enabled.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-01-25 18:51:33 +08:00
Huacai Chen
09e9d370ce Merge tag 'irq-core-2025-01-21' into loongarch-next
LoongArch architecture changes for 6.14 depend on the irq-core changes
(AVECINTC fixes) to work well for NUMA, so merge them to create a base.
2025-01-25 18:15:32 +08:00
Linus Torvalds
9528d418de Merge tag 'x86_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Mark serialize() noinstr so that it can be used from instrumentation-
   free code

 - Make sure FRED's RSP0 MSR is synchronized with its corresponding
   per-CPU value in order to avoid double faults in hotplug scenarios

 - Disable EXECMEM_ROX on x86 for now because it didn't receive proper
   x86 maintainers review, went in and broke a bunch of things

* tag 'x86_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Make serialize() always_inline
  x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache
  x86: Disable EXECMEM_ROX support
2025-01-19 09:33:40 -08:00
Juergen Gross
ae02ae16b7 x86/asm: Make serialize() always_inline
In order to allow serialize() to be used from noinstr code, make it
__always_inline.

Fixes: 0ef8047b73 ("x86/static-call: provide a way to do very early static-call updates")
Closes: https://lore.kernel.org/oe-kbuild-all/202412181756.aJvzih2K-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241218100918.22167-1-jgross@suse.com
2025-01-16 16:51:17 +01:00
Thomas Gleixner
f94a18249b genirq: Remove IRQ_MOVE_PCNTXT and related code
Now that x86 is converted over to use the IRQCHIP_MOVE_DEFERRED flags,
remove IRQ*_MOVE_PCNTXT and related code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241210103335.626707225@linutronix.de
2025-01-15 21:38:53 +01:00
Thomas Gleixner
7d04319a05 x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
Instead of marking individual interrupts as safe to be migrated in
arbitrary contexts, mark the interrupt chips, which require the interrupt
to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
flag. This makes more sense because this is a per interrupt chip property
and not restricted to individual interrupts.

That flips the logic from the historical opt-out to a opt-in model. This is
simpler to handle for other architectures, which default to unrestricted
affinity setting. It also allows to cleanup the redundant core logic
significantly.

All interrupt chips, which belong to a top-level domain sitting directly on
top of the x86 vector domain are marked accordingly, unless the related
setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de
2025-01-15 21:38:53 +01:00
Thomas Gleixner
65d09d269f hexagon: Remove GENERIC_PENDING_IRQ leftover
Commented out since 2011....

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Brian Cain <bcain@quicinc.com>
Link: https://lore.kernel.org/all/20241210103335.437630614@linutronix.de
2025-01-15 10:56:22 +01:00
Thomas Gleixner
5d30d6ab8c ARC: Remove GENERIC_PENDING_IRQ
Nothing uses the actual functionality and the MCIP controller sets the
flags which disables the deferred affinity change. The other interrupt
controller does not support affinity setting at all.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Vineet Gupta <vgupta@kernel.org>   # arch/arc/
Link: https://lore.kernel.org/all/20241210103335.373392568@linutronix.de
2025-01-15 10:56:22 +01:00
Nicolas Frayer
b8b26ae398 irqchip/ti-sci-inta : Add module build support
Add module build support in Kconfig for the TI SCI interrupt aggregator
driver. The driver's default build is built-in and it also depends on
ARCH_K3 as the driver uses some 64 bit ops and should only be built for
64-bit platforms.

Signed-off-by: Nicolas Frayer <nfrayer@baylibre.com>
Signed-off-by: Guillaume La Roque <glaroque@baylibre.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/all/20241224-timodules-v4-2-c5e010f58e2c@baylibre.com
2025-01-15 09:54:29 +01:00
Nicolas Frayer
2d95ffaecb irqchip/ti-sci-intr: Add module build support
Add module build support in Kconfig for the TI SCI interrupt router
driver. This driver depends on the TI sci firmware driver which aready
supports module build.

Signed-off-by: Nicolas Frayer <nfrayer@baylibre.com>
Signed-off-by: Guillaume La Roque <glaroque@baylibre.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/all/20241224-timodules-v4-1-c5e010f58e2c@baylibre.com
2025-01-15 09:54:29 +01:00
Xin Li (Intel)
de31b3cd70 x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache
The FRED RSP0 MSR is only used for delivering events when running
userspace.  Linux leverages this property to reduce expensive MSR
writes and optimize context switches.  The kernel only writes the
MSR when about to run userspace *and* when the MSR has actually
changed since the last time userspace ran.

This optimization is implemented by maintaining a per-CPU cache of
FRED RSP0 and then checking that against the value for the top of
current task stack before running userspace.

However cpu_init_fred_exceptions() writes the MSR without updating
the per-CPU cache.  This means that the kernel might return to
userspace with MSR_IA32_FRED_RSP0==0 when it needed to point to the
top of current task stack.  This would induce a double fault (#DF),
which is bad.

A context switch after cpu_init_fred_exceptions() can paper over
the issue since it updates the cached value.  That evidently
happens most of the time explaining how this bug got through.

Fix the bug through resynchronizing the FRED RSP0 MSR with its
per-CPU cache in cpu_init_fred_exceptions().

Fixes: fe85ee3919 ("x86/entry: Set FRED RSP0 on return to userspace instead of context switch")
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250110174639.1250829-1-xin%40zytor.com
2025-01-14 14:16:36 -08:00
Linus Torvalds
c45323b756 Merge tag 'mm-hotfixes-stable-2025-01-13-00-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "18 hotfixes. 11 are cc:stable. 13 are MM and 5 are non-MM.

  All patches are singletons - please see the relevant changelogs for
  details"

* tag 'mm-hotfixes-stable-2025-01-13-00-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  fs/proc: fix softlockup in __read_vmcore (part 2)
  mm: fix assertion in folio_end_read()
  mm: vmscan : pgdemote vmstat is not getting updated when MGLRU is enabled.
  vmstat: disable vmstat_work on vmstat_cpu_down_prep()
  zram: fix potential UAF of zram table
  selftests/mm: set allocated memory to non-zero content in cow test
  mm: clear uffd-wp PTE/PMD state on mremap()
  module: fix writing of livepatch relocations in ROX text
  mm: zswap: properly synchronize freeing resources during CPU hotunplug
  Revert "mm: zswap: fix race between [de]compression and CPU hotunplug"
  hugetlb: fix NULL pointer dereference in trace_hugetlbfs_alloc_inode
  mm: fix div by zero in bdi_ratio_from_pages
  x86/execmem: fix ROX cache usage in Xen PV guests
  filemap: avoid truncating 64-bit offset to 32 bits
  tools: fix atomic_set() definition to set the value correctly
  mm/mempolicy: count MPOL_WEIGHTED_INTERLEAVE to "interleave_hit"
  scripts/decode_stacktrace.sh: fix decoding of lines with an additional info
  mm/kmemleak: fix percpu memory leak detection failure
2025-01-13 09:03:18 -08:00
Peter Zijlstra
a9bbe34133 x86: Disable EXECMEM_ROX support
The whole module_writable_address() nonsense made a giant mess of
alternative.c, not to mention it still contains bugs -- notable some of the
CFI variants crash and burn.

Mike has been working on patches to clean all this up again, but given the
current state of things, this stuff just isn't ready.

Disable for now, lets try again next cycle.

Fixes: 5185e7f9f3 ("x86/module: enable ROX caches for module text on 64 bit")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250113112934.GA8385@noisy.programming.kicks-ass.net
2025-01-13 12:42:51 +01:00
Juergen Gross
59f5910847 x86/execmem: fix ROX cache usage in Xen PV guests
The recently introduced ROX cache for modules is assuming large page
support in 64-bit mode without testing the related feature bit.  This
results in breakage when running as a Xen PV guest, as in this mode large
pages are not supported.

Fix that by testing the X86_FEATURE_PSE capability when deciding whether
to enable the ROX cache.

Link: https://lkml.kernel.org/r/20250103065631.26459-1-jgross@suse.com
Fixes: 2e45474ab1 ("execmem: add support for cache of large ROX pages")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-12 19:03:35 -08:00
Linus Torvalds
be54864552 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "The largest part here is for KVM/PPC, where a NULL pointer dereference
  was introduced in the 6.13 merge window and is now fixed.

  There's some "holiday-induced lateness", as the s390 submaintainer put
  it, but otherwise things looks fine.

  s390:

   - fix a latent bug when the kernel is compiled in debug mode

   - two small UCONTROL fixes and their selftests

  arm64:

   - always check page state in hyp_ack_unshare()

   - align set_id_regs selftest with the fact that ASIDBITS field is RO

   - various vPMU fixes for bugs that only affect nested virt

  PPC e500:

   - Fix a mostly impossible (but just wrong) case where IRQs were never
     re-enabled

   - Observe host permissions instead of mapping readonly host pages as
     guest-writable. This fixes a NULL-pointer dereference in 6.13

   - Replace brittle VMA-based attempts at building huge shadow TLB
     entries with PTE lookups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: e500: perform hugepage check after looking up the PFN
  KVM: e500: map readonly host pages for read
  KVM: e500: track host-writability of pages
  KVM: e500: use shadow TLB entry as witness for writability
  KVM: e500: always restore irqs
  KVM: s390: selftests: Add has device attr check to uc_attr_mem_limit selftest
  KVM: s390: selftests: Add ucontrol gis routing test
  KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs
  KVM: s390: selftests: Add ucontrol flic attr selftests
  KVM: s390: Reject setting flic pfault attributes on ucontrol VMs
  KVM: s390: vsie: fix virtual/physical address in unpin_scb()
  KVM: arm64: Only apply PMCR_EL0.P to the guest range of counters
  KVM: arm64: nv: Reload PMU events upon MDCR_EL2.HPME change
  KVM: arm64: Use KVM_REQ_RELOAD_PMU to handle PMCR_EL0.E change
  KVM: arm64: Add unified helper for reprogramming counters by mask
  KVM: arm64: Always check the state from hyp_ack_unshare()
  KVM: arm64: Fix set_id_regs selftest for ASIDBITS becoming unwritable
2025-01-12 12:04:53 -08:00
Linus Torvalds
f31acaef55 Merge tag 'x86_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Check whether shadow stack is active before using the ptrace regset
   getter

 - Remove a wrong BUG_ON in the early static call code which breaks Xen
   PVH when booting as dom0

* tag 'x86_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Ensure shadow stack is active before "getting" registers
  x86/static-call: Remove early_boot_irqs_disabled check to fix Xen PVH dom0
2025-01-12 11:55:48 -08:00
Paolo Bonzini
a5546c2f0d Merge tag 'kvm-s390-master-6.13-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: three small bugfixes

Fix a latent bug when the kernel is compiled in debug mode.
Two small UCONTROL fixes and their selftests.
2025-01-12 12:51:05 +01:00
Paolo Bonzini
5c99a684c9 Merge tag 'kvmarm-fixes-6.13-3' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.13, part #3

 - Always check page state in hyp_ack_unshare()

 - Align set_id_regs selftest with the fact that ASIDBITS field is RO

 - Various vPMU fixes for bugs that only affect nested virt
2025-01-12 12:50:39 +01:00
Paolo Bonzini
55f4db79c4 KVM: e500: perform hugepage check after looking up the PFN
e500 KVM tries to bypass __kvm_faultin_pfn() in order to map VM_PFNMAP
VMAs as huge pages.  This is a Bad Idea because VM_PFNMAP VMAs could
become noncontiguous as a result of callsto remap_pfn_range().

Instead, use the already existing host PTE lookup to retrieve a
valid host-side mapping level after __kvm_faultin_pfn() has
returned.  Then find the largest size that will satisfy the
guest's request while staying within a single host PTE.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12 10:53:00 +01:00
Paolo Bonzini
03b755b2aa KVM: e500: map readonly host pages for read
The new __kvm_faultin_pfn() function is upset by the fact that e500 KVM
ignores host page permissions - __kvm_faultin requires a "writable"
outgoing argument, but e500 KVM is nonchalantly passing NULL.

If the host page permissions do not include writability, the shadow
TLB entry is forcibly mapped read-only.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12 10:35:45 +01:00
Paolo Bonzini
f2104bf22f KVM: e500: track host-writability of pages
Add the possibility of marking a page so that the UW and SW bits are
force-cleared.  This is stored in the private info so that it persists
across multiple calls to kvmppc_e500_setup_stlbe.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12 10:35:41 +01:00
Paolo Bonzini
e97fbb43fb KVM: e500: use shadow TLB entry as witness for writability
kvmppc_e500_ref_setup is returning whether the guest TLB entry is writable,
which is than passed to kvm_release_faultin_page.  This makes little sense
for two reasons: first, because the function sets up the private data for
the page and the return value feels like it has been bolted on the side;
second, because what really matters is whether the _shadow_ TLB entry is
writable.  If it is not writable, the page can be released as non-dirty.
Shift from using tlbe_is_writable(gtlbe) to doing the same check on
the shadow TLB entry.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12 10:35:36 +01:00
Paolo Bonzini
87ecfdbc69 KVM: e500: always restore irqs
If find_linux_pte fails, IRQs will not be restored.  This is unlikely
to happen in practice since it would have been reported as hanging
hosts, but it should of course be fixed anyway.

Cc: stable@vger.kernel.org
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-12 10:34:44 +01:00
Linus Torvalds
da60d1547d Merge tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
 "Over the Christmas break a couple of devicetree fixes came in for
  Rockchips, Qualcomm and NXP/i.MX. These add some missing board
  specific properties, address build time warnings,

  The USB/TOG supoprt on X1 Elite regressed, so two earlier DT changes
  get reverted for now.

  Aside from the devicetree fixes, there is One build fix for the stm32
  firewall driver, and a defconfig change to enable SPDIF support for
  i.MX"

* tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  firewall: remove misplaced semicolon from stm32_firewall_get_firewall
  arm64: dts: rockchip: add hevc power domain clock to rk3328
  arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S
  arm64: dts: qcom: sa8775p: fix the secure device bootup issue
  Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers"
  Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports"
  arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a
  Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports"
  ARM: dts: imxrt1050: Fix clocks for mmc
  ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF
  arm64: dts: imx95: correct the address length of netcmix_blk_ctrl
  arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai
  arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B
  arm64: dts: rockchip: add reset-names for combphy on rk3568
  arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions
2025-01-11 10:42:05 -08:00
Linus Torvalds
f8c6263347 Merge tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - a handful of selftest fixes

 - fix a memory leak in relocation processing during module loading

 - avoid sleeping in die()

 - fix kprobe instruction slot address calculations

 - fix DT node reference leak in SBI idle probing

 - avoid initializing out of bounds pages on sparse vmemmap systems with
   a gap at the start of their physical memory map

 - fix backtracing through exceptions

 - _Q_PENDING_LOOPS is now defined whenever QUEUED_SPINLOCKS=y

 - local labels in entry.S are now marked with ".L", which prevents them
   from trashing backtraces

 - a handful of fixes for SBI-based performance counters

* tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  drivers/perf: riscv: Do not allow invalid raw event config
  drivers/perf: riscv: Return error for default case
  drivers/perf: riscv: Fix Platform firmware event data
  tools: selftests: riscv: Add test count for vstate_prctl
  tools: selftests: riscv: Add pass message for v_initval_nolibc
  riscv: use local label names instead of global ones in assembly
  riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition
  riscv: stacktrace: fix backtracing through exceptions
  riscv: mm: Fix the out of bound issue of vmemmap address
  cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
  riscv: kprobes: Fix incorrect address calculation
  riscv: Fix sleeping in invalid context in die()
  riscv: module: remove relocation_head rel_entry member allocation
  riscv: selftests: Fix warnings pointer masking test
2025-01-10 10:50:30 -08:00
Arnd Bergmann
5391c5d8e6 Merge tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes
Fixed card-detect on one board and some missing properties added.

* tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  arm64: dts: rockchip: add hevc power domain clock to rk3328
  arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S
  arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B
  arm64: dts: rockchip: add reset-names for combphy on rk3568

Link: https://lore.kernel.org/r/2914560.yaVYbkx8dN@diego
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-10 18:20:59 +01:00
Arnd Bergmann
00f63a0f5f Merge tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.13:

- Add fallback for i.MX8QM ESAI compatible to fix a dt-schema warning
  caused by bindings update
- Fix uSDHC1 clock for i.MX RT1050
- Enable SND_SOC_SPDIF in imx_v6_v7_defconfig to fix a regression caused
  by an i.MX6 SPDIF sound card change in DT
- Fix address length of i.MX95 netcmix_blk_ctrl

* tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imxrt1050: Fix clocks for mmc
  ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF
  arm64: dts: imx95: correct the address length of netcmix_blk_ctrl
  arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai

Link: https://lore.kernel.org/r/Z3Jf9zbv/xH3YzuB@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-09 22:56:15 +01:00
Arnd Bergmann
627522c8bf Merge tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm Arm64 DeviceTree fixes for v6.13

Revert the enablement of OTG support on primary and secondary USB Type-C
controllers of X1 Elite, for now, as this broke support for USB hotplug.

Disable the TPDM DCC device on SA8775P, as this is inaccessible per
current firmware configuration. Also correct the PCIe "addr_space"
region to enable larger BAR sizes.

Also fix the address space of PCIe6a found in X1 Elite.

* tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: dts: qcom: sa8775p: fix the secure device bootup issue
  Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers"
  Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports"
  arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a
  Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports"
  arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions

Link: https://lore.kernel.org/r/20250103024945.4649-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-09 22:54:53 +01:00
Palmer Dabbelt
6f6ecce59d Merge patch series "SBI PMU event related fixes"
Atish Patra <atishp@rivosinc.com> says:

Here are two minor improvement/fixes in the PMU event path. The first patch
was part of the series[1]. The 2nd patch was suggested during the series
review.

While the series can only be merged once SBI v3.0 is frozen, these two
patches can be independent of SBI v3.0 and can be merged sooner. Hence, these
two patches are sent as a separate series.

* b4-shazam-merge:
  drivers/perf: riscv: Do not allow invalid raw event config
  drivers/perf: riscv: Return error for default case
  drivers/perf: riscv: Fix Platform firmware event data

Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-0-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:12 -08:00
Atish Patra
fc58db9aeb drivers/perf: riscv: Fix Platform firmware event data
Platform firmware event data field is allowed to be 62 bits for
Linux as uppper most two bits are reserved to indicate SBI fw or
platform specific firmware events.
However, the event data field is masked as per the hardware raw
event mask which is not correct.

Fix the platform firmware event data field with proper mask.

Fixes: f0c9363db2 ("perf/riscv-sbi: Add platform specific firmware event handling")

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:08 -08:00
Peter Geis
3699f2c43e arm64: dts: rockchip: add hevc power domain clock to rk3328
There is a race condition at startup between disabling power domains not
used and disabling clocks not used on the rk3328. When the clocks are
disabled first, the hevc power domain fails to shut off leading to a
splat of failures. Add the hevc core clock to the rk3328 power domain
node to prevent this condition.

rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-.... }
1087 jiffies s: 89 root: 0x8/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 0 to CPUs 3:
NMI backtrace for cpu 3
CPU: 3 UID: 0 PID: 86 Comm: kworker/3:3 Not tainted 6.12.0-rc5+ #53
Hardware name: Firefly ROC-RK3328-CC (DT)
Workqueue: pm genpd_power_off_work_fn
pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : regmap_unlock_spinlock+0x18/0x30
lr : regmap_read+0x60/0x88
sp : ffff800081123c00
x29: ffff800081123c00 x28: ffff2fa4c62cad80 x27: 0000000000000000
x26: ffffd74e6e660eb8 x25: ffff2fa4c62cae00 x24: 0000000000000040
x23: ffffd74e6d2f3ab8 x22: 0000000000000001 x21: ffff800081123c74
x20: 0000000000000000 x19: ffff2fa4c0412000 x18: 0000000000000000
x17: 77202c31203d2065 x16: 6c6469203a72656c x15: 6c6f72746e6f632d
x14: 7265776f703a6e6f x13: 2063766568206e69 x12: 616d6f64202c3431
x11: 347830206f742030 x10: 3430303034783020 x9 : ffffd74e6c7369e0
x8 : 3030316666206e69 x7 : 205d383738353733 x6 : 332e31202020205b
x5 : ffffd74e6c73fc88 x4 : ffffd74e6c73fcd4 x3 : ffffd74e6c740b40
x2 : ffff800080015484 x1 : 0000000000000000 x0 : ffff2fa4c0412000
Call trace:
regmap_unlock_spinlock+0x18/0x30
rockchip_pmu_set_idle_request+0xac/0x2c0
rockchip_pd_power+0x144/0x5f8
rockchip_pd_power_off+0x1c/0x30
_genpd_power_off+0x9c/0x180
genpd_power_off.part.0.isra.0+0x130/0x2a8
genpd_power_off_work_fn+0x6c/0x98
process_one_work+0x170/0x3f0
worker_thread+0x290/0x4a8
kthread+0xec/0xf8
ret_from_fork+0x10/0x20
rockchip-pm-domain ff100000.syscon:power-controller: failed to get ack on domain 'hevc', val=0x88220

Fixes: 52e02d377a ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/20241214224339.24674-1-pgwipeout@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-09 16:22:49 +01:00
Clément Léger
5cd900b8b7 riscv: use local label names instead of global ones in assembly
Local labels should be prefix by '.L' or they'll be exported in the
symbol table. Additionally, this messes up the backtrace by displaying
an incorrect symbol:

  ...
  [   12.751810] [<ffffffff80441628>] _copy_from_user+0x28/0xc2
  [   12.752035] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   12.752310] [<ffffffff80a033e8>] do_trap_load_misaligned+0x24/0xee
  [   12.752596] [<ffffffff80a0dcae>] _new_vmalloc_restore_context_a0+0xc2/0xce

After:
  ...
  [   10.243916] [<ffffffff804415e4>] _copy_from_user+0x28/0xc2
  [   10.244026] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   10.244150] [<ffffffff80a033a0>] do_trap_load_misaligned+0x24/0xee
  [   10.244268] [<ffffffff80a0dc66>] handle_exception+0x146/0x152

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 503638e0ba ("riscv: Stop emitting preventive sfence.vma for new vmalloc mappings")
Link: https://lore.kernel.org/r/20250103141814.508865-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:46:14 -08:00
Guo Ren
40e6073e76 riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition
When CONFIG_RISCV_QUEUED_SPINLOCKS=y, the _Q_PENDING_LOOPS
definition is missing. Add the _Q_PENDING_LOOPS definition for
pure qspinlock usage.

Fixes: ab83647fad ("riscv: Add qspinlock support")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241215135252.201983-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:46:01 -08:00
Clément Léger
51356ce60e riscv: stacktrace: fix backtracing through exceptions
Prior to commit 5d5fc33ce5 ("riscv: Improve exception and system call
latency"), backtrace through exception worked since ra was filled with
ret_from_exception symbol address and the stacktrace code checked 'pc' to
be equal to that symbol. Now that handle_exception uses regular 'call'
instructions, this isn't working anymore and backtrace stops at
handle_exception(). Since there are multiple call site to C code in the
exception handling path, rather than checking multiple potential return
addresses, add a new symbol at the end of exception handling and check pc
to be in that range.

Fixes: 5d5fc33ce5 ("riscv: Improve exception and system call latency")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241209155714.1239665-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:45:49 -08:00
Xu Lu
f754f27e98 riscv: mm: Fix the out of bound issue of vmemmap address
In sparse vmemmap model, the virtual address of vmemmap is calculated as:
((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)).
And the struct page's va can be calculated with an offset:
(vmemmap + (pfn)).

However, when initializing struct pages, kernel actually starts from the
first page from the same section that phys_ram_base belongs to. If the
first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then
we get an va below VMEMMAP_START when calculating va for it's struct page.

For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the
first page in the same section is actually pfn 0x80000. During
init_unavailable_range(), we will initialize struct page for pfn 0x80000
with virtual address ((struct page *)VMEMMAP_START - 0x2000), which is
below VMEMMAP_START as well as PCI_IO_END.

This commit fixes this bug by introducing a new variable
'vmemmap_start_pfn' which is aligned with memory section size and using
it to calculate vmemmap address instead of phys_ram_base.

Fixes: a11dd49dcb ("riscv: Sparse-Memory/vmemmap out-of-bounds fix")
Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:45:34 -08:00
Nam Cao
13134cc949 riscv: kprobes: Fix incorrect address calculation
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are
multiplied by four. This is clearly undesirable for this case.

Cast it to (void *) first before any calculation.

Below is a sample before/after. The dumped memory is two kprobe slots, the
first slot has

  - c.addiw a0, 0x1c (0x7125)
  - ebreak           (0x00100073)

and the second slot has:

  - c.addiw a0, -4   (0x7135)
  - ebreak           (0x00100073)

Before this patch:

(gdb) x/16xh 0xff20000000135000
0xff20000000135000:	0x7125	0x0000	0x0000	0x0000	0x7135	0x0010	0x0000	0x0000
0xff20000000135010:	0x0073	0x0010	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

After this patch:

(gdb) x/16xh 0xff20000000125000
0xff20000000125000:	0x7125	0x0073	0x0010	0x0000	0x7135	0x0073	0x0010	0x0000
0xff20000000125010:	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

Fixes: b1756750a3 ("riscv: kprobes: Use patch_text_nosync() for insn slots")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:39:39 -08:00
Nam Cao
6a97f4118a riscv: Fix sleeping in invalid context in die()
die() can be called in exception handler, and therefore cannot sleep.
However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled.
That causes the following warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex
preempt_count: 110001, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
    dump_backtrace+0x1c/0x24
    show_stack+0x2c/0x38
    dump_stack_lvl+0x5a/0x72
    dump_stack+0x14/0x1c
    __might_resched+0x130/0x13a
    rt_spin_lock+0x2a/0x5c
    die+0x24/0x112
    do_trap_insn_illegal+0xa0/0xea
    _new_vmalloc_restore_context_a0+0xcc/0xd8
Oops - illegal instruction [#1]

Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT
enabled.

Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:23:17 -08:00
Clément Léger
03f0b54853 riscv: module: remove relocation_head rel_entry member allocation
relocation_head's list_head member, rel_entry, doesn't need to be
allocated, its storage can just be part of the allocated relocation_head.
Remove the pointer which allows to get rid of the allocation as well as
an existing memory leak found by Kai Zhang using kmemleak.

Fixes: 8fd6c51423 ("riscv: Add remaining module relocations")
Reported-by: Kai Zhang <zhangkai@iscas.ac.cn>
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241128081636.3620468-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:22:52 -08:00
Anton Kirilov
95147bb42b arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S
Fix the SD card detection on FriendlyElec NanoPi R6C/R6S boards.

Signed-off-by: Anton Kirilov <anton.kirilov@arm.com>
Link: https://lore.kernel.org/r/20241219113145.483205-1-anton.kirilov@arm.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-08 13:18:05 +01:00
Rick Edgecombe
a9d9c33132 x86/fpu: Ensure shadow stack is active before "getting" registers
The x86 shadow stack support has its own set of registers. Those registers
are XSAVE-managed, but they are "supervisor state components" which means
that userspace can not touch them with XSAVE/XRSTOR.  It also means that
they are not accessible from the existing ptrace ABI for XSAVE state.
Thus, there is a new ptrace get/set interface for it.

The regset code that ptrace uses provides an ->active() handler in
addition to the get/set ones. For shadow stack this ->active() handler
verifies that shadow stack is enabled via the ARCH_SHSTK_SHSTK bit in the
thread struct. The ->active() handler is checked from some call sites of
the regset get/set handlers, but not the ptrace ones. This was not
understood when shadow stack support was put in place.

As a result, both the set/get handlers can be called with
XFEATURE_CET_USER in its init state, which would cause get_xsave_addr() to
return NULL and trigger a WARN_ON(). The ssp_set() handler luckily has an
ssp_active() check to avoid surprising the kernel with shadow stack
behavior when the kernel is not ready for it (ARCH_SHSTK_SHSTK==0). That
check just happened to avoid the warning.

But the ->get() side wasn't so lucky. It can be called with shadow stacks
disabled, triggering the warning in practice, as reported by Christina
Schimpe:

WARNING: CPU: 5 PID: 1773 at arch/x86/kernel/fpu/regset.c:198 ssp_get+0x89/0xa0
[...]
Call Trace:
<TASK>
? show_regs+0x6e/0x80
? ssp_get+0x89/0xa0
? __warn+0x91/0x150
? ssp_get+0x89/0xa0
? report_bug+0x19d/0x1b0
? handle_bug+0x46/0x80
? exc_invalid_op+0x1d/0x80
? asm_exc_invalid_op+0x1f/0x30
? __pfx_ssp_get+0x10/0x10
? ssp_get+0x89/0xa0
? ssp_get+0x52/0xa0
__regset_get+0xad/0xf0
copy_regset_to_user+0x52/0xc0
ptrace_regset+0x119/0x140
ptrace_request+0x13c/0x850
? wait_task_inactive+0x142/0x1d0
? do_syscall_64+0x6d/0x90
arch_ptrace+0x102/0x300
[...]

Ensure that shadow stacks are active in a thread before looking them up
in the XSAVE buffer. Since ARCH_SHSTK_SHSTK and user_ssp[SHSTK_EN] are
set at the same time, the active check ensures that there will be
something to find in the XSAVE buffer.

[ dhansen: changelog/subject tweaks ]

Fixes: 2fab02b25a ("x86: Add PTRACE interface for shadow stack")
Reported-by: Christina Schimpe <christina.schimpe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Christina Schimpe <christina.schimpe@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250107233056.235536-1-rick.p.edgecombe%40intel.com
2025-01-07 15:55:51 -08:00
Christoph Schlameuss
5021fd77d6 KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs
Prevent null pointer dereference when processing
KVM_IRQ_ROUTING_S390_ADAPTER routing entries.
The ioctl cannot be processed for ucontrol VMs.

Fixes: f65470661f ("KVM: s390/interrupt: do not pin adapter interrupt pages")
Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Tested-by: Hariharan Mari <hari55@linux.ibm.com>
Reviewed-by: Hariharan Mari <hari55@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20241216092140.329196-4-schlameuss@linux.ibm.com
Message-ID: <20241216092140.329196-4-schlameuss@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-01-07 16:36:11 +01:00
Christoph Schlameuss
df989238fa KVM: s390: Reject setting flic pfault attributes on ucontrol VMs
Prevent null pointer dereference when processing the
KVM_DEV_FLIC_APF_ENABLE and KVM_DEV_FLIC_APF_DISABLE_WAIT ioctls in the
interrupt controller.

Fixes: 3c038e6be0 ("KVM: async_pf: Async page fault support on s390")
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Reviewed-by: Hariharan Mari <hari55@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20241216092140.329196-2-schlameuss@linux.ibm.com
Message-ID: <20241216092140.329196-2-schlameuss@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-01-07 16:36:10 +01:00
Claudio Imbrenda
984aaf6161 KVM: s390: vsie: fix virtual/physical address in unpin_scb()
In commit 77b5334115 ("KVM: s390: VSIE: sort out virtual/physical
address in pin_guest_page"), only pin_scb() has been updated. This
means that in unpin_scb() a virtual address was still used directly as
physical address without conversion. The resulting physical address is
obviously wrong and most of the time also invalid.

Since commit d0ef8d9fbe ("KVM: s390: Use kvm_release_page_dirty() to
unpin "struct page" memory"), unpin_guest_page() will directly use
kvm_release_page_dirty(), instead of kvm_release_pfn_dirty(), which has
since been removed.

One of the checks that were performed by kvm_release_pfn_dirty() was to
verify whether the page was valid at all, and silently return
successfully without doing anything if the page was invalid.

When kvm_release_pfn_dirty() was still used, the invalid page was thus
silently ignored. Now the check is gone and the result is an Oops.
This also means that when running with a V!=R kernel, the page was not
released, causing a leak.

The solution is simply to add the missing virt_to_phys().

Fixes: 77b5334115 ("KVM: s390: VSIE: sort out virtual/physical address in pin_guest_page")
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Link: https://lore.kernel.org/r/20241210083948.23963-1-imbrenda@linux.ibm.com
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20241210083948.23963-1-imbrenda@linux.ibm.com>
2025-01-07 16:36:10 +01:00
Linus Torvalds
ee063c23e4 Merge tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fixlet from Dinh Nguyen:

 - Use str_yes_no() helper function

* tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  nios2: Use str_yes_no() helper in show_cpuinfo()
2025-01-03 14:16:25 -08:00
Linus Torvalds
f274fffbc2 Merge tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:

 - A small Kconfig fixup for the i.MX.

   In principle this could come in from the SoC tree but the bug was
   introduced from the pin control tree so let's fix it from here.

 - Fix a sleep in atomic context in the MCP23xxx GPIO expander by
   disabling the regmap locking and using explicit mutex locks.

* tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking
  ARM: imx: Re-introduce the PINCTRL selection
2025-01-03 10:57:57 -08:00
Andrew Cooper
5cc2db3712 x86/static-call: Remove early_boot_irqs_disabled check to fix Xen PVH dom0
__static_call_update_early() has a check for early_boot_irqs_disabled, but
is used before early_boot_irqs_disabled is set up in start_kernel().

Xen PV has always special cased early_boot_irqs_disabled, but Xen PVH does
not and falls over the BUG when booting as dom0.

It is very suspect that early_boot_irqs_disabled starts as 0, becomes 1 for
a time, then becomes 0 again, but as this needs backporting to fix a
breakage in a security fix, dropping the BUG_ON() is the far safer option.

Fixes: 0ef8047b73 ("x86/static-call: provide a way to do very early static-call updates")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219620
Reported-by: Alex Zenla <alex@edera.dev>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Alex Zenla <alex@edera.dev>
Link: https://lore.kernel.org/r/20241221211046.6475-1-andrew.cooper3@citrix.com
2025-01-02 17:11:29 +01:00
Linus Torvalds
6cbc4b29eb Merge tag 'x86-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:

 - Fix a hang in the "kernel IBT no ENDBR" self-test that may trigger
   on FRED systems, caused by incomplete FRED state cleanup in the
   #CP fault handler

 - Improve TDX (Coco VM) guest unrecoverable error handling to not
   potentially leak decrypted memory

* tag 'x86-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  virt: tdx-guest: Just leak decrypted memory on unrecoverable errors
  x86/fred: Clear WFE in missing-ENDBRANCH #CPs
2024-12-29 10:16:41 -08:00
Linus Torvalds
f65832a32f Merge tag 'perf-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fixes from Ingo Molnar:

 - Fix Intel Lunar Lake build-in event definitions

 - Fall back to (compatible) legacy features on new Intel PEBS format v6
   hardware

 - Enable uncore support on Intel Clearwater Forest CPUs, which is the
   same as the existing Sierra Forest uncore driver

* tag 'perf-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix bitmask of OCR and FRONTEND events for LNC
  perf/x86/intel/ds: Add PEBS format 6
  perf/x86/intel/uncore: Add Clearwater Forest support
2024-12-29 10:14:08 -08:00
Xin Li (Intel)
dc81e556f2 x86/fred: Clear WFE in missing-ENDBRANCH #CPs
An indirect branch instruction sets the CPU indirect branch tracker
(IBT) into WAIT_FOR_ENDBRANCH (WFE) state and WFE stays asserted
across the instruction boundary.  When the decoder finds an
inappropriate instruction while WFE is set ENDBR, the CPU raises a #CP
fault.

For the "kernel IBT no ENDBR" selftest where #CPs are deliberately
triggered, the WFE state of the interrupted context needs to be
cleared to let execution continue.  Otherwise when the CPU resumes
from the instruction that just caused the previous #CP, another
missing-ENDBRANCH #CP is raised and the CPU enters a dead loop.

This is not a problem with IDT because it doesn't preserve WFE and
IRET doesn't set WFE.  But FRED provides space on the entry stack
(in an expanded CS area) to save and restore the WFE state, thus the
WFE state is no longer clobbered, so software must clear it.

Clear WFE to avoid dead looping in ibt_clear_fred_wfe() and the
!ibt_fatal code path when execution is allowed to continue.

Clobbering WFE in any other circumstance is a security-relevant bug.

[ dhansen: changelog rewording ]

Fixes: a5f6c2ace9 ("x86/shstk: Add user control-protection fault handler")
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241113175934.3897541-1-xin%40zytor.com
2024-12-29 10:18:10 +01:00