When running as a Xen PV guests commit eed9a328aa ("mm: x86: add
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation in
pmdp_test_and_clear_young():
BUG: unable to handle page fault for address: ffff8880083374d0
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065
Oops: 0003 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1
RIP: e030:pmdp_test_and_clear_young+0x25/0x40
This happens because the Xen hypervisor can't emulate direct writes to
page table entries other than PTEs.
This can easily be fixed by introducing arch_has_hw_nonleaf_pmd_young()
similar to arch_has_hw_pte_young() and test that instead of
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG.
Link: https://lkml.kernel.org/r/20221123064510.16225-1-jgross@suse.com
Fixes: eed9a328aa ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Yu Zhao <yuzhao@google.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: David Hildenbrand <david@redhat.com> [core changes]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
During discussions of this series [1], it was suggested that hugetlb
handling code in follow_page_mask could be simplified. At the beginning
of follow_page_mask, there currently is a call to follow_huge_addr which
'may' handle hugetlb pages. ia64 is the only architecture which provides
a follow_huge_addr routine that does not return error. Instead, at each
level of the page table a check is made for a hugetlb entry. If a hugetlb
entry is found, a call to a routine associated with that entry is made.
Currently, there are two checks for hugetlb entries at each page table
level. The first check is of the form:
if (p?d_huge())
page = follow_huge_p?d();
the second check is of the form:
if (is_hugepd())
page = follow_huge_pd().
We can replace these checks, as well as the special handling routines such
as follow_huge_p?d() and follow_huge_pd() with a single routine to handle
hugetlb vmas.
A new routine hugetlb_follow_page_mask is called for hugetlb vmas at the
beginning of follow_page_mask. hugetlb_follow_page_mask will use the
existing routine huge_pte_offset to walk page tables looking for hugetlb
entries. huge_pte_offset can be overwritten by architectures, and already
handles special cases such as hugepd entries.
[1] https://lore.kernel.org/linux-mm/cover.1661240170.git.baolin.wang@linux.alibaba.com/
[mike.kravetz@oracle.com: remove vma (pmd sharing) per Peter]
Link: https://lkml.kernel.org/r/20221028181108.119432-1-mike.kravetz@oracle.com
[mike.kravetz@oracle.com: remove left over hugetlb_vma_unlock_read()]
Link: https://lkml.kernel.org/r/20221030225825.40872-1-mike.kravetz@oracle.com
Link: https://lkml.kernel.org/r/20220919021348.22151-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull perf fixes from Borislav Petkov:
- Add Cooper Lake's stepping to the PEBS guest/host events isolation
fixed microcode revisions checking quirk
- Update Icelake and Sapphire Rapids events constraints
- Use the standard energy unit for Sapphire Rapids in RAPL
- Fix the hw_breakpoint test to fail more graciously on !SMP configs
* tag 'perf_urgent_for_v6.1_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Add Cooper Lake stepping to isolation_ucodes[]
perf/x86/intel: Fix pebs event constraints for SPR
perf/x86/intel: Fix pebs event constraints for ICL
perf/x86/rapl: Use standard Energy Unit for SPR Dram RAPL domain
perf/hw_breakpoint: test: Skip the test if dependencies unmet
Pull x86 fixes from Borislav Petkov:
- Add new Intel CPU models
- Enforce that TDX guests are successfully loaded only on TDX hardware
where virtualization exception (#VE) delivery on kernel memory is
disabled because handling those in all possible cases is "essentially
impossible"
- Add the proper include to the syscall wrappers so that BTF can see
the real pt_regs definition and not only the forward declaration
* tag 'x86_urgent_for_v6.1_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Add several Intel server CPU model numbers
x86/tdx: Panic on bad configs that #VE on "private" memory access
x86/tdx: Prepare for using "INFO" call for a second purpose
x86/syscall: Include asm/ptrace.h in syscall_wrapper header
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix the pKVM stage-1 walker erronously using the stage-2 accessor
- Correctly convert vcpu->kvm to a hyp pointer when generating an
exception in a nVHE+MTE configuration
- Check that KVM_CAP_DIRTY_LOG_* are valid before enabling them
- Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE
- Document the boot requirements for FGT when entering the kernel at
EL1
x86:
- Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit()
- Make argument order consistent for kvcalloc()
- Userspace API fixes for DEBUGCTL and LBRs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix a typo about the usage of kvcalloc()
KVM: x86: Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit()
KVM: VMX: Ignore guest CPUID for host userspace writes to DEBUGCTL
KVM: VMX: Fold vmx_supported_debugctl() into vcpu_supported_debugctl()
KVM: VMX: Advertise PMU LBRs if and only if perf supports LBRs
arm64: booting: Document our requirements for fine grained traps with SME
KVM: arm64: Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE
KVM: Check KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL} prior to enabling them
KVM: arm64: Fix bad dereference on MTE-enabled systems
KVM: arm64: Use correct accessor to parse stage-1 PTEs
Pull xen fixes from Juergen Gross:
"One fix for silencing a smatch warning, and a small cleanup patch"
* tag 'for-linus-6.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: simplify sysenter and syscall setup
x86/xen: silence smatch warning in pmu_msr_chk_emulated()
* Fix the pKVM stage-1 walker erronously using the stage-2 accessor
* Correctly convert vcpu->kvm to a hyp pointer when generating
an exception in a nVHE+MTE configuration
* Check that KVM_CAP_DIRTY_LOG_* are valid before enabling them
* Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE
* Document the boot requirements for FGT when entering the kernel
at EL1
x86:
* Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit()
* Make argument order consistent for kvcalloc()
* Userspace API fixes for DEBUGCTL and LBRs
Pull arm64 fixes from Catalin Marinas:
- Avoid kprobe recursion when cortex_a76_erratum_1463225_debug_handler()
is not inlined (change to __always_inline).
- Fix the visibility of compat hwcaps, broken by recent changes to
consolidate the visibility of hwcaps and the user-space view of the
ID registers.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpufeature: Fix the visibility of compat hwcaps
arm64: entry: avoid kprobe recursion
Pull EFI fixes from Ard Biesheuvel:
- A pair of tweaks to the EFI random seed code so that externally
provided version of this config table are handled more robustly
- Another fix for the v6.0 EFI variable refactor that turned out to
break Apple machines which don't provide QueryVariableInfo()
- Add some guard rails to the EFI runtime service call wrapper so we
can recover from synchronous exceptions caused by firmware
* tag 'efi-fixes-for-v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
arm64: efi: Recover from synchronous exceptions occurring in firmware
efi: efivars: Fix variable writes with unsupported query_variable_store()
efi: random: Use 'ACPI reclaim' memory for random seed
efi: random: reduce seed size to 32 bytes
efi/tpm: Pass correct address to memblock_reserve
Pull ARM SoC fixes from Arnd Bergmann:
"There are not a lot of important fixes for the soc tree yet this time,
but it's time to upstream what I got so far:
- DT Fixes for Arm Juno and ST-Ericsson Ux500 to add missing critical
temperature points
- A number of fixes for the Arm SCMI firmware, addressing correctness
issues in the code, in particular error handling and resource
leaks.
- One error handling fix for the new i.MX93 power domain driver
- Several devicetree fixes for NXP i.MX6/8/9 and Layerscape chips,
fixing incorrect or missing DT properties for MDIO controller
nodes, CPLD, USB and regulators for various boards, as well as some
fixes for DT schema checks.
- MAINTAINERS file updates for HiSilicon LPC Bus and Broadcom git
URLs"
* tag 'soc-fixes-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (26 commits)
arm64: dts: juno: Add thermal critical trip points
firmware: arm_scmi: Fix deferred_tx_wq release on error paths
firmware: arm_scmi: Fix devres allocation device in virtio transport
firmware: arm_scmi: Make Rx chan_setup fail on memory errors
firmware: arm_scmi: Make tx_prepare time out eventually
firmware: arm_scmi: Suppress the driver's bind attributes
firmware: arm_scmi: Cleanup the core driver removal callback
MAINTAINERS: Update HiSilicon LPC BUS Driver maintainer
ARM: dts: ux500: Add trips to battery thermal zones
arm64: dts: ls208xa: specify clock frequencies for the MDIO controllers
arm64: dts: ls1088a: specify clock frequencies for the MDIO controllers
arm64: dts: lx2160a: specify clock frequencies for the MDIO controllers
soc: imx: imx93-pd: Fix the error handling path of imx93_pd_probe()
arm64: dts: imx93: correct gpio-ranges
arm64: dts: imx93: correct s4mu interrupt names
dt-bindings: power: gpcv2: add power-domains property
arm64: dts: imx8: correct clock order
ARM: dts: imx6dl-yapp4: Do not allow PM to switch PU regulator off on Q/QP
ARM: dts: imx6qdl-gw59{10,13}: fix user pushbutton GPIO offset
arm64: dts: imx8mn: Correct the usb power domain
...
Commit 237405ebef ("arm64: cpufeature: Force HWCAP to be based on the
sysreg visible to user-space") forced the hwcaps to use sanitised
user-space view of the id registers. However, the ID register structures
used to select few compat cpufeatures (vfp, crc32, ...) are masked and
hence such hwcaps do not appear in /proc/cpuinfo anymore for PER_LINUX32
personality.
Add the ID register structures explicitly and set the relevant entry as
visible. As these ID registers are now of type visible so make them
available in 64-bit userspace by making necessary changes in register
emulation logic and documentation.
While at it, update the comment for structure ftr_generic_32bits[] which
lists the ID register that use it.
Fixes: 237405ebef ("arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space")
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Link: https://lore.kernel.org/r/20221103082232.19189-1-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull powerpc fixes from Michael Ellerman:
- Fix an endian thinko in the asm-generic compat_arg_u64() which led to
syscall arguments being swapped for some compat syscalls.
- Fix syscall wrapper handling of syscalls with 64-bit arguments on
32-bit kernels, which led to syscall arguments being misplaced.
- A build fix for amdgpu on Book3E with AltiVec disabled.
Thanks to Andreas Schwab, Christian Zigotzky, and Arnd Bergmann.
* tag 'powerpc-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/32: Select ARCH_SPLIT_ARG64
powerpc/32: fix syscall wrappers with 64-bit arguments
asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual()
powerpc/64e: Fix amdgpu build on Book3E w/o AltiVec
Unlike x86, which has machinery to deal with page faults that occur
during the execution of EFI runtime services, arm64 has nothing like
that, and a synchronous exception raised by firmware code brings down
the whole system.
With more EFI based systems appearing that were not built to run Linux
(such as the Windows-on-ARM laptops based on Qualcomm SOCs), as well as
the introduction of PRM (platform specific firmware routines that are
callable just like EFI runtime services), we are more likely to run into
issues of this sort, and it is much more likely that we can identify and
work around such issues if they don't bring down the system entirely.
Since we already use a EFI runtime services call wrapper in assembler,
we can quite easily add some code that captures the execution state at
the point where the call is made, allowing us to revert to this state
and proceed execution if the call triggered a synchronous exception.
Given that the kernel and the firmware don't share any data structures
that could end up in an indeterminate state, we can happily continue
running, as long as we mark the EFI runtime services as unavailable from
that point on.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
kvm_zap_gfn_range() must be called in an SRCU read-critical section, but
there is no SRCU annotation in __kvm_set_or_clear_apicv_inhibit(). This
can lead to the following warning via
kvm_arch_vcpu_ioctl_set_guest_debug() if a Shadow MMU is in use (TDP
MMU disabled or nesting):
[ 1416.659809] =============================
[ 1416.659810] WARNING: suspicious RCU usage
[ 1416.659839] 6.1.0-dbg-DEV #1 Tainted: G S I
[ 1416.659853] -----------------------------
[ 1416.659854] include/linux/kvm_host.h:954 suspicious rcu_dereference_check() usage!
[ 1416.659856]
...
[ 1416.659904] dump_stack_lvl+0x84/0xaa
[ 1416.659910] dump_stack+0x10/0x15
[ 1416.659913] lockdep_rcu_suspicious+0x11e/0x130
[ 1416.659919] kvm_zap_gfn_range+0x226/0x5e0
[ 1416.659926] ? kvm_make_all_cpus_request_except+0x18b/0x1e0
[ 1416.659935] __kvm_set_or_clear_apicv_inhibit+0xcc/0x100
[ 1416.659940] kvm_arch_vcpu_ioctl_set_guest_debug+0x350/0x390
[ 1416.659946] kvm_vcpu_ioctl+0x2fc/0x620
[ 1416.659955] __se_sys_ioctl+0x77/0xc0
[ 1416.659962] __x64_sys_ioctl+0x1d/0x20
[ 1416.659965] do_syscall_64+0x3d/0x80
[ 1416.659969] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Always take the KVM SRCU read lock in __kvm_set_or_clear_apicv_inhibit()
to protect the GFN to memslot translation. The SRCU read lock is not
technically required when no Shadow MMUs are in use, since the TDP MMU
walks the paging structures from the roots and does not need to look up
GFN translations in the memslots, but make the SRCU locking
unconditional for simplicty.
In most cases, the SRCU locking is taken care of in the vCPU run loop,
but when called through other ioctls (such as KVM_SET_GUEST_DEBUG)
there is no srcu_read_lock.
Tested: ran tools/testing/selftests/kvm/x86_64/debug_regs on a DBG
build. This patch causes the suspicious RCU warning to disappear.
Note that the warning is hit in __kvm_zap_rmaps(), so
kvm_memslots_have_rmaps() must return true in order for this to
repro (i.e. the TDP MMU must be off or nesting in use.)
Reported-by: Greg Thelen <gthelen@google.com>
Fixes: 36222b117e ("KVM: x86: don't disable APICv memslot when inhibited")
Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20221102205359.1260980-1-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
xen_enable_sysenter() and xen_enable_syscall() can be simplified a lot.
While at it, switch to use cpu_feature_enabled() instead of
boot_cpu_has().
Signed-off-by: Juergen Gross <jgross@suse.com>
Commit 8714f7bcd3 ("xen/pv: add fault recovery control to pmu msr
accesses") introduced code resulting in a warning issued by the smatch
static checker, claiming to use an uninitialized variable.
This is a false positive, but work around the warning nevertheless.
Fixes: 8714f7bcd3 ("xen/pv: add fault recovery control to pmu msr accesses")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Armv8 Juno fix for v6.1
Just a single fix to add the missing critical points in the thermal
zones that has been mandatory in the binding but was enforced in the
code recently.
* tag 'juno-fix-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
arm64: dts: juno: Add thermal critical trip points
Link: https://lore.kernel.org/r/20221102140156.2758137-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull parisc architecture fixes from Helge Deller:
"This mostly handles oddities with the serial port 8250_gsc.c driver.
Although the name suggests it's just for serial ports on the GSC bus
(e.g. in older PA-RISC machines), it handles serial ports on PA-RISC
PCI devices (e.g. on the SuperIO chip) as well.
Thus this renames the driver to 8250_parisc and fixes the config
dependencies.
The other change is a cleanup on how the device IDs of devices in a
PA-RISC machine are shown at startup"
* tag 'parisc-for-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Avoid printing the hardware path twice
parisc: Export iosapic_serial_irq() symbol for serial port driver
MAINTAINERS: adjust entry after renaming parisc serial driver
parisc: Use signed char for hardware path in pdc.h
parisc/serial: Rename 8250_gsc.c to 8250_parisc.c
parisc: Make 8250_gsc driver dependend on CONFIG_PARISC
Ignore guest CPUID for host userspace writes to the DEBUGCTL MSR, KVM's
ABI is that setting CPUID vs. state can be done in any order, i.e. KVM
allows userspace to stuff MSRs prior to setting the guest's CPUID that
makes the new MSR "legal".
Keep the vmx_get_perf_capabilities() check for guest writes, even though
it's technically unnecessary since the vCPU's PERF_CAPABILITIES is
consulted when refreshing LBR support. A future patch will clean up
vmx_get_perf_capabilities() to avoid the RDMSR on every call, at which
point the paranoia will incur no meaningful overhead.
Note, prior to vmx_get_perf_capabilities() checking that the host fully
supports LBRs via x86_perf_get_lbr(), KVM effectively relied on
intel_pmu_lbr_is_enabled() to guard against host userspace enabling LBRs
on platforms without full support.
Fixes: c646236344 ("KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221006000314.73240-5-seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fold vmx_supported_debugctl() into vcpu_supported_debugctl(), its only
caller. Setting bits only to clear them a few instructions later is
rather silly, and splitting the logic makes things seem more complicated
than they actually are.
Opportunistically drop DEBUGCTLMSR_LBR_MASK now that there's a single
reference to the pair of bits. The extra layer of indirection provides
no meaningful value and makes it unnecessarily tedious to understand
what KVM is doing.
No functional change.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221006000314.73240-4-seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Advertise LBR support to userspace via MSR_IA32_PERF_CAPABILITIES if and
only if perf fully supports LBRs. Perf may disable LBRs (by zeroing the
number of LBRs) even on platforms the allegedly support LBRs, e.g. if
probing any LBR MSRs during setup fails.
Fixes: be635e34c2 ("KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES")
Reported-by: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221006000314.73240-3-seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All normal kernel memory is "TDX private memory". This includes
everything from kernel stacks to kernel text. Handling
exceptions on arbitrary accesses to kernel memory is essentially
impossible because they can happen in horribly nasty places like
kernel entry/exit. But, TDX hardware can theoretically _deliver_
a virtualization exception (#VE) on any access to private memory.
But, it's not as bad as it sounds. TDX can be configured to never
deliver these exceptions on private memory with a "TD attribute"
called ATTR_SEPT_VE_DISABLE. The guest has no way to *set* this
attribute, but it can check it.
Ensure ATTR_SEPT_VE_DISABLE is set in early boot. panic() if it
is unset. There is no sane way for Linux to run with this
attribute clear so a panic() is appropriate.
There's small window during boot before the check where kernel
has an early #VE handler. But the handler is only for port I/O
and will also panic() as soon as it sees any other #VE, such as
a one generated by a private memory access.
[ dhansen: Rewrite changelog and rebase on new tdx_parse_tdinfo().
Add Kirill's tested-by because I made changes since
he wrote this. ]
Fixes: 9a22bf6deb ("x86/traps: Add #VE support for TDX guest")
Reported-by: ruogui.ygr@alibaba-inc.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20221028141220.29217-3-kirill.shutemov%40linux.intel.com
Pull kvm fixes from Paolo Bonzini:
"x86:
- fix lock initialization race in gfn-to-pfn cache (+selftests)
- fix two refcounting errors
- emulator fixes
- mask off reserved bits in CPUID
- fix bug with disabling SGX
RISC-V:
- update MAINTAINERS"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/xen: Fix eventfd error handling in kvm_xen_eventfd_assign()
KVM: x86: smm: number of GPRs in the SMRAM image depends on the image format
KVM: x86: emulator: update the emulation mode after CR0 write
KVM: x86: emulator: update the emulation mode after rsm
KVM: x86: emulator: introduce emulator_recalc_and_set_mode
KVM: x86: emulator: em_sysexit should update ctxt->mode
KVM: selftests: Mark "guest_saw_irq" as volatile in xen_shinfo_test
KVM: selftests: Add tests in xen_shinfo_test to detect lock races
KVM: Reject attempts to consume or refresh inactive gfn_to_pfn_cache
KVM: Initialize gfn_to_pfn_cache locks in dedicated helper
KVM: VMX: fully disable SGX if SECONDARY_EXEC_ENCLS_EXITING unavailable
KVM: x86: Exempt pending triple fault from event injection sanity check
MAINTAINERS: git://github -> https://github.com for kvm-riscv
KVM: debugfs: Return retval of simple_attr_open() if it fails
KVM: x86: Reduce refcount if single_open() fails in kvm_mmu_rmaps_stat_open()
KVM: x86: Mask off reserved bits in CPUID.8000001FH
KVM: x86: Mask off reserved bits in CPUID.8000001AH
KVM: x86: Mask off reserved bits in CPUID.80000008H
KVM: x86: Mask off reserved bits in CPUID.80000006H
KVM: x86: Mask off reserved bits in CPUID.80000001H
The TDG.VP.INFO TDCALL provides the guest with various details about
the TDX system that the guest needs to run. Only one field is currently
used: 'gpa_width' which tells the guest which PTE bits mark pages shared
or private.
A second field is now needed: the guest "TD attributes" to tell if
virtualization exceptions are configured in a way that can harm the guest.
Make the naming and calling convention more generic and discrete from the
mask-centric one.
Thanks to Sathya for the inspiration here, but there's no code, comments
or changelogs left from where he started.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org
The trapping of SMPRI_EL1 and TPIDR2_EL0 currently only really
work on nVHE, as only this mode uses the fine-grained trapping
that controls these two registers.
Move the trapping enable/disable code into
__{de,}activate_traps_common(), allowing it to be called when it
actually matters on VHE, and remove the flipping of EL2 control
for TPIDR2_EL0, which only affects the host access of this
register.
Fixes: 861262ab86 ("KVM: arm64: Handle SME host state when running guests")
Reported-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/86bkpqer4z.wl-maz@kernel.org
Recent changes to the thermal framework has made the trip
points (trips) for thermal zones compulsory, which made
the Ux500 DTS files break validation and also stopped
probing because of similar changes to the code.
Fix this by adding an "outer bounding box": battery thermal
zones should not get warmer than 70 degress, then we will
shut down.
Fixes: 8c59632423 ("dt-bindings: thermal: Fix missing required property")
Fixes: 3fd6d6e2b4 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Link: https://lore.kernel.org/r/20221030210854.346662-1-linus.walleij@linaro.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
i.MX fixes for 6.1:
- Fix imx93-pd driver to release resources when error occurs in probe.
- A series from Ioana Ciornei to add missing clock frequencies for MDIO
controllers on LayerScape SoCs, so that the kernel driver can work
independently from bootloader.
- A series from Li Jun to fix USB power domain setup in i.MX8MM/N device
trees.
- Fix CPLD_Dn pull configuration for MX8Menlo board to avoid interfering
with CPLD power off functionality.
- Fix ctrl_sleep_moci GPIO setup for verdin-imx8mp board.
- Fix DT schema check warnings on uSDHC clocks for imx8-ss-conn device
tree.
- Fix up gpcv2 DT bindings to have an optional `power-domains` property.
- A couple of i.MX93 device tree fixes on S4MU interrupt and gpio-ranges
of GPIO controllers.
- Keep PU regulator on for Quad and QuadPlus based imx6dl-yapp4 boards to
work around a hardware design flaw in supply voltage distribution.
- Fix user push-button GPIO offset on imx6qdl-gw59 boards.
* tag 'imx-fixes-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
arm64: dts: ls208xa: specify clock frequencies for the MDIO controllers
arm64: dts: ls1088a: specify clock frequencies for the MDIO controllers
arm64: dts: lx2160a: specify clock frequencies for the MDIO controllers
soc: imx: imx93-pd: Fix the error handling path of imx93_pd_probe()
arm64: dts: imx93: correct gpio-ranges
arm64: dts: imx93: correct s4mu interrupt names
dt-bindings: power: gpcv2: add power-domains property
arm64: dts: imx8: correct clock order
ARM: dts: imx6dl-yapp4: Do not allow PM to switch PU regulator off on Q/QP
ARM: dts: imx6qdl-gw59{10,13}: fix user pushbutton GPIO offset
arm64: dts: imx8mn: Correct the usb power domain
arm64: dts: imx8mn: remove otg1 power domain dependency on hsio
arm64: dts: imx8mm: correct usb power domains
arm64: dts: imx8mm: remove otg1/2 power domain dependency on hsio
arm64: dts: verdin-imx8mp: fix ctrl_sleep_moci
arm64: dts: imx8mm: Enable CPLD_Dn pull down resistor on MX8Menlo
Link: https://lore.kernel.org/r/20221101031547.GB125525@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
On 32-bit kernels, 64-bit syscall arguments are split into two
registers. For that to work with syscall wrappers, the prototype of the
syscall must have the argument split so that the wrapper macro properly
unpacks the arguments from pt_regs.
The fanotify_mark() syscall is one such syscall, which already has a
split prototype, guarded behind ARCH_SPLIT_ARG64.
So select ARCH_SPLIT_ARG64 to get that prototype and fix fanotify_mark()
on 32-bit kernels with syscall wrappers.
Note also that fanotify_mark() is the only usage of ARCH_SPLIT_ARG64.
Fixes: 7e92e01b72 ("powerpc: Provide syscall wrapper")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221101034852.2340319-1-mpe@ellerman.id.au
With the introduction of syscall wrappers all wrappers for syscalls with
64-bit arguments must be handled specially, not only those that have
unaligned 64-bit arguments. This left out the fallocate() and
sync_file_range2() syscalls.
Fixes: 7e92e01b72 ("powerpc: Provide syscall wrapper")
Fixes: e237506238 ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs")
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/87mt9cxd6g.fsf_-_@igel.home
Avoid that the hardware path is shown twice in the kernel log, and clean
up the output of the version numbers to show up in the same order as
they are listed in the hardware database in the hardware.c file.
Additionally, optimize the memory footprint of the hardware database
and mark some code as init code.
Fixes: cab56b51ec ("parisc: Fix device names in /proc/iomem")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.9+
There's a build failure for Book3E without AltiVec:
Error: cc1: error: AltiVec not supported in this target
make[6]: *** [/linux/scripts/Makefile.build:250:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o] Error 1
This happens because the amdgpu build is only gated by
PPC_LONG_DOUBLE_128, but that symbol can be enabled even though AltiVec
is disabled.
The only user of PPC_LONG_DOUBLE_128 is amdgpu, so just add a dependency
on AltiVec to that symbol to fix the build.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221027125626.1383092-1-mpe@ellerman.id.au