Pull kvm fixes from Paolo Bonzini:
"The main and larger change here is a workaround for AMD's lack of
cache coherency for encrypted-memory guests.
I have another patch pending, but it's waiting for review from the
architecture maintainers.
RISC-V:
- Remove 's' & 'u' as valid ISA extension
- Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
x86:
- Fix NMI watchdog in guests on AMD
- Fix for SEV cache incoherency issues
- Don't re-acquire SRCU lock in complete_emulated_io()
- Avoid NULL pointer deref if VM creation fails
- Fix race conditions between APICv disabling and vCPU creation
- Bugfixes for disabling of APICv
- Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
selftests:
- Do not use bitfields larger than 32-bits, they differ between GCC
and clang"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: selftests: introduce and use more page size-related constants
kvm: selftests: do not use bitfields larger than 32-bits for PTEs
KVM: SEV: add cache flush to solve SEV cache incoherency issues
KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs
KVM: SVM: Simplify and harden helper to flush SEV guest page(s)
KVM: selftests: Silence compiler warning in the kvm_page_table_test
KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog
x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
KVM: SPDX style and spelling fixes
KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled
KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race
KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled
KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io()
RISC-V: KVM: Restrict the extensions that can be disabled
RISC-V: KVM: Remove 's' & 'u' as valid ISA extension
Pull RISC-V fixes Palmer Dabbelt:
- A pair of build fixes for the recent cpuidle driver
- A fix for systems without sv57 that manifests as a crash
early in boot
* tag 'riscv-for-linus-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: cpuidle: fix Kconfig select for RISCV_SBI_CPUIDLE
RISC-V: mm: Fix set_satp_mode() for platform not having Sv57
cpuidle: riscv: support non-SMP config
Pull arm64 fixes from Will Deacon:
"There's no real pattern to the fixes, but the main one fixes our
pmd_leaf() definition to resolve a NULL dereference on the migration
path.
- Fix PMU event validation in the absence of any event counters
- Fix allmodconfig build using clang in conjunction with binutils
- Fix definitions of pXd_leaf() to handle PROT_NONE entries
- More typo fixes"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix p?d_leaf()
arm64: fix typos in comments
arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang
arm_pmu: Validate single/group leader events
There can be lots of build errors when building cpuidle-riscv-sbi.o.
They are all caused by a kconfig problem with this warning:
WARNING: unmet direct dependencies detected for RISCV_SBI_CPUIDLE
Depends on [n]: CPU_IDLE [=y] && RISCV [=y] && RISCV_SBI [=n]
Selected by [y]:
- SOC_VIRT [=y] && CPU_IDLE [=y]
so make the 'select' of RISCV_SBI_CPUIDLE also depend on RISCV_SBI.
Fixes: c5179ef1ca ("RISC-V: Enable RISC-V SBI CPU Idle driver for QEMU virt machine")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When Sv57 is not available the satp.MODE test in set_satp_mode() will
fail and lead to pgdir re-programming for Sv48. The pgdir re-programming
will fail as well due to pre-existing pgdir entry used for Sv57 and as
a result kernel fails to boot on RISC-V platform not having Sv57.
To fix above issue, we should clear the pgdir memory in set_satp_mode()
before re-programming.
Fixes: 011f09d120 ("riscv: mm: Set sv57 on defaultly")
Reported-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Flush the CPU caches when memory is reclaimed from an SEV guest (where
reclaim also includes it being unmapped from KVM's memslots). Due to lack
of coherency for SEV encrypted memory, failure to flush results in silent
data corruption if userspace is malicious/broken and doesn't ensure SEV
guest memory is properly pinned and unpinned.
Cache coherency is not enforced across the VM boundary in SEV (AMD APM
vol.2 Section 15.34.7). Confidential cachelines, generated by confidential
VM guests have to be explicitly flushed on the host side. If a memory page
containing dirty confidential cachelines was released by VM and reallocated
to another user, the cachelines may corrupt the new user at a later time.
KVM takes a shortcut by assuming all confidential memory remain pinned
until the end of VM lifetime. Therefore, KVM does not flush cache at
mmu_notifier invalidation events. Because of this incorrect assumption and
the lack of cache flushing, malicous userspace can crash the host kernel:
creating a malicious VM and continuously allocates/releases unpinned
confidential memory pages when the VM is running.
Add cache flush operations to mmu_notifier operations to ensure that any
physical memory leaving the guest VM get flushed. In particular, hook
mmu_notifier_invalidate_range_start and mmu_notifier_release events and
flush cache accordingly. The hook after releasing the mmu lock to avoid
contention with other vCPUs.
Cc: stable@vger.kernel.org
Suggested-by: Sean Christpherson <seanjc@google.com>
Reported-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-4-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use clflush_cache_range() to flush the confidential memory when
SME_COHERENT is supported in AMD CPU. Cache flush is still needed since
SME_COHERENT only support cache invalidation at CPU side. All confidential
cache lines are still incoherent with DMA devices.
Cc: stable@vger.kerel.org
Fixes: add5e2f045 ("KVM: SVM: Add support for the SEV-ES VMSA")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-3-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rework sev_flush_guest_memory() to explicitly handle only a single page,
and harden it to fall back to WBINVD if VM_PAGE_FLUSH fails. Per-page
flushing is currently used only to flush the VMSA, and in its current
form, the helper is completely broken with respect to flushing actual
guest memory, i.e. won't work correctly for an arbitrary memory range.
VM_PAGE_FLUSH takes a host virtual address, and is subject to normal page
walks, i.e. will fault if the address is not present in the host page
tables or does not have the correct permissions. Current AMD CPUs also
do not honor SMAP overrides (undocumented in kernel versions of the APM),
so passing in a userspace address is completely out of the question. In
other words, KVM would need to manually walk the host page tables to get
the pfn, ensure the pfn is stable, and then use the direct map to invoke
VM_PAGE_FLUSH. And the latter might not even work, e.g. if userspace is
particularly evil/clever and backs the guest with Secret Memory (which
unmaps memory from the direct map).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Fixes: add5e2f045 ("KVM: SVM: Add support for the SEV-ES VMSA")
Reported-by: Mingwei Zhang <mizhang@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-2-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
NMI-watchdog is one of the favorite features of kernel developers,
but it does not work in AMD guest even with vPMU enabled and worse,
the system misrepresents this capability via /proc.
This is a PMC emulation error. KVM does not pass the latest valid
value to perf_event in time when guest NMI-watchdog is running, thus
the perf_event corresponding to the watchdog counter will enter the
old state at some point after the first guest NMI injection, forcing
the hardware register PMC0 to be constantly written to 0x800000000001.
Meanwhile, the running counter should accurately reflect its new value
based on the latest coordinated pmc->counter (from vPMC's point of view)
rather than the value written directly by the guest.
Fixes: 168d918f26 ("KVM: x86: Adjust counter sample period after a wrmsr")
Reported-by: Dongli Cao <caodongli@kingsoft.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220409015226.38619-1-likexu@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MSR_KVM_POLL_CONTROL is cleared on reset, thus reverting guests to
host-side polling after suspend/resume. Non-bootstrap CPUs are
restored correctly by the haltpoll driver because they are hot-unplugged
during suspend and hot-plugged during resume; however, the BSP
is not hotpluggable and remains in host-sde polling mode after
the guest resume. The makes the guest pay for the cost of vmexits
every time the guest enters idle.
Fix it by recording BSP's haltpoll state and resuming it during guest
resume.
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1650267752-46796-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Skip the APICv inhibit update for KVM_GUESTDBG_BLOCKIRQ if APICv is
disabled at the module level to avoid having to acquire the mutex and
potentially process all vCPUs. The DISABLE inhibit will (barring bugs)
never be lifted, so piling on more inhibits is unnecessary.
Fixes: cae72dcc3b ("KVM: x86: inhibit APICv when KVM_GUESTDBG_BLOCKIRQ active")
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220420013732.3308816-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make a KVM_REQ_APICV_UPDATE request when creating a vCPU with an
in-kernel local APIC and APICv enabled at the module level. Consuming
kvm_apicv_activated() and stuffing vcpu->arch.apicv_active directly can
race with __kvm_set_or_clear_apicv_inhibit(), as vCPU creation happens
before the vCPU is fully onlined, i.e. it won't get the request made to
"all" vCPUs. If APICv is globally inhibited between setting apicv_active
and onlining the vCPU, the vCPU will end up running with APICv enabled
and trigger KVM's sanity check.
Mark APICv as active during vCPU creation if APICv is enabled at the
module level, both to be optimistic about it's final state, e.g. to avoid
additional VMWRITEs on VMX, and because there are likely bugs lurking
since KVM checks apicv_active in multiple vCPU creation paths. While
keeping the current behavior of consuming kvm_apicv_activated() is
arguably safer from a regression perspective, force apicv_active so that
vCPU creation runs with deterministic state and so that if there are bugs,
they are found sooner than later, i.e. not when some crazy race condition
is hit.
WARNING: CPU: 0 PID: 484 at arch/x86/kvm/x86.c:9877 vcpu_enter_guest+0x2ae3/0x3ee0 arch/x86/kvm/x86.c:9877
Modules linked in:
CPU: 0 PID: 484 Comm: syz-executor361 Not tainted 5.16.13 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1~cloud0 04/01/2014
RIP: 0010:vcpu_enter_guest+0x2ae3/0x3ee0 arch/x86/kvm/x86.c:9877
Call Trace:
<TASK>
vcpu_run arch/x86/kvm/x86.c:10039 [inline]
kvm_arch_vcpu_ioctl_run+0x337/0x15e0 arch/x86/kvm/x86.c:10234
kvm_vcpu_ioctl+0x4d2/0xc80 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3727
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x16d/0x1d0 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
The bug was hit by a syzkaller spamming VM creation with 2 vCPUs and a
call to KVM_SET_GUEST_DEBUG.
r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000000), 0x0, 0x0)
r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
ioctl$KVM_CAP_SPLIT_IRQCHIP(r1, 0x4068aea3, &(0x7f0000000000)) (async)
r2 = ioctl$KVM_CREATE_VCPU(r1, 0xae41, 0x0) (async)
r3 = ioctl$KVM_CREATE_VCPU(r1, 0xae41, 0x400000000000002)
ioctl$KVM_SET_GUEST_DEBUG(r3, 0x4048ae9b, &(0x7f00000000c0)={0x5dda9c14aa95f5c5})
ioctl$KVM_RUN(r2, 0xae80, 0x0)
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <kangel@zju.edu.cn>
Fixes: 8df14af42f ("kvm: x86: Add support for dynamic APICv activation")
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220420013732.3308816-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Defer APICv updates that occur while L2 is active until nested VM-Exit,
i.e. until L1 regains control. vmx_refresh_apicv_exec_ctrl() assumes L1
is active and (a) stomps all over vmcs02 and (b) neglects to ever updated
vmcs01. E.g. if vmcs12 doesn't enable the TPR shadow for L2 (and thus no
APICv controls), L1 performs nested VM-Enter APICv inhibited, and APICv
becomes unhibited while L2 is active, KVM will set various APICv controls
in vmcs02 and trigger a failed VM-Entry. The kicker is that, unless
running with nested_early_check=1, KVM blames L1 and chaos ensues.
In all cases, ignoring vmcs02 and always deferring the inhibition change
to vmcs01 is correct (or at least acceptable). The ABSENT and DISABLE
inhibitions cannot truly change while L2 is active (see below).
IRQ_BLOCKING can change, but it is firmly a best effort debug feature.
Furthermore, only L2's APIC is accelerated/virtualized to the full extent
possible, e.g. even if L1 passes through its APIC to L2, normal MMIO/MSR
interception will apply to the virtual APIC managed by KVM.
The exception is the SELF_IPI register when x2APIC is enabled, but that's
an acceptable hole.
Lastly, Hyper-V's Auto EOI can technically be toggled if L1 exposes the
MSRs to L2, but for that to work in any sane capacity, L1 would need to
pass through IRQs to L2 as well, and IRQs must be intercepted to enable
virtual interrupt delivery. I.e. exposing Auto EOI to L2 and enabling
VID for L2 are, for all intents and purposes, mutually exclusive.
Lack of dynamic toggling is also why this scenario is all but impossible
to encounter in KVM's current form. But a future patch will pend an
APICv update request _during_ vCPU creation to plug a race where a vCPU
that's being created doesn't get included in the "all vCPUs request"
because it's not yet visible to other vCPUs. If userspaces restores L2
after VM creation (hello, KVM selftests), the first KVM_RUN will occur
while L2 is active and thus service the APICv update request made during
VM creation.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220420013732.3308816-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Set the DISABLE inhibit, not the ABSENT inhibit, if APICv is disabled via
module param. A recent refactoring to add a wrapper for setting/clearing
inhibits unintentionally changed the flag, probably due to a copy+paste
goof.
Fixes: 4f4c4a3ee5 ("KVM: x86: Trace all APICv inhibit changes and capture overall status")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220420013732.3308816-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add wrappers to acquire/release KVM's SRCU lock when stashing the index
in vcpu->src_idx, along with rudimentary detection of illegal usage,
e.g. re-acquiring SRCU and thus overwriting vcpu->src_idx. Because the
SRCU index is (currently) either 0 or 1, illegal nesting bugs can go
unnoticed for quite some time and only cause problems when the nested
lock happens to get a different index.
Wrap the WARNs in PROVE_RCU=y, and make them ONCE, otherwise KVM will
likely yell so loudly that it will bring the kernel to its knees.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20220415004343.2203171-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the generic kvm_vcpu's srcu_idx instead of using an indentical field
in RISC-V's version of kvm_vcpu_arch. Generic KVM very intentionally
does not touch vcpu->srcu_idx, i.e. there's zero chance of running afoul
of common code.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220415004343.2203171-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Don't re-acquire SRCU in complete_emulated_io() now that KVM acquires the
lock in kvm_arch_vcpu_ioctl_run(). More importantly, don't overwrite
vcpu->srcu_idx. If the index acquired by complete_emulated_io() differs
from the one acquired by kvm_arch_vcpu_ioctl_run(), KVM will effectively
leak a lock and hang if/when synchronize_srcu() is invoked for the
relevant grace period.
Fixes: 8d25b7beca ("KVM: x86: pull kvm->srcu read-side to kvm_arch_vcpu_ioctl_run")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220415004343.2203171-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull xtensa fixes from Max Filippov:
- fix patching CPU selection in patch_text
- fix potential deadlock in ISS platform serial driver
- fix potential register clobbering in coprocessor exception handler
* tag 'xtensa-20220416' of https://github.com/jcmvbkbc/linux-xtensa:
xtensa: fix a7 clobbering in coprocessor context load/store
arch: xtensa: platforms: Fix deadlock in rs_close()
xtensa: patch_text: Fixup last cpu should be master
The first "if" condition in __memcpy_flushcache is supposed to align the
"dest" variable to 8 bytes and copy data up to this alignment. However,
this condition may misbehave if "size" is greater than 4GiB.
The statement min_t(unsigned, size, ALIGN(dest, 8) - dest); casts both
arguments to unsigned int and selects the smaller one. However, the
cast truncates high bits in "size" and it results in misbehavior.
For example:
suppose that size == 0x100000001, dest == 0x200000002
min_t(unsigned, size, ALIGN(dest, 8) - dest) == min_t(0x1, 0xe) == 0x1;
...
dest += 0x1;
so we copy just one byte "and" dest remains unaligned.
This patch fixes the bug by replacing unsigned with size_t.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, the config isa register allows us to disable all allowed
single letter ISA extensions. It shouldn't be the case as vmm shouldn't
be able to disable base extensions (imac).
These extensions should always be enabled as long as they are enabled
in the host ISA.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Fixes: 92ad82002c ("RISC-V: KVM: Implement
KVM_GET_ONE_REG/KVM_SET_ONE_REG ioctls")
There are no ISA extension defined as 's' & 'u' in RISC-V specifications.
The misa register defines 's' & 'u' bit as Supervisor/User privilege mode
enabled. But it should not appear in the ISA extension in the device tree.
Remove those from the allowed ISA extension for kvm.
Fixes: a33c72faf2 ("RISC-V: KVM: Implement VCPU create, init and
destroy functions")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Huge page backed vmalloc memory could benefit performance in many cases.
However, some users of vmalloc may not be ready to handle huge pages for
various reasons: hardware constraints, potential pages split, etc.
VM_NO_HUGE_VMAP was introduced to allow vmalloc users to opt-out huge
pages. However, it is not easy to track down all the users that require
the opt-out, as the allocation are passed different stacks and may cause
issues in different layers.
To address this issue, replace VM_NO_HUGE_VMAP with an opt-in flag,
VM_ALLOW_HUGE_VMAP, so that users that benefit from huge pages could ask
specificially.
Also, remove vmalloc_no_huge() and add opt-in helper vmalloc_huge().
Fixes: fac54e2bfb ("x86/Kconfig: Select HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP")
Link: https://lore.kernel.org/netdev/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/"
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 fixes from Thomas Gleixner:
"Two x86 fixes related to TSX:
- Use either MSR_TSX_FORCE_ABORT or MSR_IA32_TSX_CTRL to disable TSX
to cover all CPUs which allow to disable it.
- Disable TSX development mode at boot so that a microcode update
which provides TSX development mode does not suddenly make the
system vulnerable to TSX Asynchronous Abort"
* tag 'x86-urgent-2022-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsx: Disable TSX development mode at boot
x86/tsx: Use MSR_TSX_CTRL to clear CPUID bits
Pull ARM SoC fixes from Arnd Bergmann:
"There are a number of SoC bugfixes that came in since the merge
window, and more of them are already pending.
This batch includes:
- A boot time regression fix for davinci that triggered on
multi_v5_defconfig when booting any platform
- Defconfig updates to address removed features, changed symbol names
or dependencies, for gemini, ux500, and pxa
- Email address changes for Krzysztof Kozlowski
- Build warning fixes for ep93xx and iop32x
- Devicetree warning fixes across many platforms
- Minor bugfixes for the reset controller, memory controller and SCMI
firmware subsystems plus the versatile-express board"
* tag 'soc-fixes-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (34 commits)
ARM: config: Update Gemini defconfig
arm64: dts: qcom/sdm845-shift-axolotl: Fix boolean properties with values
ARM: dts: align SPI NOR node name with dtschema
ARM: dts: Fix more boolean properties with values
arm/arm64: dts: qcom: Fix boolean properties with values
arm64: dts: imx: Fix imx8*-var-som touchscreen property sizes
arm: dts: imx: Fix boolean properties with values
arm64: dts: tegra: Fix boolean properties with values
arm: dts: at91: Fix boolean properties with values
arm: configs: imote2: Drop defconfig as board support dropped.
ep93xx: clock: Don't use plain integer as NULL pointer
ep93xx: clock: Fix UAF in ep93xx_clk_register_gate()
ARM: vexpress/spc: Fix all the kernel-doc build warnings
ARM: vexpress/spc: Fix kernel-doc build warning for ve_spc_cpu_in_wfi
ARM: config: u8500: Re-enable AB8500 battery charging
ARM: config: u8500: Add some common hardware
memory: fsl_ifc: populate child nodes of buses and mfd devices
ARM: config: Refresh U8500 defconfig
firmware: arm_scmi: Fix sparse warnings in OPTEE transport driver
firmware: arm_scmi: Replace zero-length array with flexible-array member
...
Fast coprocessor exception handler saves a3..a6, but coprocessor context
load/store code uses a4..a7 as temporaries, potentially clobbering a7.
'Potentially' because coprocessor state load/store macros may not use
all four temporary registers (and neither FPU nor HiFi macros do).
Use a3..a6 as intended.
Cc: stable@vger.kernel.org
Fixes: c658eac628 ("[XTENSA] Add support for configurable registers and coprocessors")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Commit 3ee48b6af4 ("mm, x86: Saving vmcore with non-lazy freeing of
vmas") introduced set_iounmap_nonlazy(), which sets vmap_lazy_nr to
lazy_max_pages() + 1, ensuring that any future vunmaps() immediately
purge the vmap areas instead of doing it lazily.
Commit 690467c81b ("mm/vmalloc: Move draining areas out of caller
context") moved the purging from the vunmap() caller to a worker thread.
Unfortunately, set_iounmap_nonlazy() can cause the worker thread to spin
(possibly forever). For example, consider the following scenario:
1. Thread reads from /proc/vmcore. This eventually calls
__copy_oldmem_page() -> set_iounmap_nonlazy(), which sets
vmap_lazy_nr to lazy_max_pages() + 1.
2. Then it calls free_vmap_area_noflush() (via iounmap()), which adds 2
pages (one page plus the guard page) to the purge list and
vmap_lazy_nr. vmap_lazy_nr is now lazy_max_pages() + 3, so the
drain_vmap_work is scheduled.
3. Thread returns from the kernel and is scheduled out.
4. Worker thread is scheduled in and calls drain_vmap_area_work(). It
frees the 2 pages on the purge list. vmap_lazy_nr is now
lazy_max_pages() + 1.
5. This is still over the threshold, so it tries to purge areas again,
but doesn't find anything.
6. Repeat 5.
If the system is running with only one CPU (which is typicial for kdump)
and preemption is disabled, then this will never make forward progress:
there aren't any more pages to purge, so it hangs. If there is more
than one CPU or preemption is enabled, then the worker thread will spin
forever in the background. (Note that if there were already pages to be
purged at the time that set_iounmap_nonlazy() was called, this bug is
avoided.)
This can be reproduced with anything that reads from /proc/vmcore
multiple times. E.g., vmcore-dmesg /proc/vmcore.
It turns out that improvements to vmap() over the years have obsoleted
the need for this "optimization". I benchmarked `dd if=/proc/vmcore
of=/dev/null` with 4k and 1M read sizes on a system with a 32GB vmcore.
The test was run on 5.17, 5.18-rc1 with a fix that avoided the hang, and
5.18-rc1 with set_iounmap_nonlazy() removed entirely:
|5.17 |5.18+fix|5.18+removal
4k|40.86s| 40.09s| 26.73s
1M|24.47s| 23.98s| 21.84s
The removal was the fastest (by a wide margin with 4k reads). This
patch removes set_iounmap_nonlazy().
Link: https://lkml.kernel.org/r/52f819991051f9b865e9ce25605509bfdbacadcd.1649277321.git.osandov@fb.com
Fixes: 690467c81b ("mm/vmalloc: Move draining areas out of caller context")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Acked-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Gemini defconfig needs to be updated due to DSA driver
Kconfig changes in the v5.18 merge window:
CONFIG_NET_DSA_REALTEK_SMI is now behind
CONFIG_NET_DSA_REALTEK and the desired DSA switch need
to be selected explicitly with
CONFIG_NET_DSA_REALTEK_RTL8366RB.
Take this opportunity to update some other minor config
options:
- CONFIG_MARVELL_PHY moved around because of Kconfig
changes.
- CONFIG_SENSORS_DRIVETEMP should be selected since is
regulates the system critical alert temperature on
some devices, which is nice if it is handled even
if initramfs or root fails to mount.
Fixes: 319a70a5fe ("net: dsa: realtek-smi: move to subdirectory")
Fixes: 765c39a4fa ("net: dsa: realtek: convert subdrivers into modules")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Cc: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Defconfig updates for kernel v5.18:
- Refresh defconfig with new and moved options
- Add some new hardware drivers
- Activate battery charging
* tag 'ux500-defconfig-soc-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-nomadik:
ARM: config: u8500: Re-enable AB8500 battery charging
ARM: config: u8500: Add some common hardware
ARM: config: Refresh U8500 defconfig
Link: https://lore.kernel.org/r/CACRpkdY+_Go4XNzOh+Rvc24QBnUud2k-S7VQuaH5d-j71_dJog@mail.gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull s390 fixes from Heiko Carstens:
- Convert current_stack_pointer to a register alias like it is assumed
if ARCH_HAS_CURRENT_STACK_POINTER is selected. The existing
implementation as a function breaks CONFIG_HARDENED_USERCOPY
sanity-checks
- Get rid of -Warray-bounds warning within kexec code
- Add minimal IBM z16 support by reporting a proper elf platform, and
adding compile options
- Update defconfigs
* tag 's390-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: enable CONFIG_HARDENED_USERCOPY in debug_defconfig
s390: current_stack_pointer shouldn't be a function
s390: update defconfigs
s390/kexec: silence -Warray-bounds warning
s390: allow to compile with z16 optimizations
s390: add z16 elf platform
Will and Anders reported that using just 'CC=clang' with CONFIG_FTRACE=y
and CONFIG_STACK_TRACER=y would result in an error while linking:
aarch64-linux-gnu-ld: .init.data has both ordered [`__patchable_function_entries' in init/main.o] and unordered [`.meminit.data' in mm/sparse.o] sections
aarch64-linux-gnu-ld: final link failed: bad value
This error was exposed by commit f12b034afe ("scripts/Makefile.clang:
default to LLVM_IAS=1") in combination with binutils older than 2.36.
When '-fpatchable-function-entry' was implemented in LLVM, two code
paths were added for adding the section attributes, one for the
integrated assembler and another for GNU as, due to binutils
deficiencies at the time. If the integrated assembler was used,
attributes that GNU ld < 2.36 could not handle were added, presumably
with the assumption that use of the integrated assembler meant the whole
LLVM stack was being used, namely ld.lld.
Prior to the kernel change previously mentioned, that assumption was
valid, as there were three commonly used combinations of tools for
compiling, assembling, and linking respectively:
$ make CC=clang (clang, GNU as, GNU ld)
$ make LLVM=1 (clang, GNU as, ld.lld)
$ make LLVM=1 LLVM_IAS=1 (clang, integrated assembler, ld.lld)
After the default switch of the integrated assembler, the second and
third commands become equivalent and the first command means "clang,
integrated assembler, and GNU ld", which was not a combination that was
considered when the aforementioned LLVM change was implemented.
It is not possible to go back and fix LLVM, as this change was
implemented in the 10.x series, which is no longer supported. To
workaround this on the kernel side, split out the selection of
HAVE_DYNAMIC_FTRACE_WITH_REGS to two separate configurations, one for
GCC and one for clang.
The GCC config inherits the '-fpatchable-function-entry' check. The
Clang config does not it, as '-fpatchable-function-entry' is always
available for LLVM 11.0.0 and newer, which is the supported range of
versions for the kernel.
The Clang config makes sure that the user is using GNU as or the
integrated assembler with ld.lld or GNU ld 2.36 or newer, which will
avoid the error above.
Link: https://github.com/ClangBuiltLinux/linux/issues/1507
Link: https://github.com/ClangBuiltLinux/linux/issues/788
Link: https://lore.kernel.org/YlCA5PoIjF6nhwYj@dev-arch.thelio-3990X/
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=26256
Link: 7fa5290d5b
Link: 853a264916
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Reported-by: Will Deacon <will@kernel.org>
Tested-by: Will Deacon <will@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220413181420.3522187-1-nathan@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
There is a deadlock in rs_close(), which is shown
below:
(Thread 1) | (Thread 2)
| rs_open()
rs_close() | mod_timer()
spin_lock_bh() //(1) | (wait a time)
... | rs_poll()
del_timer_sync() | spin_lock() //(2)
(wait timer to stop) | ...
We hold timer_lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need timer_lock in position (2) of thread 2.
As a result, rs_close() will block forever.
This patch deletes the redundant timer_lock in order to
prevent the deadlock. Because there is no race condition
between rs_close, rs_open and rs_poll.
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Message-Id: <20220407154430.22387-1-duoming@zju.edu.cn>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
These patch_text implementations are using stop_machine_cpuslocked
infrastructure with atomic cpu_count. The original idea: When the
master CPU patch_text, the others should wait for it. But current
implementation is using the first CPU as master, which couldn't
guarantee the remaining CPUs are waiting. This patch changes the
last CPU as the master to solve the potential risk.
Fixes: 64711f9a47 ("xtensa: implement jump_label support")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: <stable@vger.kernel.org>
Message-Id: <20220407073323.743224-4-guoren@kernel.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Pull kvm fixes from Paolo Bonzini:
"x86:
- Miscellaneous bugfixes
- A small cleanup for the new workqueue code
- Documentation syntax fix
RISC-V:
- Remove hgatp zeroing in kvm_arch_vcpu_put()
- Fix alignment of the guest_hang() in KVM selftest
- Fix PTE A and D bits in KVM selftest
- Missing #include in vcpu_fp.c
ARM:
- Some PSCI fixes after introducing PSCIv1.1 and SYSTEM_RESET2
- Fix the MMU write-lock not being taken on THP split
- Fix mixed-width VM handling
- Fix potential UAF when debugfs registration fails
- Various selftest updates for all of the above"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (24 commits)
KVM: x86: hyper-v: Avoid writing to TSC page without an active vCPU
KVM: SVM: Do not activate AVIC for SEV-enabled guest
Documentation: KVM: Add SPDX-License-Identifier tag
selftests: kvm: add tsc_scaling_sync to .gitignore
RISC-V: KVM: include missing hwcap.h into vcpu_fp
KVM: selftests: riscv: Fix alignment of the guest_hang() function
KVM: selftests: riscv: Set PTE A and D bits in VS-stage page table
RISC-V: KVM: Don't clear hgatp CSR in kvm_arch_vcpu_put()
selftests: KVM: Free the GIC FD when cleaning up in arch_timer
selftests: KVM: Don't leak GIC FD across dirty log test iterations
KVM: Don't create VM debugfs files outside of the VM directory
KVM: selftests: get-reg-list: Add KVM_REG_ARM_FW_REG(3)
KVM: avoid NULL pointer dereference in kvm_dirty_ring_push
KVM: arm64: selftests: Introduce vcpu_width_config
KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
KVM: arm64: vgic: Remove unnecessary type castings
KVM: arm64: Don't split hugepages outside of MMU write lock
KVM: arm64: Drop unneeded minor version check from PSCI v1.x handler
KVM: arm64: Actually prevent SMC64 SYSTEM_RESET2 from AArch32
KVM: arm64: Generally disallow SMC64 for AArch32 guests
...
struct stat (defined in arch/x86/include/uapi/asm/stat.h) has 32-bit
st_dev and st_rdev; struct compat_stat (defined in
arch/x86/include/asm/compat.h) has 16-bit st_dev and st_rdev followed by
a 16-bit padding.
This patch fixes struct compat_stat to match struct stat.
[ Historical note: the old x86 'struct stat' did have that 16-bit field
that the compat layer had kept around, but it was changes back in 2003
by "struct stat - support larger dev_t":
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=e95b2065677fe32512a597a79db94b77b90c968d
and back in those days, the x86_64 port was still new, and separate
from the i386 code, and had already picked up the old version with a
16-bit st_dev field ]
Note that we can't change compat_dev_t because it is used by
compat_loop_info.
Also, if the st_dev and st_rdev values are 32-bit, we don't have to use
old_valid_dev to test if the value fits into them. This fixes
-EOVERFLOW on filesystems that are on NVMe because NVMe uses the major
number 259.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
s390 defines current_stack_pointer as function while all other
architectures use 'register unsigned long asm("<stackptr reg>").
This make codes like the following from check_stack_object() fail:
if (IS_ENABLED(CONFIG_STACK_GROWSUP)) {
if ((void *)current_stack_pointer < obj + len)
return BAD_STACK;
} else {
if (obj < (void *)current_stack_pointer)
return BAD_STACK;
}
because this would compare the address of current_stack_pointer() and
not the stackpointer value.
Reported-by: Karsten Graul <kgraul@linux.ibm.com>
Fixes: 2792d84e6d ("usercopy: Check valid lifetime via stack depth")
Cc: Kees Cook <keescook@chromium.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
The following WARN is triggered from kvm_vm_ioctl_set_clock():
WARNING: CPU: 10 PID: 579353 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:3161 mark_page_dirty_in_slot+0x6c/0x80 [kvm]
...
CPU: 10 PID: 579353 Comm: qemu-system-x86 Tainted: G W O 5.16.0.stable #20
Hardware name: LENOVO 20UF001CUS/20UF001CUS, BIOS R1CET65W(1.34 ) 06/17/2021
RIP: 0010:mark_page_dirty_in_slot+0x6c/0x80 [kvm]
...
Call Trace:
<TASK>
? kvm_write_guest+0x114/0x120 [kvm]
kvm_hv_invalidate_tsc_page+0x9e/0xf0 [kvm]
kvm_arch_vm_ioctl+0xa26/0xc50 [kvm]
? schedule+0x4e/0xc0
? __cond_resched+0x1a/0x50
? futex_wait+0x166/0x250
? __send_signal+0x1f1/0x3d0
kvm_vm_ioctl+0x747/0xda0 [kvm]
...
The WARN was introduced by commit 03c0304a86bc ("KVM: Warn if
mark_page_dirty() is called without an active vCPU") but the change seems
to be correct (unlike Hyper-V TSC page update mechanism). In fact, there's
no real need to actually write to guest memory to invalidate TSC page, this
can be done by the first vCPU which goes through kvm_guest_time_update().
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20220407201013.963226-1-vkuznets@redhat.com>
A microcode update on some Intel processors causes all TSX transactions
to always abort by default[*]. Microcode also added functionality to
re-enable TSX for development purposes. With this microcode loaded, if
tsx=on was passed on the cmdline, and TSX development mode was already
enabled before the kernel boot, it may make the system vulnerable to TSX
Asynchronous Abort (TAA).
To be on safer side, unconditionally disable TSX development mode during
boot. If a viable use case appears, this can be revisited later.
[*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557
[ bp: Drop unstable web link, massage heavily. ]
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.1646943780.git.pawan.kumar.gupta@linux.intel.com
Pull driver core updates from Greg KH:
"Here are two small driver core changes for 5.18-rc2.
They are the final bits in the removal of the default_attrs field in
struct kobj_type. I had to wait until after 5.18-rc1 for all of the
changes to do this came in through different development trees, and
then one new user snuck in. So this series has two changes:
- removal of the default_attrs field in the powerpc/pseries/vas code.
The change has been acked by the PPC maintainers to come through
this tree
- removal of default_attrs from struct kobj_type now that all
in-kernel users are removed.
This cleans up the kobject code a little bit and removes some
duplicated functionality that confused people (now there is only
one way to do default groups)
Both of these have been in linux-next for all of this week with no
reported problems"
* tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kobject: kobj_type: remove default_attrs
powerpc/pseries/vas: use default_groups in kobj_type
Pull powerpc fixes from Michael Ellerman:
- Fix KVM "lost kick" race, where an attempt to pull a vcpu out of the
guest could be lost (or delayed until the next guest exit).
- Disable SCV (system call vectored) when PR KVM guests could be run.
- Fix KVM PR guests using SCV, by disallowing AIL != 0 for KVM PR
guests.
- Add a new KVM CAP to indicate if AIL == 3 is supported.
- Fix a regression when hotplugging a CPU to a memoryless/cpuless node.
- Make virt_addr_valid() stricter for 64-bit Book3E & 32-bit, which
fixes crashes seen due to hardened usercopy.
- Revert a change to max_mapnr which broke HIGHMEM.
Thanks to Christophe Leroy, Fabiano Rosas, Kefeng Wang, Nicholas Piggin,
and Srikar Dronamraju.
* tag 'powerpc-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
Revert "powerpc: Set max_mapnr correctly"
powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit
KVM: PPC: Move kvmhv_on_pseries() into kvm_ppc.h
powerpc/numa: Handle partially initialized numa nodes
powerpc/64: Fix build failure with allyesconfig in book3s_64_entry.S
KVM: PPC: Use KVM_CAP_PPC_AIL_MODE_3
KVM: PPC: Book3S PR: Disallow AIL != 0
KVM: PPC: Book3S PR: Disable SCV when AIL could be disabled
KVM: PPC: Book3S HV P9: Fix "lost kick" race
Pull x86 fixes from Borislav Petkov:
- Fix the MSI message data struct definition
- Use local labels in the exception table macros to avoid symbol
conflicts with clang LTO builds
- A couple of fixes to objtool checking of the relatively newly added
SLS and IBT code
- Rename a local var in the WARN* macro machinery to prevent shadowing
* tag 'x86_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/msi: Fix msi message data shadow struct
x86/extable: Prefer local labels in .set directives
x86,bpf: Avoid IBT objtool warning
objtool: Fix SLS validation for kcov tail-call replacement
objtool: Fix IBT tail-call detection
x86/bug: Prevent shadowing in __WARN_FLAGS
x86/mm/tlb: Revert retpoline avoidance approach
Pull perf fixes from Borislav Petkov:
- A couple of fixes to cgroup-related handling of perf events
- A couple of fixes to event encoding on Sapphire Rapids
- Pass event caps of inherited events so that perf doesn't fail wrongly
at fork()
- Add support for a new Raptor Lake CPU
* tag 'perf_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Always set cpuctx cgrp when enable cgroup event
perf/core: Fix perf_cgroup_switch()
perf/core: Use perf_cgroup_info->active to check if cgroup is active
perf/core: Don't pass task around when ctx sched in
perf/x86/intel: Update the FRONTEND MSR mask on Sapphire Rapids
perf/x86/intel: Don't extend the pseudo-encoding to GP counters
perf/core: Inherit event_caps
perf/x86/uncore: Add Raptor Lake uncore support
perf/x86/msr: Add Raptor Lake CPU support
perf/x86/cstate: Add Raptor Lake support
perf/x86: Add Intel Raptor Lake support
Pull locking fixes from Borislav Petkov:
- Allow the compiler to optimize away unused percpu accesses and change
the local_lock_* macros back to inline functions
- A couple of fixes to static call insn patching
* tag 'locking_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "mm/page_alloc: mark pagesets as __maybe_unused"
Revert "locking/local_lock: Make the empty local_lock_*() function a macro."
x86/percpu: Remove volatile from arch_raw_cpu_ptr().
static_call: Remove __DEFINE_STATIC_CALL macro
static_call: Properly initialise DEFINE_STATIC_CALL_RET0()
static_call: Don't make __static_call_return0 static
x86,static_call: Fix __static_call_return0 for i386