Commit Graph

2548 Commits

Author SHA1 Message Date
Danilo Krummrich
590b9d576c mm: kvmalloc: align kvrealloc() with krealloc()
Besides the obvious (and desired) difference between krealloc() and
kvrealloc(), there is some inconsistency in their function signatures and
behavior:

 - krealloc() frees the memory when the requested size is zero, whereas
   kvrealloc() simply returns a pointer to the existing allocation.

 - krealloc() behaves like kmalloc() if a NULL pointer is passed, whereas
   kvrealloc() does not accept a NULL pointer at all and, if passed,
   would fault instead.

 - krealloc() is self-contained, whereas kvrealloc() relies on the caller
   to provide the size of the previous allocation.

Inconsistent behavior throughout allocation APIs is error prone, hence
make kvrealloc() behave like krealloc(), which seems superior in all
mentioned aspects.

Besides that, implementing kvrealloc() by making use of krealloc() and
vrealloc() provides oppertunities to grow (and shrink) allocations more
efficiently.  For instance, vrealloc() can be optimized to allocate and
map additional pages to grow the allocation or unmap and free unused pages
to shrink the allocation.

[dakr@kernel.org: document concurrency restrictions]
  Link: https://lkml.kernel.org/r/20240725125442.4957-1-dakr@kernel.org
[dakr@kernel.org: disable KASAN when switching to vmalloc]
  Link: https://lkml.kernel.org/r/20240730185049.6244-2-dakr@kernel.org
[dakr@kernel.org: properly document __GFP_ZERO behavior]
  Link: https://lkml.kernel.org/r/20240730185049.6244-5-dakr@kernel.org
Link: https://lkml.kernel.org/r/20240722163111.4766-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Chandan Babu R <chandan.babu@oracle.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:25:44 -07:00
Marc Zyngier
3e6245ebe7 KVM: arm64: Make ICC_*SGI*_EL1 undef in the absence of a vGICv3
On a system with a GICv3, if a guest hasn't been configured with
GICv3 and that the host is not capable of GICv2 emulation,
a write to any of the ICC_*SGI*_EL1 registers is trapped to EL2.

We therefore try to emulate the SGI access, only to hit a NULL
pointer as no private interrupt is allocated (no GIC, remember?).

The obvious fix is to give the guest what it deserves, in the
shape of a UNDEF exception.

Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240820100349.3544850-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-22 08:08:37 +00:00
Oliver Upton
1d8c3c23a6 KVM: arm64: Ensure canonical IPA is hugepage-aligned when handling fault
Zenghui reports that VMs backed by hugetlb pages are no longer booting
after commit fd276e71d1 ("KVM: arm64: nv: Handle shadow stage 2 page
faults").

Support for shadow stage-2 MMUs introduced the concept of a fault IPA
and canonical IPA to stage-2 fault handling. These are identical in the
non-nested case, as the hardware stage-2 context is always that of the
canonical IPA space.

Both addresses need to be hugepage-aligned when preparing to install a
hugepage mapping to ensure that KVM uses the correct GFN->PFN translation
and installs that at the correct IPA for the current stage-2.

And now I'm feeling thirsty after all this talk of IPAs...

Fixes: fd276e71d1 ("KVM: arm64: nv: Handle shadow stage 2 page faults")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240822071710.2291690-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-22 07:41:00 +00:00
Marc Zyngier
f616506754 KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors
We recently moved the teardown of the vgic part of a vcpu inside
a critical section guarded by the config_lock. This teardown phase
involves calling into kvm_io_bus_unregister_dev(), which takes the
kvm->srcu lock.

However, this violates the established order where kvm->srcu is
taken on a memory fault (such as an MMIO access), possibly
followed by taking the config_lock if the GIC emulation requires
mutual exclusion from the other vcpus.

It therefore results in a bad lockdep splat, as reported by Zenghui.

Fix this by moving the call to kvm_io_bus_unregister_dev() outside
of the config_lock critical section. At this stage, there shouln't
be any need to hold the config_lock.

As an additional bonus, document the ordering between kvm->slots_lock,
kvm->srcu and kvm->arch.config_lock so that I cannot pretend I didn't
know about those anymore.

Fixes: 9eb18136af ("KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20240819125045.3474845-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-19 17:05:21 +00:00
Zenghui Yu
2240a50e62 KVM: arm64: vgic-debug: Don't put unmarked LPIs
If there were LPIs being mapped behind our back (i.e., between .start() and
.stop()), we would put them at iter_unmark_lpis() without checking if they
were actually *marked*, which is obviously not good.

Switch to use the xa_for_each_marked() iterator to fix it.

Cc: stable@vger.kernel.org
Fixes: 85d3ccc8b7 ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240817101541.1664-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-19 17:04:36 +00:00
Marc Zyngier
9eb18136af KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface
Tearing down a vcpu CPU interface involves freeing the private interrupt
array. If we don't hold the lock, we may race against another thread
trying to configure it. Yeah, fuzzers do wonderful things...

Taking the lock early solves this particular problem.

Fixes: 03b3d00a70 ("KVM: arm64: vgic: Allocate private interrupts on demand")
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240808091546.3262111-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-08 16:58:22 +00:00
Fuad Tabba
7e814a20f6 KVM: arm64: Tidying up PAuth code in KVM
Tidy up some of the PAuth trapping code to clear up some comments
and avoid clang/checkpatch warnings. Also, don't bother setting
PAuth HCR_EL2 bits in pKVM, since it's handled by the hypervisor.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240722163311.1493879-1-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-07 19:11:05 +00:00
Zenghui Yu
01ab08cafe KVM: arm64: vgic-debug: Exit the iterator properly w/o LPI
In case the guest doesn't have any LPI, we previously relied on the
iterator setting

	'intid = nr_spis + VGIC_NR_PRIVATE_IRQS' && 'lpi_idx = 1'

to exit the iterator. But it was broken with commit 85d3ccc8b7 ("KVM:
arm64: vgic-debug: Use an xarray mark for debug iterator") -- the intid
remains at 'nr_spis + VGIC_NR_PRIVATE_IRQS - 1', and we end up endlessly
printing the last SPI's state.

Consider that it's meaningless to search the LPI xarray and populate
lpi_idx when there is no LPI, let's just skip the process for that case.

The result is that

* If there's no LPI, we focus on the intid and exit the iterator when it
  runs out of the valid SPI range.
* Otherwise we keep the current logic and let the xarray drive the
  iterator.

Fixes: 85d3ccc8b7 ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240807052024.2084-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-07 19:10:22 +00:00
Marc Zyngier
10f2ad032d KVM: arm64: Enforce dependency on an ARMv8.4-aware toolchain
With the NV support of TLBI-range operations, KVM makes use of
instructions that are only supported by binutils versions >= 2.30.

This breaks the build for very old toolchains.

Make KVM support conditional on having ARMv8.4 support in the
assembler, side-stepping the issue.

Fixes: 5d476ca57d ("KVM: arm64: nv: Add handling of range-based TLBI operations")
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Suggested-by: Arnd Bergmann <arnd@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240807115144.3237260-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-07 19:09:35 +00:00
Sebastian Ott
19d837bc88 KVM: arm64: vgic: fix unexpected unlock sparse warnings
Get rid of unexpected unlock sparse warnings in vgic code
by adding an annotation to vgic_queue_irq_unlock().

arch/arm64/kvm/vgic/vgic.c:334:17: warning: context imbalance in 'vgic_queue_irq_unlock' - unexpected unlock
arch/arm64/kvm/vgic/vgic.c:419:5: warning: context imbalance in 'kvm_vgic_inject_irq' - different lock contexts for basic block

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240723101204.7356-4-sebott@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-02 18:58:03 +00:00
Sebastian Ott
0aa34b37a7 KVM: arm64: fix kdoc warnings in W=1 builds
Fix kdoc warnings by adding missing function parameter
descriptions or by conversion to a normal comment.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240723101204.7356-3-sebott@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-02 18:58:03 +00:00
Sebastian Ott
963a08e586 KVM: arm64: fix override-init warnings in W=1 builds
Add -Wno-override-init to the build flags for sys_regs.c,
handle_exit.c, and switch.c to fix warnings like the following:

arch/arm64/kvm/hyp/vhe/switch.c:271:43: warning: initialized field overwritten [-Woverride-init]
  271 |         [ESR_ELx_EC_CP15_32]            = kvm_hyp_handle_cp15_32,
      |

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240723101204.7356-2-sebott@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-02 18:58:03 +00:00
Danilo Krummrich
32b9a52f88 KVM: arm64: free kvm->arch.nested_mmus with kvfree()
kvm->arch.nested_mmus is allocated with kvrealloc(), hence free it with
kvfree() instead of kfree().

Fixes: 4f128f8e1a ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures")
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240723142204.758796-1-dakr@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-02 18:57:30 +00:00
Linus Torvalds
2c9b351240 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Initial infrastructure for shadow stage-2 MMUs, as part of nested
     virtualization enablement

   - Support for userspace changes to the guest CTR_EL0 value, enabling
     (in part) migration of VMs between heterogenous hardware

   - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
     of the protocol

   - FPSIMD/SVE support for nested, including merged trap configuration
     and exception routing

   - New command-line parameter to control the WFx trap behavior under
     KVM

   - Introduce kCFI hardening in the EL2 hypervisor

   - Fixes + cleanups for handling presence/absence of FEAT_TCRX

   - Miscellaneous fixes + documentation updates

  LoongArch:

   - Add paravirt steal time support

   - Add support for KVM_DIRTY_LOG_INITIALLY_SET

   - Add perf kvm-stat support for loongarch

  RISC-V:

   - Redirect AMO load/store access fault traps to guest

   - perf kvm stat support

   - Use guest files for IMSIC virtualization, when available

  s390:

   - Assortment of tiny fixes which are not time critical

  x86:

   - Fixes for Xen emulation

   - Add a global struct to consolidate tracking of host values, e.g.
     EFER

   - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
     effective APIC bus frequency, because TDX

   - Print the name of the APICv/AVIC inhibits in the relevant
     tracepoint

   - Clean up KVM's handling of vendor specific emulation to
     consistently act on "compatible with Intel/AMD", versus checking
     for a specific vendor

   - Drop MTRR virtualization, and instead always honor guest PAT on
     CPUs that support self-snoop

   - Update to the newfangled Intel CPU FMS infrastructure

   - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
     it reads '0' and writes from userspace are ignored

   - Misc cleanups

  x86 - MMU:

   - Small cleanups, renames and refactoring extracted from the upcoming
     Intel TDX support

   - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
     that can't hold leafs SPTEs

   - Unconditionally drop mmu_lock when allocating TDP MMU page tables
     for eager page splitting, to avoid stalling vCPUs when splitting
     huge pages

   - Bug the VM instead of simply warning if KVM tries to split a SPTE
     that is non-present or not-huge. KVM is guaranteed to end up in a
     broken state because the callers fully expect a valid SPTE, it's
     all but dangerous to let more MMU changes happen afterwards

  x86 - AMD:

   - Make per-CPU save_area allocations NUMA-aware

   - Force sev_es_host_save_area() to be inlined to avoid calling into
     an instrumentable function from noinstr code

   - Base support for running SEV-SNP guests. API-wise, this includes a
     new KVM_X86_SNP_VM type, encrypting/measure the initial image into
     guest memory, and finalizing it before launching it. Internally,
     there are some gmem/mmu hooks needed to prepare gmem-allocated
     pages before mapping them into guest private memory ranges

     This includes basic support for attestation guest requests, enough
     to say that KVM supports the GHCB 2.0 specification

     There is no support yet for loading into the firmware those signing
     keys to be used for attestation requests, and therefore no need yet
     for the host to provide certificate data for those keys.

     To support fetching certificate data from userspace, a new KVM exit
     type will be needed to handle fetching the certificate from
     userspace.

     An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
     exit type to handle this was introduced in v1 of this patchset, but
     is still being discussed by community, so for now this patchset
     only implements a stub version of SNP Extended Guest Requests that
     does not provide certificate data

  x86 - Intel:

   - Remove an unnecessary EPT TLB flush when enabling hardware

   - Fix a series of bugs that cause KVM to fail to detect nested
     pending posted interrupts as valid wake eents for a vCPU executing
     HLT in L2 (with HLT-exiting disable by L1)

   - KVM: x86: Suppress MMIO that is triggered during task switch
     emulation

     Explicitly suppress userspace emulated MMIO exits that are
     triggered when emulating a task switch as KVM doesn't support
     userspace MMIO during complex (multi-step) emulation

     Silently ignoring the exit request can result in the
     WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
     for some other reason prior to purging mmio_needed

     See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write
     exits if emulator detects exception") for more details on KVM's
     limitations with respect to emulated MMIO during complex emulator
     flows

  Generic:

   - Rename the AS_UNMOVABLE flag that was introduced for KVM to
     AS_INACCESSIBLE, because the special casing needed by these pages
     is not due to just unmovability (and in fact they are only
     unmovable because the CPU cannot access them)

   - New ioctl to populate the KVM page tables in advance, which is
     useful to mitigate KVM page faults during guest boot or after live
     migration. The code will also be used by TDX, but (probably) not
     through the ioctl

   - Enable halt poll shrinking by default, as Intel found it to be a
     clear win

   - Setup empty IRQ routing when creating a VM to avoid having to
     synchronize SRCU when creating a split IRQCHIP on x86

   - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
     a flag that arch code can use for hooking both sched_in() and
     sched_out()

   - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
     truncating a bogus value from userspace, e.g. to help userspace
     detect bugs

   - Mark a vCPU as preempted if and only if it's scheduled out while in
     the KVM_RUN loop, e.g. to avoid marking it preempted and thus
     writing guest memory when retrieving guest state during live
     migration blackout

  Selftests:

   - Remove dead code in the memslot modification stress test

   - Treat "branch instructions retired" as supported on all AMD Family
     17h+ CPUs

   - Print the guest pseudo-RNG seed only when it changes, to avoid
     spamming the log for tests that create lots of VMs

   - Make the PMU counters test less flaky when counting LLC cache
     misses by doing CLFLUSH{OPT} in every loop iteration"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  crypto: ccp: Add the SNP_VLEK_LOAD command
  KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
  KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
  KVM: x86: Replace static_call_cond() with static_call()
  KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  x86/sev: Move sev_guest.h into common SEV header
  KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  KVM: x86: Suppress MMIO that is triggered during task switch emulation
  KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
  KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
  KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
  KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
  KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
  KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
  perf kvm: Add kvm-stat for loongarch64
  LoongArch: KVM: Add PV steal time support in guest side
  ...
2024-07-20 12:41:03 -07:00
Paolo Bonzini
86014c1e20 Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD
KVM generic changes for 6.11

 - Enable halt poll shrinking by default, as Intel found it to be a clear win.

 - Setup empty IRQ routing when creating a VM to avoid having to synchronize
   SRCU when creating a split IRQCHIP on x86.

 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
   that arch code can use for hooking both sched_in() and sched_out().

 - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
   truncating a bogus value from userspace, e.g. to help userspace detect bugs.

 - Mark a vCPU as preempted if and only if it's scheduled out while in the
   KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
   memory when retrieving guest state during live migration blackout.

 - A few minor cleanups
2024-07-16 09:51:36 -04:00
Paolo Bonzini
1c5a0b55ab Merge tag 'kvmarm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.11

 - Initial infrastructure for shadow stage-2 MMUs, as part of nested
   virtualization enablement

 - Support for userspace changes to the guest CTR_EL0 value, enabling
   (in part) migration of VMs between heterogenous hardware

 - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
   the protocol

 - FPSIMD/SVE support for nested, including merged trap configuration
   and exception routing

 - New command-line parameter to control the WFx trap behavior under KVM

 - Introduce kCFI hardening in the EL2 hypervisor

 - Fixes + cleanups for handling presence/absence of FEAT_TCRX

 - Miscellaneous fixes + documentation updates
2024-07-16 09:50:44 -04:00
Linus Torvalds
c89d780cc1 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
 "The biggest part is the virtual CPU hotplug that touches ACPI,
  irqchip. We also have some GICv3 optimisation for pseudo-NMIs that has
  been queued via the arm64 tree. Otherwise the usual perf updates,
  kselftest, various small cleanups.

  Core:

   - Virtual CPU hotplug support for arm64 ACPI systems

   - cpufeature infrastructure cleanups and making the FEAT_ECBHB ID
     bits visible to guests

   - CPU errata: expand the speculative SSBS workaround to more CPUs

   - GICv3, use compile-time PMR values: optimise the way regular IRQs
     are masked/unmasked when GICv3 pseudo-NMIs are used, removing the
     need for a static key in fast paths by using a priority value
     chosen dynamically at boot time

  ACPI:

   - 'acpi=nospcr' option to disable SPCR as default console for arm64

   - Move some ACPI code (cpuidle, FFH) to drivers/acpi/arm64/

  Perf updates:

   - Rework of the IMX PMU driver to enable support for I.MX95

   - Enable support for tertiary match groups in the CMN PMU driver

   - Initial refactoring of the CPU PMU code to prepare for the fixed
     instruction counter introduced by Arm v9.4

   - Add missing PMU driver MODULE_DESCRIPTION() strings

   - Hook up DT compatibles for recent CPU PMUs

  Kselftest updates:

   - Kernel mode NEON fp-stress

   - Cleanups, spelling mistakes

  Miscellaneous:

   - arm64 Documentation update with a minor clarification on TBI

   - Fix missing IPI statistics

   - Implement raw_smp_processor_id() using thread_info rather than a
     per-CPU variable (better code generation)

   - Make MTE checking of in-kernel asynchronous tag faults conditional
     on KASAN being enabled

   - Minor cleanups, typos"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (69 commits)
  selftests: arm64: tags: remove the result script
  selftests: arm64: tags_test: conform test to TAP output
  perf: add missing MODULE_DESCRIPTION() macros
  arm64: smp: Fix missing IPI statistics
  irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP and !ACPI
  ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64
  Documentation: arm64: Update memory.rst for TBI
  arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1
  KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
  perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
  perf: arm_v6/7_pmu: Drop non-DT probe support
  perf/arm: Move 32-bit PMU drivers to drivers/perf/
  perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
  perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold
  arm64: Kconfig: Fix dependencies to enable ACPI_HOTPLUG_CPU
  perf: imx_perf: add support for i.MX95 platform
  perf: imx_perf: fix counter start and config sequence
  perf: imx_perf: refactor driver for imx93
  perf: imx_perf: let the driver manage the counter usage rather the user
  perf: imx_perf: add macro definitions for parsing config attr
  ...
2024-07-15 17:06:19 -07:00
Oliver Upton
bc2e3253ca Merge branch kvm-arm64/nv-tcr2 into kvmarm/next
* kvm-arm64/nv-tcr2:
  : Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier
  :
  : Series addresses a couple gaps that are present in KVM (from cover
  : letter):
  :
  :   - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly
  :     save/restore stuff.
  :
  :   - trap bit description and routing: none, obviously, since we make a
  :     point in not trapping.
  KVM: arm64: Honor trap routing for TCR2_EL1
  KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX
  KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features
  KVM: arm64: Get rid of HCRX_GUEST_FLAGS
  KVM: arm64: Correctly honor the presence of FEAT_TCRX

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14 00:28:37 +00:00
Oliver Upton
8c2899e770 Merge branch kvm-arm64/nv-sve into kvmarm/next
* kvm-arm64/nv-sve:
  : CPTR_EL2, FPSIMD/SVE support for nested
  :
  : This series brings support for honoring the guest hypervisor's CPTR_EL2
  : trap configuration when running a nested guest, along with support for
  : FPSIMD/SVE usage at L1 and L2.
  KVM: arm64: Allow the use of SVE+NV
  KVM: arm64: nv: Add additional trap setup for CPTR_EL2
  KVM: arm64: nv: Add trap description for CPTR_EL2
  KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper
  KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
  KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
  KVM: arm64: nv: Handle CPACR_EL1 traps
  KVM: arm64: Spin off helper for programming CPTR traps
  KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
  KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
  KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
  KVM: arm64: nv: Load guest hyp's ZCR into EL1 state
  KVM: arm64: nv: Handle ZCR_EL2 traps
  KVM: arm64: nv: Forward SVE traps to guest hypervisor
  KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14 00:27:06 +00:00
Oliver Upton
1270dad310 Merge branch kvm-arm64/el2-kcfi into kvmarm/next
* kvm-arm64/el2-kcfi:
  : kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi
  :
  : Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect
  : branches in the EL2 hypervisor. Unlike kernel support for the feature,
  : CFI failures at EL2 are always fatal.
  KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
  KVM: arm64: Introduce print_nvhe_hyp_panic helper
  arm64: Introduce esr_brk_comment, esr_is_cfi_brk
  KVM: arm64: VHE: Mark __hyp_call_panic __noreturn
  KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
  KVM: arm64: nVHE: Simplify invalid_host_el2_vect
  KVM: arm64: Fix __pkvm_init_switch_pgd call ABI
  KVM: arm64: Fix clobbered ELR in sync abort/SError

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14 00:23:32 +00:00
Oliver Upton
377d0e5d77 Merge branch kvm-arm64/ctr-el0 into kvmarm/next
* kvm-arm64/ctr-el0:
  : Support for user changes to CTR_EL0, courtesy of Sebastian Ott
  :
  : Allow userspace to change the guest-visible value of CTR_EL0 for a VM,
  : so long as the requested value represents a subset of features supported
  : by hardware. In other words, prevent the VMM from over-promising the
  : capabilities of hardware.
  :
  : Make this happen by fitting CTR_EL0 into the existing infrastructure for
  : feature ID registers.
  KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset
  KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking
  KVM: selftests: arm64: Test writes to CTR_EL0
  KVM: arm64: rename functions for invariant sys regs
  KVM: arm64: show writable masks for feature registers
  KVM: arm64: Treat CTR_EL0 as a VM feature ID register
  KVM: arm64: unify code to prepare traps
  KVM: arm64: nv: Use accessors for modifying ID registers
  KVM: arm64: Add helper for writing ID regs
  KVM: arm64: Use read-only helper for reading VM ID registers
  KVM: arm64: Make idregs debugfs iterator search sysreg table directly
  KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show()

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14 00:22:32 +00:00
Oliver Upton
435a9f60ed Merge branch kvm-arm64/shadow-mmu into kvmarm/next
* kvm-arm64/shadow-mmu:
  : Shadow stage-2 MMU support for NV, courtesy of Marc Zyngier
  :
  : Initial implementation of shadow stage-2 page tables to support a guest
  : hypervisor. In the author's words:
  :
  :   So here's the 10000m (approximately 30000ft for those of you stuck
  :   with the wrong units) view of what this is doing:
  :
  :     - for each {VMID,VTTBR,VTCR} tuple the guest uses, we use a
  :       separate shadow s2_mmu context. This context has its own "real"
  :       VMID and a set of page tables that are the combination of the
  :       guest's S2 and the host S2, built dynamically one fault at a time.
  :
  :     - these shadow S2 contexts are ephemeral, and behave exactly as
  :       TLBs. For all intent and purposes, they *are* TLBs, and we discard
  :       them pretty often.
  :
  :     - TLB invalidation takes three possible paths:
  :
  :       * either this is an EL2 S1 invalidation, and we directly emulate
  :         it as early as possible
  :
  :       * or this is an EL1 S1 invalidation, and we need to apply it to
  :         the shadow S2s (plural!) that match the VMID set by the L1 guest
  :
  :       * or finally, this is affecting S2, and we need to teardown the
  :         corresponding part of the shadow S2s, which invalidates the TLBs
  KVM: arm64: nv: Truely enable nXS TLBI operations
  KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations
  KVM: arm64: nv: Add handling of range-based TLBI operations
  KVM: arm64: nv: Add handling of outer-shareable TLBI operations
  KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
  KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level
  KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations
  KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations
  KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations
  KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations
  KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1
  KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation
  KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives
  KVM: arm64: nv: Unmap/flush shadow stage 2 page tables
  KVM: arm64: nv: Handle shadow stage 2 page faults
  KVM: arm64: nv: Implement nested Stage-2 page table walk logic
  KVM: arm64: nv: Support multiple nested Stage-2 mmu structures

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14 00:11:45 +00:00
Oliver Upton
a35d5b2032 Merge branch kvm-arm64/ffa-1p1 into kvmarm/next
* kvm-arm64/ffa-1p1:
  : Improvements to the pKVM FF-A Proxy, courtesy of Sebastian Ene
  :
  : Various minor improvements to how host FF-A calls are proxied with the
  : TEE, along with support for v1.1 of the protocol.
  KVM: arm64: Use FF-A 1.1 with pKVM
  KVM: arm64: Update the identification range for the FF-A smcs
  KVM: arm64: Add support for FFA_PARTITION_INFO_GET
  KVM: arm64: Trap FFA_VERSION host call in pKVM

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14 00:11:34 +00:00
Catalin Marinas
3346c56685 Merge branches 'for-next/cpufeature', 'for-next/misc', 'for-next/kselftest', 'for-next/mte', 'for-next/errata', 'for-next/acpi', 'for-next/gic-v3-pmr' and 'for-next/doc', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  perf: add missing MODULE_DESCRIPTION() macros
  perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
  perf: arm_v6/7_pmu: Drop non-DT probe support
  perf/arm: Move 32-bit PMU drivers to drivers/perf/
  perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
  perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold
  perf: imx_perf: add support for i.MX95 platform
  perf: imx_perf: fix counter start and config sequence
  perf: imx_perf: refactor driver for imx93
  perf: imx_perf: let the driver manage the counter usage rather the user
  perf: imx_perf: add macro definitions for parsing config attr
  dt-bindings: perf: fsl-imx-ddr: Add i.MX95 compatible
  perf: pmuv3: Add new Cortex and Neoverse PMUs
  dt-bindings: arm: pmu: Add new Cortex and Neoverse cores
  perf/arm-cmn: Enable support for tertiary match group
  perf/arm-cmn: Decouple wp_config registers from filter group number

* for-next/cpufeature:
  : Various cpufeature infrastructure patches
  arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1
  KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
  arm64/cpufeatures/kvm: Add ARMv8.9 FEAT_ECBHB bits in ID_AA64MMFR1 register

* for-next/misc:
  : Miscellaneous patches
  arm64: smp: Fix missing IPI statistics
  arm64: Cleanup __cpu_set_tcr_t0sz()
  arm64/mm: Stop using ESR_ELx_FSC_TYPE during fault
  arm64: Kconfig: fix typo in __builtin_return_adddress
  ARM64: reloc_test: add missing MODULE_DESCRIPTION() macro
  arm64: implement raw_smp_processor_id() using thread_info
  arm64/arch_timer: include <linux/percpu.h>

* for-next/kselftest:
  : arm64 kselftest updates
  selftests: arm64: tags: remove the result script
  selftests: arm64: tags_test: conform test to TAP output
  kselftest/arm64: Fix a couple of spelling mistakes
  kselftest/arm64: Fix redundancy of a testcase
  kselftest/arm64: Include kernel mode NEON in fp-stress

* for-next/mte:
  : MTE updates
  arm64: mte: Make mte_check_tfsr_*() conditional on KASAN instead of MTE

* for-next/errata:
  : Arm CPU errata workarounds
  arm64: errata: Expand speculative SSBS workaround
  arm64: errata: Unify speculative SSBS errata logic
  arm64: cputype: Add Cortex-X925 definitions
  arm64: cputype: Add Cortex-A720 definitions
  arm64: cputype: Add Cortex-X3 definitions

* for-next/acpi:
  : arm64 ACPI patches
  ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64
  ACPI / amba: Drop unnecessary check for registered amba_dummy_clk
  arm64: FFH: Move ACPI specific code into drivers/acpi/arm64/
  arm64: cpuidle: Move ACPI specific code into drivers/acpi/arm64/
  ACPI: arm64: Sort entries alphabetically

* for-next/gic-v3-pmr:
  : arm64: irqchip/gic-v3: Use compiletime constant PMR values
  arm64: irqchip/gic-v3: Select priorities at boot time
  irqchip/gic-v3: Detect GICD_CTRL.DS and SCR_EL3.FIQ earlier
  irqchip/gic-v3: Make distributor priorities variables
  irqchip/gic-common: Remove sync_access callback
  wordpart.h: Add REPEAT_BYTE_U32()

* for-next/doc:
  : arm64 documentation updates
  Documentation: arm64: Update memory.rst for TBI
2024-07-11 19:09:35 +01:00
Oliver Upton
cb52b5c8b8 Revert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity"
This reverts commit eb9d53d4a9.

As Marc pointed out on the list [*], this patch is wrong, and those who
find themselves in the SOB chain should have their heads checked.

Annoyingly, the architecture has some FGT trap bits that are negative
(i.e. 0 implies trap), and there was some confusion how KVM handles
this for nested guests. However, it is clear now that KVM honors the
RES0-ness of FGT traps already, meaning traps for features never exposed
to the guest hypervisor get handled at L0. As they should.

Link: https://lore.kernel.org/kvmarm/86bk3c3uss.wl-maz@kernel.org/T/#mb9abb3dd79f6a4544a91cb35676bd637c3a5e836
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-08 19:50:51 +00:00
Anshuman Khandual
42b9fed388 KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and
ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields
from ID_AA64PFR0_EL1 sysreg definition.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240613102710.3295108-2-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-04 12:21:12 +01:00
Marc Zyngier
3cfde36df7 KVM: arm64: nv: Truely enable nXS TLBI operations
Although we now have support for nXS-flavoured TLBI instructions,
we still don't expose the feature to the guest thanks to a mixture
of misleading comment and use of a bunch of magic values.

Fix the comment and correctly express the masking of LS64, which
is enough to expose nXS to the world. Not that anyone cares...

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240703154743.824824-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-03 22:46:14 +00:00
Rob Herring (Arm)
d688ffa269 perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
The arm64 asm/arm_pmuv3.h depends on defines from
linux/perf/arm_pmuv3.h. Rather than depend on include order, follow the
usual pattern of "linux" headers including "asm" headers of the same
name.

With this change, the include of linux/kvm_host.h is problematic due to
circular includes:

In file included from ../arch/arm64/include/asm/arm_pmuv3.h:9,
                 from ../include/linux/perf/arm_pmuv3.h:312,
                 from ../include/kvm/arm_pmu.h:11,
                 from ../arch/arm64/include/asm/kvm_host.h:38,
                 from ../arch/arm64/mm/init.c:41:
../include/linux/kvm_host.h:383:30: error: field 'arch' has incomplete type

Switching to asm/kvm_host.h solves the issue.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-5-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03 14:07:14 +01:00
Marc Zyngier
91e9cc70b7 KVM: arm64: Honor trap routing for TCR2_EL1
TCR2_EL1 handling is missing the handling of its trap configuration:

- HCRX_EL2.TCR2En must be handled in conjunction with HCR_EL2.{TVM,TRVM}

- HFG{R,W}TR_EL2.TCR_EL1 does apply to TCR2_EL1 as well

Without these two controls being implemented, it is impossible to
correctly route TCR2_EL1 traps.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-27 00:04:25 +00:00
Marc Zyngier
663abf04ee KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX
As per the architecture, if FEAT_S1PIE is implemented, then FEAT_TCRX
must be implemented as well.

Take advantage of this to avoid checking for S1PIE when TCRX isn't
implemented.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-27 00:04:25 +00:00
Marc Zyngier
1b04fd4027 KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features
As for other registers, save/restore of TCR2_EL1 should be gated
on the feature being actually present.

In the case of a nVHE hypervisor, it is perfectly fine to leave
the host value in the register, as HCRX_EL2.TCREn==0 imposes that
TCR2_EL1 is treated as 0.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-27 00:04:25 +00:00
Marc Zyngier
a3ee9ce88b KVM: arm64: Get rid of HCRX_GUEST_FLAGS
HCRX_GUEST_FLAGS gives random KVM hackers the impression that
they can stuff bits in this macro and unconditionally enable
features in the guest.

In general, this is wrong (we have been there with FEAT_MOPS,
and again with FEAT_TCRX).

Document that HCRX_EL2.SMPME is an exception rather than the rule,
and get rid of HCRX_GUEST_FLAGS.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240625130042.259175-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-27 00:04:25 +00:00
Marc Zyngier
9b58e665d6 KVM: arm64: Correctly honor the presence of FEAT_TCRX
We currently blindly enable TCR2_EL1 use in a guest, irrespective
of the feature set. This is obviously wrong, and we should actually
honor the guest configuration and handle the possible trap resulting
from the guest being buggy.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240625130042.259175-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-27 00:04:25 +00:00
Oliver Upton
33d85a93c6 KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking
Marc reports that L1 VMs aren't booting with the NV series applied to
today's kvmarm/next. After bisecting the issue, it appears that
44241f34fa ("KVM: arm64: nv: Use accessors for modifying ID
registers") is to blame.

Poking around at the issue a bit further, it'd appear that the value for
ID_AA64PFR0_EL1 is complete garbage, as 'val' still contains the value
we set ID_AA64ISAR1_EL1 to.

Fix the read-modify-write pattern to actually use ID_AA64PFR0_EL1 as the
starting point. Excuse me as I return to my shame cube.

Reported-by: Marc Zyngier <maz@kernel.org>
Fixes: 44241f34fa ("KVM: arm64: nv: Use accessors for modifying ID registers")
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240621224044.2465901-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-22 17:21:50 +00:00
Oliver Upton
f1ee914fb6 KVM: arm64: Allow the use of SVE+NV
Allow SVE and NV to mix now that everything is in place to handle it
correctly.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-16-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:04:49 +00:00
Marc Zyngier
cd931bd609 KVM: arm64: nv: Add additional trap setup for CPTR_EL2
We need to teach KVM a couple of new tricks. CPTR_EL2 and its
VHE accessor CPACR_EL1 need to be handled specially:

- CPACR_EL1 is trapped on VHE so that we can track the TCPAC
  and TTA bits

- CPTR_EL2.{TCPAC,E0POE} are propagated from L1 to L2

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-15-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:04:49 +00:00
Marc Zyngier
e19d533126 KVM: arm64: nv: Add trap description for CPTR_EL2
Add trap description for CPTR_EL2.{TCPAC,TAM,E0POE,TTA}.

TTA is a bit annoying as it changes location depending on E2H.
This forces us to add yet another "complex" trap condition.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-14-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:04:49 +00:00
Oliver Upton
5326303bb7 KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
Start folding the guest hypervisor's FP/SVE traps into the value
programmed in hardware. Note that as of writing this is dead code, since
KVM does a full put() / load() for every nested exception boundary which
saves + flushes the FP/SVE state.

However, this will become useful when we can keep the guest's FP/SVE
state alive across a nested exception boundary and the host no longer
needs to conservatively program traps.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-12-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:04:49 +00:00
Oliver Upton
0cfc85b8f5 KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path
when the guest hypervisor tries to access SVE state.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-11-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:04:49 +00:00
Marc Zyngier
493da2b1c4 KVM: arm64: nv: Handle CPACR_EL1 traps
Handle CPACR_EL1 accesses when running a VHE guest. In order to
limit the cost of the emulation, implement it ass a shallow exit.

In the other cases:

- this is a nVHE L1 which will write to memory, and we don't trap

- this is a L2 guest:

  * the L1 has CPTR_EL2.TCPAC==0, and the L2 has direct register
   access

  * the L1 has CPTR_EL2.TCPAC==1, and the L2 will trap, but the
    handling is defered to the general handling for forwarding

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-10-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:04:49 +00:00
Oliver Upton
1785f020b1 KVM: arm64: Spin off helper for programming CPTR traps
A subsequent change to KVM will add preliminary support for merging a
guest hypervisor's CPTR traps with that of KVM. Prepare by spinning off
a new helper for managing CPTR traps.

Avoid reading CPACR_EL1 for the baseline trap config, and start off with
the most restrictive set of traps that is subsequently relaxed.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-9-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:04:10 +00:00
Oliver Upton
2e3cf82063 KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
It is possible that the guest hypervisor has selected a smaller VL than
the maximum for its nested guest. As such, ZCR_EL2 may be configured for
a different VL when exiting a nested guest.

Set ZCR_EL2 (via the EL1 alias) to the maximum VL for the VM before
saving SVE state as the SVE save area is dimensioned by the max VL.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-8-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:02:40 +00:00
Oliver Upton
9092aca9fe KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
The max VL for nested guests is additionally constrained by the max VL
selected by the guest hypervisor. Use that instead of KVM's max VL when
running a nested guest.

Note that the guest hypervisor's ZCR_EL2 is sanitised against the VM's
max VL at the time of access, so there's no additional handling required
at the time of use.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-7-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:02:40 +00:00
Oliver Upton
b7e5c94264 KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
When running a guest hypervisor, ZCR_EL2 is an alias for the counterpart
EL1 state.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:02:40 +00:00
Oliver Upton
069da3ffda KVM: arm64: nv: Load guest hyp's ZCR into EL1 state
Load the guest hypervisor's ZCR_EL2 into the corresponding EL1 register
when restoring SVE state, as ZCR_EL2 affects the VL in the hypervisor
context.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:02:40 +00:00
Oliver Upton
b3d29a8230 KVM: arm64: nv: Handle ZCR_EL2 traps
Unlike other SVE-related registers, ZCR_EL2 takes a sysreg trap to EL2
when HCR_EL2.NV = 1. KVM still needs to honor the guest hypervisor's
trap configuration, which expects an SVE trap (i.e. ESR_EL2.EC = 0x19)
when CPTR traps are enabled for the vCPU's current context.

Otherwise, if the guest hypervisor has traps disabled, emulate the
access by mapping the requested VL into ZCR_EL1.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:01:20 +00:00
Oliver Upton
399debfc97 KVM: arm64: nv: Forward SVE traps to guest hypervisor
Similar to FPSIMD traps, don't load SVE state if the guest hypervisor
has SVE traps enabled and forward the trap instead. Note that ZCR_EL2
will require some special handling, as it takes a sysreg trap to EL2
when HCR_EL2.NV = 1.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:01:20 +00:00
Jintack Lim
d2b2ecba8d KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor
Give precedence to the guest hypervisor's trap configuration when
routing an FP/ASIMD trap taken to EL2. Take advantage of the
infrastructure for translating CPTR_EL2 into the VHE (i.e. EL1) format
and base the trap decision solely on the VHE view of the register. The
in-memory value of CPTR_EL2 will always be up to date for the guest
hypervisor (more on that later), so just read it directly from memory.

Bury all of this behind a macro keyed off of the CPTR bitfield in
anticipation of supporting other traps (e.g. SVE).

[maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with
 all the hard work actually being done by Chase Conklin]
[ oliver: translate nVHE->VHE format for testing traps; macro for reuse
 in other CPTR_EL2.xEN fields ]

Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:01:20 +00:00
Pierre-Clément Tosi
eca4ba5b6d KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
The compiler implements kCFI by adding type information (u32) above
every function that might be indirectly called and, whenever a function
pointer is called, injects a read-and-compare of that u32 against the
value corresponding to the expected type. In case of a mismatch, a BRK
instruction gets executed. When the hypervisor triggers such an
exception in nVHE, it panics and triggers and exception return to EL1.

Therefore, teach nvhe_hyp_panic_handler() to detect kCFI errors from the
ESR and report them. If necessary, remind the user that EL2 kCFI is not
affected by CONFIG_CFI_PERMISSIVE.

Pass $(CC_FLAGS_CFI) to the compiler when building the nVHE hyp code.

Use SYM_TYPED_FUNC_START() for __pkvm_init_switch_pgd, as nVHE can't
call it directly and must use a PA function pointer from C (because it
is part of the idmap page), which would trigger a kCFI failure if the
type ID wasn't present.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-9-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi
8f3873a395 KVM: arm64: Introduce print_nvhe_hyp_panic helper
Add a helper to display a panic banner soon to also be used for kCFI
failures, to ensure that we remain consistent.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-8-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 17:40:54 +00:00