linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-05 13:29:17 -04:00

Author	SHA1	Message	Date
Sean Christopherson	d008dfdb0e	KVM: x86: Move init-only kvm_x86_ops to separate struct Move the kvm_x86_ops functions that are used only within the scope of kvm_init() into a separate struct, kvm_x86_init_ops. In addition to identifying the init-only functions without resorting to code comments, this also sets the stage for waiting until after ->hardware_setup() to set kvm_x86_ops. Setting kvm_x86_ops after ->hardware_setup() is desirable as many of the hooks are not usable until ->hardware_setup() completes. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321202603.19355-3-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-31 10:48:04 -04:00
Sean Christopherson	b990408537	KVM: Pass kvm_init()'s opaque param to additional arch funcs Pass @opaque to kvm_arch_hardware_setup() and kvm_arch_check_processor_compat() to allow architecture specific code to reference @opaque without having to stash it away in a temporary global variable. This will enable x86 to separate its vendor specific callback ops, which are passed via @opaque, into "init" and "runtime" ops without having to stash away the "init" ops. No functional change intended. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Cornelia Huck <cohuck@redhat.com> #s390 Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321202603.19355-2-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-31 10:48:03 -04:00
Paolo Bonzini	4f4af841f0	Merge tag 'kvm-ppc-next-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD KVM PPC update for 5.7 * Add a capability for enabling secure guests under the Protected Execution Framework ultravisor * Various bug fixes and cleanups.	2020-03-31 10:45:49 -04:00
Paolo Bonzini	cf39d37539	Merge tag 'kvmarm-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for Linux 5.7 - GICv4.1 support - 32bit host removal	2020-03-31 10:44:53 -04:00
Paolo Bonzini	830948eb68	Merge tag 'kvm-s390-next-5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix for error codes - return the proper error to userspace when a signal interrupts the KSM unsharing operation	2020-03-30 09:02:26 -04:00
Christian Borntraeger	7a2653612b	s390/gmap: return proper error code on ksm unsharing If a signal is pending we might return -ENOMEM instead of -EINTR. We should propagate the proper error during KSM unsharing. unmerge_ksm_pages returns -ERESTARTSYS on signal_pending. This gets translated by entry.S to -EINTR. It is important to get this error code so that userspace can retry. To make this clearer we also add -EINTR to the documentation of the PV_ENABLE call, which calls unmerge_ksm_pages. Fixes: `3ac8e38015` ("s390/mm: disable KSM for storage key enabled pages") Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2020-03-27 06:42:53 -04:00
Paolo Bonzini	8bf8961332	Merge tag 'kvm-s390-next-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: cleanups for 5.7 - mark sie control block as 512 byte aligned - use fallthrough;	2020-03-26 05:58:49 -04:00
Sean Christopherson	4b547a869d	KVM: selftests: Fix cosmetic copy-paste error in vm_mem_region_move() Fix a copy-paste typo in a comment and error message. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320205546.2396-3-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-26 05:58:28 -04:00
Sean Christopherson	0774a964ef	KVM: Fix out of range accesses to memslots Reset the LRU slot if it becomes invalid when deleting a memslot to fix an out-of-bounds/use-after-free access when searching through memslots. Explicitly check for there being no used slots in search_memslots(), and in the caller of s390's approximation variant. Fixes: `36947254e5` ("KVM: Dynamically size memslot array based on number of used slots") Reported-by: Qian Cai <cai@lca.pw> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320205546.2396-2-sean.j.christopherson@intel.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-26 05:58:27 -04:00
Wanpeng Li	d5361678e6	KVM: X86: Micro-optimize IPI fastpath delay This patch optimizes the virtual IPI fastpath emulation sequence: write ICR2 send virtual IPI read ICR2 write ICR2 send virtual IPI ==> write ICR write ICR We can observe ~0.67% performance improvement for IPI microbenchmark (https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/) on Skylake server. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1585189202-1708-4-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-26 05:58:26 -04:00
Wanpeng Li	8a1038de11	KVM: X86: Delay read msr data iff writes ICR MSR Delay reading the MSR data until we identify that the guest accesses the ICR MSR, to avoid penalizing all other MSR writes. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1585189202-1708-2-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-26 05:58:25 -04:00
Paul Mackerras	9a5788c615	KVM: PPC: Book3S HV: Add a capability for enabling secure guests At present, on Power systems with Protected Execution Facility hardware and an ultravisor, a KVM guest can transition to being a secure guest at will. Userspace (QEMU) has no way of knowing whether a host system is capable of running secure guests. This will present a problem in future when the ultravisor is capable of migrating secure guests from one host to another, because virtualization management software will have no way to ensure that secure guests only run in domains where all of the hosts can support secure guests. This adds a VM capability which has two functions: (a) userspace can query it to find out whether the host can support secure guests, and (b) userspace can enable it for a guest, which allows that guest to become a secure guest. If userspace does not enable it, KVM will return an error when the ultravisor does the hypercall that indicates that the guest is starting to transition to a secure guest. The ultravisor will then abort the transition and the guest will terminate. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Ram Pai <linuxram@us.ibm.com>	2020-03-26 11:09:04 +11:00
Marc Zyngier	4630505997	Merge tag 'kvm-arm-removal' into kvmarm-master/next Goodbye KVM/arm Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-03-24 17:43:53 +00:00
Marc Zyngier	cc98702c17	Merge branch 'kvm-arm64/gic-v4.1' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-03-24 12:45:27 +00:00
Marc Zyngier	dab4fe3bf6	KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfs The vgic-state debugfs file could do with showing the pending state of the HW-backed SGIs. Plug it into the low-level code. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-24-maz@kernel.org	2020-03-24 12:15:52 +00:00
Marc Zyngier	7bdabad127	KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIs Just like for VLPIs, it is beneficial to avoid trapping on WFI when the vcpu is using the GICv4.1 SGIs. Add such a check to vcpu_clear_wfx_traps(). Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-23-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	d9c3872cd2	KVM: arm64: GICv4.1: Reload VLPI configuration on distributor enable/disable Each time a Group-enable bit gets flipped, the state of these bits needs to be forwarded to the hardware. This is a pretty heavy handed operation, requiring all vcpus to reload their GICv4 configuration. It is thus implemented as a new request type. These enable bits are programmed into the HW by setting the VGrp{0,1}En fields of GICR_VPENDBASER when the vPEs are made resident again. Of course, we only support Group-1 for now... Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-22-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	2291ff2f2a	KVM: arm64: GICv4.1: Plumb SGI implementation selection in the distributor The GICv4.1 architecture gives the hypervisor the option to let the guest choose whether it wants the good old SGIs with an active state, or the new, HW-based ones that do not have one. For this, plumb the configuration of SGIs into the GICv3 MMIO handling, present the GICD_TYPER2.nASSGIcap to the guest, and handle the GICD_CTLR.nASSGIreq setting. In order to be able to deal with the restore of a guest, also apply the GICD_CTLR.nASSGIreq setting at first run so that we can move the restored SGIs to the HW if that's what the guest had selected in a previous life. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-21-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	bacf2c6054	KVM: arm64: GICv4.1: Allow SGIs to switch between HW and SW interrupts In order to let a guest buy in the new, active-less SGIs, we need to be able to switch between the two modes. Handle this by stopping all guest activity, transfer the state from one mode to the other, and resume the guest. Nothing calls this code so far, but a later patch will plug it into the MMIO emulation. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-20-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	ef1820be47	KVM: arm64: GICv4.1: Add direct injection capability to SGI registers Most of the GICv3 emulation code that deals with SGIs now has to be aware of the v4.1 capabilities in order to benefit from it. Add such support, keyed on the interrupt having the hw flag set and being a SGI. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-19-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	9879b79aef	KVM: arm64: GICv4.1: Let doorbells be auto-enabled As GICv4.1 understands the life cycle of doorbells (instead of just randomly firing them at the most inconvenient time), just enable them at irq_request time, and be done with it. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-18-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	009384b380	irqchip/gic-v4.1: Eagerly vmap vPEs Now that we have HW-accelerated SGIs being delivered to VPEs, it becomes required to map the VPEs on all ITSs instead of relying on the lazy approach that we would use when using the ITS-list mechanism. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-17-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	d50676f5ce	irqchip/gic-v4.1: Add VSGI property setup Add the SGI configuration entry point for KVM to use. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-16-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	6d31b6ff98	irqchip/gic-v4.1: Add VSGI allocation/teardown Allocate per-VPE SGIs when initializing the GIC-specific part of the VPE data structure. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-15-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	ae699ad348	irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer In order to hide some of the differences between v4.0 and v4.1, move the doorbell management out of the KVM code, and into the GICv4-specific layer. This allows the calling code to ask for the doorbell when blocking, and otherwise to leave the doorbell permanently disabled. This matches the v4.1 code perfectly, and only results in a minor refactoring of the v4.0 code. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-14-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	05d32df13c	irqchip/gic-v4.1: Plumb set_vcpu_affinity SGI callbacks Just like for vLPIs, there is some configuration information that cannot be directly communicated through the normal irqchip API, and we have to use our good old friend set_vcpu_affinity as a side-band communication mechanism. This is used to configure group and priority for a given vSGI. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-13-maz@kernel.org	2020-03-24 12:15:51 +00:00
Marc Zyngier	7017ff0ee1	irqchip/gic-v4.1: Plumb get/set_irqchip_state SGI callbacks To implement the get/set_irqchip_state callbacks (limited to the PENDING state), we have to use a particular set of hacks: - Reading the pending state is done by using a pair of new redistributor registers (GICR_VSGIR, GICR_VSGIPENDR), which allow the 16 interrupts state to be retrieved. - Setting the pending state is done by generating it as we'd otherwise do for a guest (writing to GITS_SGIR). - Clearing the pending state is done by emitting a VSGI command with the "clear" bit set. This requires some interesting locking though: - When talking to the redistributor, we must make sure that the VPE affinity doesn't change, hence taking the VPE lock. - At the same time, we must ensure that nobody accesses the same redistributor's GICR_VSGIR registers for a different VPE, which would corrupt the reading of the pending bits. We thus take the per-RD spinlock. Much fun. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-12-maz@kernel.org	2020-03-24 12:05:09 +00:00
Marc Zyngier	b4e8d644ec	irqchip/gic-v4.1: Plumb mask/unmask SGI callbacks Implement mask/unmask for virtual SGIs by calling into the configuration helper. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-11-maz@kernel.org	2020-03-24 12:05:09 +00:00
Marc Zyngier	e252cf8a34	irqchip/gic-v4.1: Add initial SGI configuration The GICv4.1 ITS has yet another new command (VSGI) which allows a VPE-targeted SGI to be configured (or have its pending state cleared). Add support for this command and plumb it into the activate irqdomain callback so that it is ready to be used. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-10-maz@kernel.org	2020-03-24 12:05:08 +00:00
Marc Zyngier	166cba7181	irqchip/gic-v4.1: Plumb skeletal VSGI irqchip Since GICv4.1 has the capability to inject 16 SGIs into each VPE, and that I'm keen not to invent too many specific interfaces to manipulate these interrupts, let's pretend that each of these SGIs is an actual Linux interrupt. For that matter, let's introduce a minimal irqchip and irqdomain setup that will get fleshed up in the following patches. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-9-maz@kernel.org	2020-03-24 12:05:04 +00:00
Marc Zyngier	544e56aa63	MAINTAINERS: RIP KVM/arm Drop the KVM/arm entries from the MAINTAINERS file. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-03-24 10:56:05 +00:00
Marc Zyngier	15ff9a39cd	arm: Remove the ability to set HYP vectors outside of the decompressor Although we have to bounce between HYP and SVC to decompress and relocate the kernel, we don't need to be able to use it in the kernel itself. So let's drop the functionality. Since the vectors are never changed, there is no need to reset them either, and nobody calls that stub anyway. The last function (SOFT_RESTART) is still present in order to support kexec. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-03-24 10:56:05 +00:00
Marc Zyngier	59c1d9cc52	arm: Remove GICv3 vgic compatibility macros We used to use a set of macros to provide support of vgic-v3 to 32bit without duplicating everything. We don't need it anymore, so drop it. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com>	2020-03-24 10:56:05 +00:00
Marc Zyngier	3fbb96c054	arm: Remove HYP/Stage-2 page-table support Remove all traces of Stage-2 and HYP page table support. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com>	2020-03-24 10:56:05 +00:00
Marc Zyngier	541ad0150c	arm: Remove 32bit KVM host support That's it. Remove all references to KVM itself, and document that although it is no more, the ABI between SVC and HYP still exists. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com>	2020-03-24 10:56:04 +00:00
Marc Zyngier	bb7c62bcb8	arm: Remove KVM from config files Only one platform is building KVM by default. How crazy! Remove it whilst nobody is watching. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com>	2020-03-24 10:55:50 +00:00
Marc Zyngier	8a90a3228b	arm: Unplug KVM from the build system As we're about to drop KVM/arm on the floor, carefully unplug it from the build system. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com>	2020-03-24 10:55:50 +00:00
Laurent Dufour	377f02d487	KVM: PPC: Book3S HV: H_SVM_INIT_START must call UV_RETURN When the call to UV_REGISTER_MEM_SLOT is failing, for instance because there is not enough free secured memory, the Hypervisor (HV) has to call UV_RETURN to report the error to the Ultravisor (UV). Then the UV will call H_SVM_INIT_ABORT to abort the securing phase and go back to the calling VM. If the kvm->arch.secure_guest is not set, in the return path rfid is called but there is no valid context to get back to the SVM since the Hcall has been routed by the Ultravisor. Move the setting of kvm->arch.secure_guest earlier in kvmppc_h_svm_init_start() so in the return path, UV_RETURN will be called instead of rfid. Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Tested-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2020-03-24 13:08:51 +11:00
Laurent Dufour	8c47b6ff29	KVM: PPC: Book3S HV: Check caller of H_SVM_* Hcalls The Hcalls named H_SVM_* are reserved for the Ultravisor. However, nothing prevents a malicious VM or SVM from calling them. This could lead to weird results and should be filtered out. Checking the Secure bit of the calling MSR ensures that the call is coming from either the Ultravisor or an SVM. But any system calls made from an SVM go through the Ultravisor, and the Ultravisor should filter out these malicious calls. This way, only the Ultravisor is able to make such an Hcall. Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2020-03-24 13:08:51 +11:00
Fabiano Rosas	9bee484b28	KVM: PPC: Book3S HV: Skip kvmppc_uvmem_free if Ultravisor is not supported kvmppc_uvmem_init checks for Ultravisor support and returns early if it is not present. Calling kvmppc_uvmem_free at module exit will cause an Oops: $ modprobe -r kvm-hv Oops: Kernel access of bad area, sig: 11 [#1] <snip> NIP: c000000000789e90 LR: c000000000789e8c CTR: c000000000401030 REGS: c000003fa7bab9a0 TRAP: 0300 Not tainted (5.6.0-rc6-00033-g6c90b86a745a-dirty) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002282 XER: 00000000 CFAR: c000000000dae880 DAR: 0000000000000008 DSISR: 40000000 IRQMASK: 1 GPR00: c000000000789e8c c000003fa7babc30 c0000000016fe500 0000000000000000 GPR04: 0000000000000000 0000000000000006 0000000000000000 c000003faf205c00 GPR08: 0000000000000000 0000000000000001 000000008000002d c00800000ddde140 GPR12: c000000000401030 c000003ffffd9080 0000000000000001 0000000000000000 GPR16: 0000000000000000 0000000000000000 000000013aad0074 000000013aaac978 GPR20: 000000013aad0070 0000000000000000 00007fffd1b37158 0000000000000000 GPR24: 000000014fef0d58 0000000000000000 000000014fef0cf0 0000000000000001 GPR28: 0000000000000000 0000000000000000 c0000000018b2a60 0000000000000000 NIP [c000000000789e90] percpu_ref_kill_and_confirm+0x40/0x170 LR [c000000000789e8c] percpu_ref_kill_and_confirm+0x3c/0x170 Call Trace: [c000003fa7babc30] [c000003faf2064d4] 0xc000003faf2064d4 (unreliable) [c000003fa7babcb0] [c000000000400e8c] dev_pagemap_kill+0x6c/0x80 [c000003fa7babcd0] [c000000000401064] memunmap_pages+0x34/0x2f0 [c000003fa7babd50] [c00800000dddd548] kvmppc_uvmem_free+0x30/0x80 [kvm_hv] [c000003fa7babd80] [c00800000ddcef18] kvmppc_book3s_exit_hv+0x20/0x78 [kvm_hv] [c000003fa7babda0] [c0000000002084d0] sys_delete_module+0x1d0/0x2c0 [c000003fa7babe20] [c00000000000b9d0] system_call+0x5c/0x68 Instruction dump: 3fc2001b fb81ffe0 fba1ffe8 fbe1fff8 7c7f1b78 7c9c2378 3bde4560 7fc3f378 f8010010 f821ff81 486249a1 60000000 <e93f0008> 7c7d1b78 712a0002 40820084 
---[ end trace 5774ef4dc2c98279 ]--- So this patch checks if kvmppc_uvmem_init actually allocated anything before running kvmppc_uvmem_free. Fixes: `ca9f494267` ("KVM: PPC: Book3S HV: Support for running secure guests") Cc: stable@vger.kernel.org # v5.5+ Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Tested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2020-03-24 13:08:51 +11:00
Sean Christopherson	4f6ea0a876	KVM: VMX: Gracefully handle faults on VMXON Gracefully handle faults on VMXON, e.g. #GP due to VMX being disabled by BIOS, instead of letting the fault crash the system. Now that KVM uses cpufeatures to query support instead of reading MSR_IA32_FEAT_CTL directly, it's possible for a bug in a different subsystem to cause KVM to incorrectly attempt VMXON[*]. Crashing the system is especially annoying if the system is configured such that hardware_enable() will be triggered during boot. Opportunistically rename @addr to @vmxon_pointer and use a named param to reference it in the inline assembly. Print 0xdeadbeef in the ultra-"rare" case that reading MSR_IA32_FEAT_CTL also faults. [*] https://lkml.kernel.org/r/20200226231615.13664-1-sean.j.christopherson@intel.com Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321193751.24985-4-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:26 -04:00
Sean Christopherson	d260f9ef50	KVM: VMX: Fold loaded_vmcs_init() into alloc_loaded_vmcs() Subsume loaded_vmcs_init() into alloc_loaded_vmcs(), its only remaining caller, and drop the VMCLEAR on the shadow VMCS, which is guaranteed to be NULL. loaded_vmcs_init() was previously used by loaded_vmcs_clear(), but loaded_vmcs_clear() also subsumed loaded_vmcs_init() to properly handle smp_wmb() with respect to VMCLEAR. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321193751.24985-3-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:26 -04:00
Sean Christopherson	31603d4fc2	KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support VMCLEAR all in-use VMCSes during a crash, even if kdump's NMI shootdown interrupted a KVM update of the percpu in-use VMCS list. Because NMIs are not blocked by disabling IRQs, it's possible that crash_vmclear_local_loaded_vmcss() could be called while the percpu list of VMCSes is being modified, e.g. in the middle of list_add() in vmx_vcpu_load_vmcs(). This potential corner case was called out in the original commit[*], but the analysis of its impact was wrong. Skipping the VMCLEARs is wrong because it all but guarantees that a loaded, and therefore cached, VMCS will live across kexec and corrupt memory in the new kernel. Corruption will occur because the CPU's VMCS cache is non-coherent, i.e. not snooped, and so the writeback of VMCS memory on its eviction will overwrite random memory in the new kernel. The VMCS will live because the NMI shootdown also disables VMX, i.e. the in-progress VMCLEAR will #UD, and existing Intel CPUs do not flush the VMCS cache on VMXOFF. Furthermore, interrupting list_add() and list_del() is safe due to crash_vmclear_local_loaded_vmcss() using forward iteration. list_add() ensures the new entry is not visible to forward iteration unless the entire add completes, via WRITE_ONCE(prev->next, new). A bad "prev" pointer could be observed if the NMI shootdown interrupted list_del() or list_add(), but list_for_each_entry() does not consume ->prev. In addition to removing the temporary disabling of VMCLEAR, open code loaded_vmcs_init() in __loaded_vmcs_clear() and reorder VMCLEAR so that the VMCS is deleted from the list only after it's been VMCLEAR'd. Deleting the VMCS before VMCLEAR would allow a race where the NMI shootdown could arrive between list_del() and vmcs_clear() and thus neither flow would execute a successful VMCLEAR. 
Alternatively, more code could be moved into loaded_vmcs_init(), but that gets rather silly as the only other user, alloc_loaded_vmcs(), doesn't need the smp_wmb() and would need to work around the list_del(). Update the smp_*() comments related to the list manipulation, and opportunistically reword them to improve clarity. [*] https://patchwork.kernel.org/patch/1675731/#3720461 Fixes: `8f536b7697` ("KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321193751.24985-2-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:25 -04:00
Zhenyu Wang	e3747407c4	KVM: x86: Expose fast short REP MOV for supported cpuid For CPUs supporting fast short REP MOV (X86_FEATURE_FSRM), e.g. Icelake, Tigerlake, expose it in KVM supported cpuid as well. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Message-Id: <20200323092236.3703-1-zhenyuw@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:24 -04:00
Stefan Raspl	0c794dcefb	tools/kvm_stat: add command line switch '-c' to log in csv format Add an alternative format that can be more easily used for further processing later on. Note that we add a timestamp in the first column for both, the regular and the new csv format. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Message-Id: <20200306114250.57585-5-raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:21 -04:00
Stefan Raspl	3cbb394d9f	tools/kvm_stat: add command line switch '-s' to set update interval This now controls both, the refresh rate of the interactive mode as well as the logging mode. Which, as a consequence, means that the default of logging mode is now 3s, too (use command line switch '-s' to adjust to your liking). Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Message-Id: <20200306114250.57585-4-raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:20 -04:00
Stefan Raspl	0e6618fba8	tools/kvm_stat: switch to argparse optparse has been deprecated for a while, hence switching over to argparse (which also works with python2). As a consequence, help output has some subtle changes, the most significant one being that the options are all listed explicitly instead of a universal '[options]' indicator. Also, some of the error messages are phrased slightly differently. While at it, squashed a number of minor PEP8 issues. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Message-Id: <20200306114250.57585-3-raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:20 -04:00
Stefan Raspl	eecda7a956	tools/kvm_stat: rework command line sequence and message texts Make sure command line arguments are sorted alphabetically everywhere, and adjusted existing texts for interactive command 's' to become consistent with the long form --set-delay. Throwing in some PEP8 fixes (all cosmetics) for good measure. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Message-Id: <20200306114250.57585-2-raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:44:19 -04:00
Christian Borntraeger	f3dd18d444	KVM: s390: mark sie block as 512 byte aligned The sie block must be aligned to 512 bytes. Mark it as such. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2020-03-23 18:30:33 +01:00
Joe Perches	3b684a420b	KVM: s390: Use fallthrough; Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/d63c86429f3e5aa806aa3e185c97d213904924a5.1583896348.git.joe@perches.com [borntrager@de.ibm.com: Fix link to tool and subject] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2020-03-23 18:30:07 +01:00
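
The tools/kvm_stat commits listed above (0e6618fba8, 3cbb394d9f, 0c794dcefb) describe a move to argparse, a '-s' switch whose default interval is 3s, and a '-c' switch for csv logging with a timestamp in the first column of both formats. A minimal sketch of that shape follows. It is not the kvm_stat source: build_parser, format_row, the long option name --csv, and the space-separated regular format are assumptions made for illustration. Only '-c', '-s', --set-delay, and the 3s default come from the commit messages.

```python
import argparse


def build_parser():
    """Build a small argparse parser modelled on the switches named above."""
    parser = argparse.ArgumentParser(
        description="illustrative kvm_stat-like option handling")
    # '-c' selects csv logging; the long name '--csv' is an assumption.
    parser.add_argument("-c", "--csv", action="store_true", default=False,
                        help="log in csv format")
    # '-s' / '--set-delay' sets the update interval; the commit message
    # states that the default for logging mode is now 3s.
    parser.add_argument("-s", "--set-delay", type=float, default=3.0,
                        metavar="DELAY",
                        help="update interval in seconds")
    return parser


def format_row(timestamp, values, csv):
    """Format one log row with the timestamp in the first column.

    The separator for the regular (non-csv) format is a guess; only the
    position of the timestamp is taken from the commit message.
    """
    fields = [timestamp] + [str(v) for v in values]
    separator = "," if csv else " "
    return separator.join(fields)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(format_row("2020-03-23 15:44:21", [10, 20], args.csv))
```

Running the script with -c switches the single printed row to comma-separated output; with argparse, the help text lists each option explicitly, which matches the change in help output that commit 0e6618fba8 mentions.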

901720 Commits