Commit Graph

168197 Commits

Author SHA1 Message Date
Christoph Hellwig
cfe2ce49b9 Revert "KVM: x86: enable -Werror"
This reverts commit ead68df94d.

Using the -Werror flag breaks the build for me due to mostly harmless
KASAN or similar warnings:

  arch/x86/kvm/x86.c: In function ‘kvm_timer_init’:
  arch/x86/kvm/x86.c:7209:1: error: the frame size of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

Feel free to add a CONFIG_WERROR if you care strong enough, but don't
break peoples builds for absolutely no good reason.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-26 09:59:58 -08:00
Linus Torvalds
c5f8689118 Merge tag 'riscv-for-linux-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
 "This contains a handful of RISC-V related fixes that I've collected
  and would like to target for 5.6-rc4:

   - A fix to set up the PMPs on boot, which allows the kernel to access
     memory on systems that don't set up permissive PMPs before getting
     to Linux. This only effects machine-mode kernels, which currently
     means only NOMMU kernels.

   - A fix to avoid enabling supervisor-mode interrupts when running in
     machine-mode, also only for NOMMU kernels.

   - A pair of fixes to our KASAN support to avoid corrupting memory.

   - A gitignore fix.

  This boots on QEMU's virt board for me"

* tag 'riscv-for-linux-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: adjust the indent
  riscv: allocate a complete page size for each page table
  riscv: Fix gitignore
  RISC-V: Don't enable all interrupts in trap_init()
  riscv: set pmp configuration if kernel is running in M-mode
2020-02-25 10:14:39 -08:00
Linus Torvalds
d67f250e96 Merge branch 'mips-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
 "Here are a few MIPS fixes, and a MAINTAINERS update to hand over MIPS
  maintenance to Thomas Bogendoerfer - this will be my final pull
  request as MIPS maintainer.

  Thanks for your helpful comments, useful corrections & responsiveness
  during the time I've fulfilled the role, and I'm sure I'll pop up
  elsewhere in the tree somewhere down the line"

* 'mips-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MAINTAINERS: Hand MIPS over to Thomas
  MIPS: ingenic: DTS: Fix watchdog nodes
  MIPS: X1000: Fix clock of watchdog node.
  MIPS: vdso: Wrap -mexplicit-relocs in cc-option
  MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()'
  MIPS: cavium_octeon: Fix syncw generation.
  mips: vdso: add build time check that no 'jalr t9' calls left
  MIPS: Disable VDSO time functionality on microMIPS
  mips: vdso: fix 'jalr t9' crash in vdso code
2020-02-25 10:09:41 -08:00
Zong Li
8458ca147c riscv: adjust the indent
Adjust the indent to match Linux coding style.

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-02-24 13:12:53 -08:00
Zong Li
a0a31fd84f riscv: allocate a complete page size for each page table
Each page table should be created by allocating a complete page size
for it. Otherwise, the content of the page table would be corrupted
somewhere through memory allocation which allocates the memory at the
middle of the page table for other use.

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-02-24 13:12:49 -08:00
Linus Torvalds
63623fd449 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Bugfixes, including the fix for CVE-2020-2732 and a few issues found
  by 'make W=1'"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: rstify new ioctls in api.rst
  KVM: nVMX: Check IO instruction VM-exit conditions
  KVM: nVMX: Refactor IO bitmap checks into helper function
  KVM: nVMX: Don't emulate instructions in guest mode
  KVM: nVMX: Emulate MTF when performing instruction emulation
  KVM: fix error handling in svm_hardware_setup
  KVM: SVM: Fix potential memory leak in svm_cpu_init()
  KVM: apic: avoid calculating pending eoi from an uninitialized val
  KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled
  KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1
  kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled
  KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE
  KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow
  KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI
  kvm/emulate: fix a -Werror=cast-function-type
  KVM: x86: fix incorrect comparison in trace event
  KVM: nVMX: Fix some obsolete comments and grammar error
  KVM: x86: fix missing prototypes
  KVM: x86: enable -Werror
2020-02-24 11:48:17 -08:00
Linus Torvalds
c6188dff33 Merge tag 'csky-for-linus-5.6-rc3' of git://github.com/c-sky/csky-linux
Pull csky updates from Guo Ren:
 "Sorry, I missed 5.6-rc1 merge window, but in this pull request the
  most are the fixes and the rests are between fixes and features. The
  only outside modification is the MAINTAINERS file update with our
  mailing list.

   - cache flush implementation fixes

   - ftrace modify panic fix

   - CONFIG_SMP boot problem fix

   - fix pt_regs saving for atomic.S

   - fix fixaddr_init without highmem.

   - fix stack protector support

   - fix fake Tightly-Coupled Memory code compile and use

   - fix some typos and coding convention"

* tag 'csky-for-linus-5.6-rc3' of git://github.com/c-sky/csky-linux: (23 commits)
  csky: Replace <linux/clk-provider.h> by <linux/of_clk.h>
  csky: Implement copy_thread_tls
  csky: Add PCI support
  csky: Minimize defconfig to support buildroot config.fragment
  csky: Add setup_initrd check code
  csky: Cleanup old Kconfig options
  arch/csky: fix some Kconfig typos
  csky: Fixup compile warning for three unimplemented syscalls
  csky: Remove unused cache implementation
  csky: Fixup ftrace modify panic
  csky: Add flush_icache_mm to defer flush icache all
  csky: Optimize abiv2 copy_to_user_page with VM_EXEC
  csky: Enable defer flush_dcache_page for abiv2 cpus (807/810/860)
  csky: Remove unnecessary flush_icache_* implementation
  csky: Support icache flush without specific instructions
  csky/Kconfig: Add Kconfig.platforms to support some drivers
  csky/smp: Fixup boot failed when CONFIG_SMP
  csky: Set regs->usp to kernel sp, when the exception is from kernel
  csky/mm: Fixup export invalid_pte_table symbol
  csky: Separate fixaddr_init from highmem
  ...
2020-02-23 09:37:41 -08:00
Oliver Upton
35a571346a KVM: nVMX: Check IO instruction VM-exit conditions
Consult the 'unconditional IO exiting' and 'use IO bitmaps' VM-execution
controls when checking instruction interception. If the 'use IO bitmaps'
VM-execution control is 1, check the instruction access against the IO
bitmaps to determine if the instruction causes a VM-exit.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-23 10:16:32 +01:00
Oliver Upton
e71237d3ff KVM: nVMX: Refactor IO bitmap checks into helper function
Checks against the IO bitmap are useful for both instruction emulation
and VM-exit reflection. Refactor the IO bitmap checks into a helper
function.

Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-23 10:14:40 +01:00
Paolo Bonzini
07721feee4 KVM: nVMX: Don't emulate instructions in guest mode
vmx_check_intercept is not yet fully implemented. To avoid emulating
instructions disallowed by the L1 hypervisor, refuse to emulate
instructions by default.

Cc: stable@vger.kernel.org
[Made commit, added commit msg - Oliver]
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-23 10:11:55 +01:00
Oliver Upton
5ef8acbdd6 KVM: nVMX: Emulate MTF when performing instruction emulation
Since commit 5f3d45e7f2 ("kvm/x86: add support for
MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
flag processor-based execution control for its L2 guest. KVM simply
forwards any MTF VM-exits to the L1 guest, which works for normal
instruction execution.

However, when KVM needs to emulate an instruction on the behalf of an L2
guest, the monitor trap flag is not emulated. Add the necessary logic to
kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
instruction emulation for L2.

Fixes: 5f3d45e7f2 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-23 09:36:23 +01:00
Li RongQing
dd58f3c95c KVM: fix error handling in svm_hardware_setup
rename svm_hardware_unsetup as svm_hardware_teardown, move
it before svm_hardware_setup, and call it to free all memory
if fail to setup in svm_hardware_setup, otherwise memory will
be leaked

remove __exit attribute for it since it is called in __init
function

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-23 09:34:26 +01:00
Geert Uytterhoeven
99db590b08 csky: Replace <linux/clk-provider.h> by <linux/of_clk.h>
The C-Sky platform code is not a clock provider, and just needs to call
of_clk_init().

Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-23 12:48:55 +08:00
Linus Torvalds
dca132a60f Merge tag 'ras-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS fixes from Thomas Gleixner:
 "Two fixes for the AMD MCE driver:

   - Populate the per CPU MCA bank descriptor pointer only after it has
     been completely set up to prevent a use-after-free in case that one
     of the subsequent initialization step fails

   - Implement a proper release function for the sysfs entries of MCA
     threshold controls instead of freeing the memory right in the CPU
     teardown code, which leads to another use-after-free when the
     associated sysfs file is opened and accessed"

* tag 'ras-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce/amd: Fix kobject lifetime
  x86/mce/amd: Publish the bank pointer only after setup has succeeded
2020-02-22 18:02:10 -08:00
Linus Torvalds
fca1037864 Merge tag 'x86-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Two fixes for x86:

   - Remove the __force_oder definiton from the kaslr boot code as it is
     already defined in the page table code which makes GCC 10 builds
     fail because it changed the default to -fno-common.

   - Address the AMD erratum 1054 concerning the IRPERF capability and
     enable the Instructions Retired fixed counter on machines which are
     not affected by the erratum"

* tag 'x86-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF
  x86/boot/compressed: Don't declare __force_order in kaslr_64.c
2020-02-22 17:08:16 -08:00
Linus Torvalds
591dd4c101 Merge tag 's390-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Remove ieee_emulation_warnings sysctl which is a dead code.

 - Avoid triggering rebuild of the kernel during make install.

 - Enable protected virtualization guest support in default configs.

 - Fix cio_ignore seq_file .next function to increase position index.
   And use kobj_to_dev instead of container_of in cio code.

 - Fix storage block address lists to contain absolute addresses in qdio
   code.

 - Few clang warnings and spelling fixes.

* tag 's390-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/qdio: fill SBALEs with absolute addresses
  s390/qdio: fill SL with absolute addresses
  s390: remove obsolete ieee_emulation_warnings
  s390: make 'install' not depend on vmlinux
  s390/kaslr: Fix casts in get_random
  s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in storage_key_init_range
  s390/pkey/zcrypt: spelling s/crytp/crypt/
  s390/cio: use kobj_to_dev() API
  s390/defconfig: enable CONFIG_PROTECTED_VIRTUALIZATION_GUEST
  s390/cio: cio_ignore_proc_seq_next should increase position index
2020-02-22 10:43:41 -08:00
Linus Torvalds
54dedb5b57 Merge tag 'for-linus-5.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "Two small fixes for Xen:

   - a fix to avoid warnings with new gcc

   - a fix for incorrectly disabled interrupts when calling
     _cond_resched()"

* tag 'for-linus-5.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Enable interrupts when calling _cond_resched()
  x86/xen: Distribute switch variables for initialization
2020-02-21 16:10:10 -08:00
Linus Torvalds
63f01d852c Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "It's all straightforward apart from the changes to mmap()/mremap() in
  relation to their handling of address arguments from userspace with
  non-zero tag bits in the upper byte.

  The change to brk() is necessary to fix a nasty user-visible
  regression in malloc(), but we tightened up mmap() and mremap() at the
  same time because they also allow the user to create virtual aliases
  by accident. It's much less likely than brk() to matter in practice,
  but enforcing the principle of "don't permit the creation of mappings
  using tagged addresses" leads to a straightforward ABI without having
  to worry about the "but what if a crazy program did foo?" aspect of
  things.

  Summary:

   - Fix regression in malloc() caused by ignored address tags in brk()

   - Add missing brackets around argument to untagged_addr() macro

   - Fix clang build when using binutils assembler

   - Fix silly typo in virtual memory map documentation"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  mm: Avoid creating virtual address aliases in brk()/mmap()/mremap()
  docs: arm64: fix trivial spelling enought to enough in memory.rst
  arm64: memory: Add missing brackets to untagged_addr() macro
  arm64: lse: Fix LSE atomics with LLVM
2020-02-21 16:03:36 -08:00
Linus Torvalds
2865936259 Merge tag 'powerpc-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "Some more powerpc fixes for 5.6. This is two weeks worth as I was out
  sick last week:

   - Three fixes for the recently added VMAP_STACK on 32-bit.

   - Three fixes related to hugepages on 8xx (32-bit).

   - A fix for a bug in our transactional memory handling that could
     lead to a kernel crash if we saw a page fault during signal
     delivery.

   - A fix for a deadlock in our PCI EEH (Enhanced Error Handling) code.

   - A couple of other minor fixes.

  Thanks to: Christophe Leroy, Erhard F, Frederic Barrat, Gustavo Luiz
  Duarte, Larry Finger, Leonardo Bras, Oliver O'Halloran, Sam Bobroff"

* tag 'powerpc-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/entry: Fix an #if which should be an #ifdef in entry_32.S
  powerpc/xmon: Fix whitespace handling in getstring()
  powerpc/6xx: Fix power_save_ppc32_restore() with CONFIG_VMAP_STACK
  powerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACK
  powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK
  powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery
  powerpc/8xx: Fix clearing of bits 20-23 in ITLB miss
  powerpc/hugetlb: Fix 8M hugepages on 8xx
  powerpc/hugetlb: Fix 512k hugepages on 8xx with 16k page size
  powerpc/eeh: Fix deadlock handling dead PHB
2020-02-21 15:57:56 -08:00
Miaohe Lin
d80b64ff29 KVM: SVM: Fix potential memory leak in svm_cpu_init()
When kmalloc memory for sd->sev_vmcbs failed, we forget to free the page
held by sd->save_area. Also get rid of the var r as '-ENOMEM' is actually
the only possible outcome here.

Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:06:04 +01:00
Miaohe Lin
23520b2def KVM: apic: avoid calculating pending eoi from an uninitialized val
When pv_eoi_get_user() fails, 'val' may remain uninitialized and the return
value of pv_eoi_get_pending() becomes random. Fix the issue by initializing
the variable.

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:05:44 +01:00
Vitaly Kuznetsov
a444326780 KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled
When apicv is disabled on a vCPU (e.g. by enabling KVM_CAP_HYPERV_SYNIC*),
nothing happens to VMX MSRs on the already existing vCPUs, however, all new
ones are created with PIN_BASED_POSTED_INTR filtered out. This is very
confusing and results in the following picture inside the guest:

$ rdmsr -ax 0x48d
ff00000016
7f00000016
7f00000016
7f00000016

This is observed with QEMU and 4-vCPU guest: QEMU creates vCPU0, does
KVM_CAP_HYPERV_SYNIC2 and then creates the remaining three.

L1 hypervisor may only check CPU0's controls to find out what features
are available and it will be very confused later. Switch to setting
PIN_BASED_POSTED_INTR control based on global 'enable_apicv' setting.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:05:35 +01:00
Vitaly Kuznetsov
91a5f413af KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1
Even when APICv is disabled for L1 it can (and, actually, is) still
available for L2, this means we need to always call
vmx_deliver_nested_posted_interrupt() when attempting an interrupt
delivery.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:05:21 +01:00
Suravee Suthikulpanit
93fd9666c2 kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled
Launching VM w/ AVIC disabled together with pass-through device
results in NULL pointer dereference bug with the following call trace.

    RIP: 0010:svm_refresh_apicv_exec_ctrl+0x17e/0x1a0 [kvm_amd]

    Call Trace:
     kvm_vcpu_update_apicv+0x44/0x60 [kvm]
     kvm_arch_vcpu_ioctl_run+0x3f4/0x1c80 [kvm]
     kvm_vcpu_ioctl+0x3d8/0x650 [kvm]
     do_vfs_ioctl+0xaa/0x660
     ? tomoyo_file_ioctl+0x19/0x20
     ksys_ioctl+0x67/0x90
     __x64_sys_ioctl+0x1a/0x20
     do_syscall_64+0x57/0x190
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

Investigation shows that this is due to the uninitialized usage of
struct vapu_svm.ir_list in the svm_set_pi_irte_mode(), which is
called from svm_refresh_apicv_exec_ctrl().

The ir_list is initialized only if AVIC is enabled. So, fixes by
adding a check if AVIC is enabled in the svm_refresh_apicv_exec_ctrl().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206579
Fixes: 8937d76239 ("kvm: x86: svm: Add support to (de)activate posted interrupts.")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:05:20 +01:00
Xiaoyao Li
624e18f92f KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE
Commit 159348784f ("x86/vmx: Introduce VMX_FEATURES_*") missed
bit 26 (enable user wait and pause) of Secondary Processor-based
VM-Execution Controls.

Add VMX_FEATURE_USR_WAIT_PAUSE flag so that it shows up in /proc/cpuinfo,
and use it to define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE to make them
uniform.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:05:19 +01:00
wanpeng li
c9dfd3fb08 KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow
For the duration of mapping eVMCS, it derefences ->memslots without holding
->srcu or ->slots_lock when accessing hv assist page. This patch fixes it by
moving nested_sync_vmcs12_to_shadow to prepare_guest_switch, where the SRCU
is already taken.

It can be reproduced by running kvm's evmcs_test selftest.

  =============================
  warning: suspicious rcu usage
  5.6.0-rc1+ #53 tainted: g        w ioe
  -----------------------------
  ./include/linux/kvm_host.h:623 suspicious rcu_dereference_check() usage!

  other info that might help us debug this:

   rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by evmcs_test/8507:
   #0: ffff9ddd156d00d0 (&vcpu->mutex){+.+.}, at:
kvm_vcpu_ioctl+0x85/0x680 [kvm]

  stack backtrace:
  cpu: 6 pid: 8507 comm: evmcs_test tainted: g        w ioe     5.6.0-rc1+ #53
  hardware name: dell inc. optiplex 7040/0jctf8, bios 1.4.9 09/12/2016
  call trace:
   dump_stack+0x68/0x9b
   kvm_read_guest_cached+0x11d/0x150 [kvm]
   kvm_hv_get_assist_page+0x33/0x40 [kvm]
   nested_enlightened_vmentry+0x2c/0x60 [kvm_intel]
   nested_vmx_handle_enlightened_vmptrld.part.52+0x32/0x1c0 [kvm_intel]
   nested_sync_vmcs12_to_shadow+0x439/0x680 [kvm_intel]
   vmx_vcpu_run+0x67a/0xe60 [kvm_intel]
   vcpu_enter_guest+0x35e/0x1bc0 [kvm]
   kvm_arch_vcpu_ioctl_run+0x40b/0x670 [kvm]
   kvm_vcpu_ioctl+0x370/0x680 [kvm]
   ksys_ioctl+0x235/0x850
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x77/0x780
   entry_syscall_64_after_hwframe+0x49/0xbe

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:05:03 +01:00
Miaohe Lin
7455a83276 KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI
Commit 13db77347d ("KVM: x86: don't notify userspace IOAPIC on edge
EOI") said, edge-triggered interrupts don't set a bit in TMR, which means
that IOAPIC isn't notified on EOI. And var level indicates level-triggered
interrupt.
But commit 3159d36ad7 ("KVM: x86: use generic function for MSI parsing")
replace var level with irq.level by mistake. Fix it by changing irq.level
to irq.trig_mode.

Cc: stable@vger.kernel.org
Fixes: 3159d36ad7 ("KVM: x86: use generic function for MSI parsing")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-21 18:04:57 +01:00
Guo Ren
0b9f386c4b csky: Implement copy_thread_tls
This is required for clone3 which passes the TLS value through a
struct rather than a register.

Cc: Amanieu d'Antras <amanieu@gmail.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
MaJun
5b49c82dad csky: Add PCI support
Add the pci related code for csky arch to support basic pci virtual
function, such as qemu virt-pci-9pfs.

Signed-off-by: MaJun <majun258@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
Ma Jun
dc2efc0028 csky: Minimize defconfig to support buildroot config.fragment
Some bsp (eg: buildroot) has defconfig.fragment design to add more
configs into the defconfig in linux source code tree. For example,
we could put different cpu configs into different defconfig.fragments,
but they all use the same defconfig in Linux.

Signed-off-by: Ma Jun <majun258@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
Guo Ren
d46869aaab csky: Add setup_initrd check code
We should give some necessary check for initrd just like other
architectures and it seems that setup_initrd() could be a common
code for all architectures.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
Krzysztof Kozlowski
4ec575b785 csky: Cleanup old Kconfig options
CONFIG_CLKSRC_OF is gone since commit bb0eb050a5
("clocksource/drivers: Rename CLKSRC_OF to TIMER_OF").  The platform
already selects TIMER_OF.

CONFIG_HAVE_DMA_API_DEBUG is gone since commit 6e88628d03 ("dma-debug:
remove CONFIG_HAVE_DMA_API_DEBUG").

CONFIG_DEFAULT_DEADLINE is gone since commit f382fb0bce ("block:
remove legacy IO schedulers").

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
Randy Dunlap
bebd26ab62 arch/csky: fix some Kconfig typos
Fix wording in help text for the CPU_HAS_LDSTEX symbol.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
Guo Ren
2305f60b76 csky: Fixup compile warning for three unimplemented syscalls
Implement fstat64, fstatat64, clone3 syscalls to fixup
checksyscalls.sh compile warnings.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
Guo Ren
9025fd48a8 csky: Remove unused cache implementation
Only for coding convention, these codes are unnecessary for abiv2.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:25 +08:00
Guo Ren
359ae00d12 csky: Fixup ftrace modify panic
During ftrace init, linux will replace all function prologues
(call_mcout) with nops, but it need flush_dcache and
invalidate_icache to make it work. So flush_cache functions
couldn't be nested called by ftrace framework.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
997153b9a7 csky: Add flush_icache_mm to defer flush icache all
Some CPUs don't support icache.va instruction to maintain the whole
smp cores' icache. Using icache.all + IPI casue a lot on performace
and using defer mechanism could reduce the number of calling icache
_flush_all functions.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
cc1f6563a9 csky: Optimize abiv2 copy_to_user_page with VM_EXEC
Only when vma is for VM_EXEC, we need sync dcache & icache. eg:
 - gdb ptrace modify user space instruction code area.

Add VM_EXEC condition to reduce unnecessary cache flush.

The abiv1 cpus' cache are all VIPT, so we still need to deal with
dcache aliasing problem. But there is optimized way to use cache
color, just like what's done in arch/csky/abiv1/inc/abi/page.h.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
d936a7e708 csky: Enable defer flush_dcache_page for abiv2 cpus (807/810/860)
Instead of flushing cache per update_mmu_cache() called, we use
flush_dcache_page to reduce the frequency of flashing the cache.

As abiv2 cpus are all PIPT for icache & dcache, we needn't handle
dcache aliasing problem. But their icache can't snoop dcache, so
we still need sync_icache_dcache in update_mmu_cache().

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
a117673413 csky: Remove unnecessary flush_icache_* implementation
The abiv2 CPUs are all PIPT cache, so there is no need to implement
flush_icache_page function.

The function flush_icache_user_range hasn't been used, so just
remove it.

The function flush_cache_range is not necessary for PIPT cache when
tlb mapping changed.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
761b4f694c csky: Support icache flush without specific instructions
Some CPUs don't support icache specific instructions to flush icache
lines in broadcast way. We use cpu control registers to flush local
icache and use IPI to notify other cores.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
a736fa1ed7 csky/Kconfig: Add Kconfig.platforms to support some drivers
Such as snps,dw-apb-ictl

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
c9492737b2 csky/smp: Fixup boot failed when CONFIG_SMP
If we use a non-ipi-support interrupt controller, it will cause panic here.
We should let cpu up and work with CONFIG_SMP, when we use a non-ipi intc.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
f8e17c17b8 csky: Set regs->usp to kernel sp, when the exception is from kernel
In the past, we didn't care about kernel sp when saving pt_reg. But in some
cases, we still need pt_reg->usp to represent the kernel stack before enter
exception.

For cmpxhg in atomic.S, we need save and restore usp for above.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
7f4a567332 csky/mm: Fixup export invalid_pte_table symbol
There is no present bit in csky pmd hardware, so we need to prepare invalid_pte_table
for empty pmd entry and the functions (pmd_none & pmd_present) in pgtable.h need
invalid_pte_talbe to get result. If a module use these functions, we need export the
symbol for it.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Cc: Mo Qihui <qihui.mo@verisilicon.com>
Cc: Zhange Jian <zhang_jian5@dahuatech.com>
2020-02-21 15:43:24 +08:00
Guo Ren
f136008f31 csky: Separate fixaddr_init from highmem
After fixaddr_init is separated from highmem, we could use tcm
without highmem selected. (610 (abiv1) don't support highmem,
but it could use tcm now.)

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Guo Ren
f525bb2c9e csky: Tightly-Coupled Memory or Sram support
The implementation are not only used by TCM but also used by sram on
SOC bus. It follow existed linux tcm software interface, so that old
tcm application codes could be re-used directly.

Software interface list in asm/tcm.h:
 - Variables/Const: 	__tcmdata, __tcmconst
 - Functions:		__tcmfunc, __tcmlocalfunc
 - Malloc/Free:		tcm_alloc, tcm_free

In linux menuconfig:
 - Choose a TCM contain instrctions + data or separated in ITCM/DTCM.
 - Determine TCM_BASE (DTCM_BASE) in phyiscal address.
 - Determine size of TCM or ITCM(DTCM) in page counts.

Here is hello tcm example from Documentation/arm/tcm.rst which could
be directly used:

/* Uninitialized data */
static u32 __tcmdata tcmvar;
/* Initialized data */
static u32 __tcmdata tcmassigned = 0x2BADBABEU;
/* Constant */
static const u32 __tcmconst tcmconst = 0xCAFEBABEU;

static void __tcmlocalfunc tcm_to_tcm(void)
{
	int i;
	for (i = 0; i < 100; i++)
		tcmvar ++;
}

static void __tcmfunc hello_tcm(void)
{
	/* Some abstract code that runs in ITCM */
	int i;
	for (i = 0; i < 100; i++) {
		tcmvar ++;
	}
	tcm_to_tcm();
}

static void __init test_tcm(void)
{
	u32 *tcmem;
	int i;

	hello_tcm();
	printk("Hello TCM executed from ITCM RAM\n");

	printk("TCM variable from testrun: %u @ %p\n", tcmvar, &tcmvar);
	tcmvar = 0xDEADBEEFU;
	printk("TCM variable: 0x%x @ %p\n", tcmvar, &tcmvar);

	printk("TCM assigned variable: 0x%x @ %p\n", tcmassigned, &tcmassigned);

	printk("TCM constant: 0x%x @ %p\n", tcmconst, &tcmconst);

	/* Allocate some TCM memory from the pool */
	tcmem = tcm_alloc(20);
	if (tcmem) {
		printk("TCM Allocated 20 bytes of TCM @ %p\n", tcmem);
		tcmem[0] = 0xDEADBEEFU;
		tcmem[1] = 0x2BADBABEU;
		tcmem[2] = 0xCAFEBABEU;
		tcmem[3] = 0xDEADBEEFU;
		tcmem[4] = 0x2BADBABEU;
		for (i = 0; i < 5; i++)
			printk("TCM tcmem[%d] = %08x\n", i, tcmem[i]);
		tcm_free(tcmem, 20);
	}
}

TODO:
 - Separate fixup mapping from highmem
 - Support abiv1

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Mao Han
2f78c73f78 csky: Initial stack protector support
This is a basic -fstack-protector support without per-task canary
switching. The protector will report something like when stack
corruption is detected:

It's tested with strcpy local array overflow in sys_kill and get:
stack-protector: Kernel stack is corrupted in: sys_kill+0x23c/0x23c

TODO:
 - Support task switch for different cannary

Signed-off-by: Mao Han <han_mao@c-sky.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-02-21 15:43:24 +08:00
Linus Torvalds
ebe7acadf5 Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull IMA fixes from Mimi Zohar:
 "Two bug fixes and an associated change for each.

  The one that adds SM3 to the IMA list of supported hash algorithms is
  a simple change, but could be considered a new feature"

* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: add sm3 algorithm to hash algorithm configuration list
  crypto: rename sm3-256 to sm3 in hash_algo_name
  efi: Only print errors about failing to get certs if EFI vars are found
  x86/ima: use correct identifier for SetupMode variable
2020-02-20 15:15:16 -08:00
Qian Cai
b78a8552d7 kvm/emulate: fix a -Werror=cast-function-type
arch/x86/kvm/emulate.c: In function 'x86_emulate_insn':
arch/x86/kvm/emulate.c:5686:22: error: cast between incompatible
function types from 'int (*)(struct x86_emulate_ctxt *)' to 'void
(*)(struct fastop *)' [-Werror=cast-function-type]
    rc = fastop(ctxt, (fastop_t)ctxt->execute);

Fix it by using an unnamed union of a (*execute) function pointer and a
(*fastop) function pointer.

Fixes: 3009afc6e3 ("KVM: x86: Use a typedef for fastop functions")
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-20 18:13:45 +01:00