Commit Graph

188885 Commits

Author SHA1 Message Date
Linus Torvalds
291073a566 kvm: fix objtool relocation warning
The recent change to make objtool aware of more symbol relocation types
(commit 24ff652573: "objtool: Teach get_alt_entry() about more
relocation types") also added another check, and resulted in this
objtool warning when building kvm on x86:

    arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception

The reason seems to be that kvm_fastop_exception() is marked as a global
symbol, which causes the relocation to ke kept around for objtool.  And
at the same time, the kvm_fastop_exception definition (which is done as
an inline asm statement) doesn't actually set the type of the global,
which then makes objtool unhappy.

The minimal fix is to just not mark kvm_fastop_exception as being a
global symbol.  It's only used in that one compilation unit anyway, so
it was always pointless.  That's how all the other local exception table
labels are done.

I'm not entirely happy about the kinds of games that the kvm code plays
with doing its own exception handling, and the fact that it confused
objtool is most definitely a symptom of the code being a bit too subtle
and ad-hoc.  But at least this trivial one-liner makes objtool no longer
upset about what is going on.

Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-03 13:34:19 -07:00
Linus Torvalds
3a399a2bc4 Merge tag 'perf_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:

 - Make sure the destroy callback is reset when a event initialization
   fails

 - Update the event constraints for Icelake

 - Make sure the active time of an event is updated even for inactive
   events

* tag 'perf_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: fix userpage->time_enabled of inactive events
  perf/x86/intel: Update event constraints for ICX
  perf/x86: Reset destroy callback on event init failure
2021-10-03 10:32:27 -07:00
Linus Torvalds
b2626f1e32 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more kvm fixes from Paolo Bonzini:
 "Small x86 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Ensure all migrations are performed when test is affined
  KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks
  ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm
  x86/kvmclock: Move this_cpu_pvti into kvmclock.h
  selftests: KVM: Don't clobber XMM register when read
  KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
2021-10-01 11:08:07 -07:00
Kan Liang
ecc2123e09 perf/x86/intel: Update event constraints for ICX
According to the latest event list, the event encoding 0xEF is only
available on the first 4 counters. Add it into the event constraints
table.

Fixes: 6017608936 ("perf/x86/intel: Add Icelake support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1632842343-25862-1-git-send-email-kan.liang@linux.intel.com
2021-10-01 13:57:54 +02:00
Anand K Mistry
02d029a41d perf/x86: Reset destroy callback on event init failure
perf_init_event tries multiple init callbacks and does not reset the
event state between tries. When x86_pmu_event_init runs, it
unconditionally sets the destroy callback to hw_perf_event_destroy. On
the next init attempt after x86_pmu_event_init, in perf_try_init_event,
if the pmu's capabilities includes PERF_PMU_CAP_NO_EXCLUDE, the destroy
callback will be run. However, if the next init didn't set the destroy
callback, hw_perf_event_destroy will be run (since the callback wasn't
reset).

Looking at other pmu init functions, the common pattern is to only set
the destroy callback on a successful init. Resetting the callback on
failure tries to replicate that pattern.

This was discovered after commit f11dd0d805 ("perf/x86/amd/ibs: Extend
PERF_PMU_CAP_NO_EXCLUDE to IBS Op") when the second (and only second)
run of the perf tool after a reboot results in 0 samples being
generated. The extra run of hw_perf_event_destroy results in
active_events having an extra decrement on each perf run. The second run
has active_events == 0 and every subsequent run has active_events < 0.
When active_events == 0, the NMI handler will early-out and not record
any samples.

Signed-off-by: Anand K Mistry <amistry@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210929170405.1.I078b98ee7727f9ae9d6df8262bad7e325e40faf0@changeid
2021-10-01 13:57:54 +02:00
Linus Torvalds
4de593fb96 Merge tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from mac80211, netfilter and bpf.

  Current release - regressions:

   - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from
     interrupt

   - mdio: revert mechanical patches which broke handling of optional
     resources

   - dev_addr_list: prevent address duplication

  Previous releases - regressions:

   - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
     (NULL deref)

   - Revert "mac80211: do not use low data rates for data frames with no
     ack flag", fixing broadcast transmissions

   - mac80211: fix use-after-free in CCMP/GCMP RX

   - netfilter: include zone id in tuple hash again, minimize collisions

   - netfilter: nf_tables: unlink table before deleting it (race -> UAF)

   - netfilter: log: work around missing softdep backend module

   - mptcp: don't return sockets in foreign netns

   - sched: flower: protect fl_walk() with rcu (race -> UAF)

   - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup

   - smsc95xx: fix stalled rx after link change

   - enetc: fix the incorrect clearing of IF_MODE bits

   - ipv4: fix rtnexthop len when RTA_FLOW is present

   - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this
     SKU

   - e100: fix length calculation & buffer overrun in ethtool::get_regs

  Previous releases - always broken:

   - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx

   - mac80211: drop frames from invalid MAC address in ad-hoc mode

   - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race
     -> UAF)

   - bpf, x86: Fix bpf mapping of atomic fetch implementation

   - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog

   - netfilter: ip6_tables: zero-initialize fragment offset

   - mhi: fix error path in mhi_net_newlink

   - af_unix: return errno instead of NULL in unix_create1() when over
     the fs.file-max limit

  Misc:

   - bpf: exempt CAP_BPF from checks against bpf_jit_limit

   - netfilter: conntrack: make max chain length random, prevent
     guessing buckets by attackers

   - netfilter: nf_nat_masquerade: make async masq_inet6_event handling
     generic, defer conntrack walk to work queue (prevent hogging RTNL
     lock)"

* tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
  net: stmmac: fix EEE init issue when paired with EEE capable PHYs
  net: dev_addr_list: handle first address in __hw_addr_add_ex
  net: sched: flower: protect fl_walk() with rcu
  net: introduce and use lock_sock_fast_nested()
  net: phy: bcm7xxx: Fixed indirect MMD operations
  net: hns3: disable firmware compatible features when uninstall PF
  net: hns3: fix always enable rx vlan filter problem after selftest
  net: hns3: PF enable promisc for VF when mac table is overflow
  net: hns3: fix show wrong state when add existing uc mac address
  net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
  net: hns3: don't rollback when destroy mqprio fail
  net: hns3: remove tc enable checking
  net: hns3: do not allow call hns3_nic_net_open repeatedly
  ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  net: bridge: mcast: Associate the seqcount with its protecting lock.
  net: mdio-ipq4019: Fix the error for an optional regs resource
  net: hns3: fix hclge_dbg_dump_tm_pg() stack usage
  net: mdio: mscc-miim: Fix the mdio controller
  af_unix: Return errno instead of NULL in unix_create1().
  ...
2021-09-30 14:28:05 -07:00
Sean Christopherson
e8a747d088 KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks
Check whether a CPUID entry's index is significant before checking for a
matching index to hack-a-fix an undefined behavior bug due to consuming
uninitialized data.  RESET/INIT emulation uses kvm_cpuid() to retrieve
CPUID.0x1, which does _not_ have a significant index, and fails to
initialize the dummy variable that doubles as EBX/ECX/EDX output _and_
ECX, a.k.a. index, input.

Practically speaking, it's _extremely_  unlikely any compiler will yield
code that causes problems, as the compiler would need to inline the
kvm_cpuid() call to detect the uninitialized data, and intentionally hose
the kernel, e.g. insert ud2, instead of simply ignoring the result of
the index comparison.

Although the sketchy "dummy" pattern was introduced in SVM by commit
66f7b72e11 ("KVM: x86: Make register state after reset conform to
specification"), it wasn't actually broken until commit 7ff6c03503
("KVM: x86: Remove stateful CPUID handling") arbitrarily swapped the
order of operations such that "index" was checked before the significant
flag.

Avoid consuming uninitialized data by reverting to checking the flag
before the index purely so that the fix can be easily backported; the
offending RESET/INIT code has been refactored, moved, and consolidated
from vendor code to common x86 since the bug was introduced.  A future
patch will directly address the bad RESET/INIT behavior.

The undefined behavior was detected by syzbot + KernelMemorySanitizer.

  BUG: KMSAN: uninit-value in cpuid_entry2_find arch/x86/kvm/cpuid.c:68
  BUG: KMSAN: uninit-value in kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103
  BUG: KMSAN: uninit-value in kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183
   cpuid_entry2_find arch/x86/kvm/cpuid.c:68 [inline]
   kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103 [inline]
   kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183
   kvm_vcpu_reset+0x13fb/0x1c20 arch/x86/kvm/x86.c:10885
   kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923
   vcpu_enter_guest+0xfd2/0x6d80 arch/x86/kvm/x86.c:9534
   vcpu_run+0x7f5/0x18d0 arch/x86/kvm/x86.c:9788
   kvm_arch_vcpu_ioctl_run+0x245b/0x2d10 arch/x86/kvm/x86.c:10020

  Local variable ----dummy@kvm_vcpu_reset created at:
   kvm_vcpu_reset+0x1fb/0x1c20 arch/x86/kvm/x86.c:10812
   kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923

Reported-by: syzbot+f3985126b746b3d59c9d@syzkaller.appspotmail.com
Reported-by: Alexander Potapenko <glider@google.com>
Fixes: 2a24be79b6 ("KVM: VMX: Set EDX at INIT with CPUID.0x1, Family-Model-Stepping")
Fixes: 7ff6c03503 ("KVM: x86: Remove stateful CPUID handling")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20210929222426.1855730-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:20:33 -04:00
Zelin Deng
ad9af93068 x86/kvmclock: Move this_cpu_pvti into kvmclock.h
There're other modules might use hv_clock_per_cpu variable like ptp_kvm,
so move it into kvmclock.h and export the symbol to make it visiable to
other modules.

Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Message-Id: <1632892429-101194-2-git-send-email-zelin.deng@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:08:01 -04:00
Linus Torvalds
6e439bbd43 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This contains fixes for a resource leak in ccp as well as stack
  corruption in x86/sm4"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/sm4 - Fix frame pointer stack corruption
  crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd()
2021-09-29 07:37:46 -07:00
Linus Torvalds
e7bd807e8c Merge tag 'm68k-for-v5.15-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull more m68k updates from Geert Uytterhoeven:

 - signal handling fixes

 - removal of set_fs()

[ The set_fs removal isn't strictly a fix, but it's been pending for a
  while and is very welcome. The signal handling fixes resolved an issue
  that was incorrectly attributed to the set_fs changes    - Linus ]

* tag 'm68k-for-v5.15-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Remove set_fs()
  m68k: Provide __{get,put}_kernel_nofault
  m68k: Factor the 8-byte lowlevel {get,put}_user code into helpers
  m68k: Use BUILD_BUG for passing invalid sizes to get_user/put_user
  m68k: Remove the 030 case in virt_to_phys_slow
  m68k: Document that access_ok is broken for !CONFIG_CPU_HAS_ADDRESS_SPACES
  m68k: Leave stack mangling to asm wrapper of sigreturn()
  m68k: Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
  m68k: Handle arrivals of multiple signals correctly
2021-09-28 13:24:43 -07:00
Linus Torvalds
dca50f08a0 Merge tag 'nios2_fixes_for_v5.15_part1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fixes from Dinh Nguyen:

 - Fix build warning for unmet dependency for EARLY_PRINTK

 - Remove unused dram_start() function

* tag 'nios2_fixes_for_v5.15_part1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  NIOS2: setup.c: drop unused variable 'dram_start'
  NIOS2: fix kconfig unmet dependency warning for SERIAL_CORE_CONSOLE
2021-09-28 13:16:52 -07:00
David S. Miller
4ccb9f03fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-28

The following pull-request contains BPF updates for your *net* tree.

We've added 10 non-merge commits during the last 14 day(s) which contain
a total of 11 files changed, 139 insertions(+), 53 deletions(-).

The main changes are:

1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk.

2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh.

3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann.

4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi.

5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer.

6) Fix return value handling for struct_ops BPF programs, from Hou Tao.

7) Various fixes to BPF selftests, from Jiri Benc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
,
2021-09-28 13:52:46 +01:00
Johan Almbladh
ced185824c bpf, x86: Fix bpf mapping of atomic fetch implementation
Fix the case where the dst register maps to %rax as otherwise this produces
an incorrect mapping with the implementation in 981f94c3e9 ("bpf: Add
bitwise atomic instructions") as %rax is clobbered given it's part of the
cmpxchg as operand.

The issue is similar to b29dd96b90 ("bpf, x86: Fix BPF_FETCH atomic and/or/
xor with r0 as src") just that the case of dst register was missed.

Before, dst=r0 (%rax) src=r2 (%rsi):

  [...]
  c5:   mov    %rax,%r10
  c8:   mov    0x0(%rax),%rax       <---+ (broken)
  cc:   mov    %rax,%r11                |
  cf:   and    %rsi,%r11                |
  d2:   lock cmpxchg %r11,0x0(%rax) <---+
  d8:   jne    0x00000000000000c8       |
  da:   mov    %rax,%rsi                |
  dd:   mov    %r10,%rax                |
  [...]                                 |
                                        |
After, dst=r0 (%rax) src=r2 (%rsi):     |
                                        |
  [...]                                 |
  da:	mov    %rax,%r10                |
  dd:	mov    0x0(%r10),%rax       <---+ (fixed)
  e1:	mov    %rax,%r11                |
  e4:	and    %rsi,%r11                |
  e7:	lock cmpxchg %r11,0x0(%r10) <---+
  ed:	jne    0x00000000000000dd
  ef:	mov    %rax,%rsi
  f2:	mov    %r10,%rax
  [...]

The remaining combinations were fine as-is though:

After, dst=r9 (%r15) src=r0 (%rax):

  [...]
  dc:	mov    %rax,%r10
  df:	mov    0x0(%r15),%rax
  e3:	mov    %rax,%r11
  e6:	and    %r10,%r11
  e9:	lock cmpxchg %r11,0x0(%r15)
  ef:	jne    0x00000000000000df      _
  f1:	mov    %rax,%r10                | (unneeded, but
  f4:	mov    %r10,%rax               _|  not a problem)
  [...]

After, dst=r9 (%r15) src=r4 (%rcx):

  [...]
  de:	mov    %rax,%r10
  e1:	mov    0x0(%r15),%rax
  e5:	mov    %rax,%r11
  e8:	and    %rcx,%r11
  eb:	lock cmpxchg %r11,0x0(%r15)
  f1:	jne    0x00000000000000e1
  f3:	mov    %rax,%rcx
  f6:	mov    %r10,%rax
  [...]

The case of dst == src register is rejected by the verifier and
therefore not supported, but x86 JIT also handles this case just
fine.

After, dst=r0 (%rax) src=r0 (%rax):

  [...]
  eb:	mov    %rax,%r10
  ee:	mov    0x0(%r10),%rax
  f2:	mov    %rax,%r11
  f5:	and    %r10,%r11
  f8:	lock cmpxchg %r11,0x0(%r10)
  fe:	jne    0x00000000000000ee
 100:	mov    %rax,%r10
 103:	mov    %r10,%rax
  [...]

Fixes: 981f94c3e9 ("bpf: Add bitwise atomic instructions")
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-09-28 12:10:29 +02:00
Linus Torvalds
9cccec2bf3 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "A bit late... I got sidetracked by back-from-vacation routines and
  conferences. But most of these patches are already a few weeks old and
  things look more calm on the mailing list than what this pull request
  would suggest.

  x86:

   - missing TLB flush

   - nested virtualization fixes for SMM (secure boot on nested
     hypervisor) and other nested SVM fixes

   - syscall fuzzing fixes

   - live migration fix for AMD SEV

   - mirror VMs now work for SEV-ES too

   - fixes for reset

   - possible out-of-bounds access in IOAPIC emulation

   - fix enlightened VMCS on Windows 2022

  ARM:

   - Add missing FORCE target when building the EL2 object

   - Fix a PMU probe regression on some platforms

  Generic:

   - KCSAN fixes

  selftests:

   - random fixes, mostly for clang compilation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  selftests: KVM: Explicitly use movq to read xmm registers
  selftests: KVM: Call ucall_init when setting up in rseq_test
  KVM: Remove tlbs_dirty
  KVM: X86: Synchronize the shadow pagetable before link it
  KVM: X86: Fix missed remote tlb flush in rmap_write_protect()
  KVM: x86: nSVM: don't copy virt_ext from vmcb12
  KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround
  KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
  KVM: x86: nSVM: restore int_vector in svm_clear_vintr
  kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
  KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
  KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
  KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state
  KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
  KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  KVM: x86: reset pdptrs_from_userspace when exiting smm
  KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit
  KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
  KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs
  KVM: Clean up benign vcpu->cpu data races when kicking vCPUs
  ...
2021-09-27 13:58:23 -07:00
Zhenzhong Duan
5c49d1850d KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
When updating the host's mask for its MSR_IA32_TSX_CTRL user return entry,
clear the mask in the found uret MSR instead of vmx->guest_uret_msrs[i].
Modifying guest_uret_msrs directly is completely broken as 'i' does not
point at the MSR_IA32_TSX_CTRL entry.  In fact, it's guaranteed to be an
out-of-bounds accesses as is always set to kvm_nr_uret_msrs in a prior
loop. By sheer dumb luck, the fallout is limited to "only" failing to
preserve the host's TSX_CTRL_CPUID_CLEAR.  The out-of-bounds access is
benign as it's guaranteed to clear a bit in a guest MSR value, which are
always zero at vCPU creation on both x86-64 and i386.

Cc: stable@vger.kernel.org
Fixes: 8ea8b8d6f8 ("KVM: VMX: Use common x86's uret MSR list as the one true list")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210926015545.281083-1-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-27 11:25:40 -04:00
Randy Dunlap
9523b33cc3 NIOS2: setup.c: drop unused variable 'dram_start'
This is a nuisance when CONFIG_WERROR is set, so drop the variable
declaration since the code that used it was removed.

../arch/nios2/kernel/setup.c: In function 'setup_arch':
../arch/nios2/kernel/setup.c:152:13: warning: unused variable 'dram_start' [-Wunused-variable]
  152 |         int dram_start;

Fixes: 7f7bc20bc4 ("nios2: Don't use _end for calculating min_low_pfn")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andreas Oetken <andreas.oetken@siemens.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2021-09-27 08:34:54 -05:00
Linus Torvalds
5bb7b2107f Merge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Prevent sending the wrong signal when protection keys are enabled
     and the kernel handles a fault in the vsyscall emulation.

   - Invoke early_reserve_memory() before invoking e820_memory_setup()
     which is required to make the Xen dom0 e820 hooks work correctly.

   - Use the correct data type for the SETZ operand in the EMQCMDS
     instruction wrapper.

   - Prevent undefined behaviour to the potential unaligned accesss in
     the instruction decoder library"

* tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
  x86/asm: Fix SETZ size enqcmds() build failure
  x86/setup: Call early_reserve_memory() earlier
  x86/fault: Fix wrong signal when vsyscall fails with pkey
2021-09-26 10:09:20 -07:00
Linus Torvalds
a3b397b4ff Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 patches.

  Subsystems affected by this patch series: xtensa, sh, ocfs2, scripts,
  lib, and mm (memory-failure, kasan, damon, shmem, tools, pagecache,
  debug, and pagemap)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix uninitialized use in overcommit_policy_handler
  mm/memory_failure: fix the missing pte_unmap() call
  kasan: always respect CONFIG_KASAN_STACK
  sh: pgtable-3level: fix cast to pointer from integer of different size
  mm/debug: sync up latest migrate_reason to migrate_reason_names
  mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN
  mm: fs: invalidate bh_lrus for only cold path
  lib/zlib_inflate/inffast: check config in C to avoid unused function warning
  tools/vm/page-types: remove dependency on opt_file for idle page tracking
  scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error
  ocfs2: drop acl cache for directories too
  mm/shmem.c: fix judgment error in shmem_is_huge()
  xtensa: increase size of gcc stack frame check
  mm/damon: don't use strnlen() with known-bogus source length
  kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
  mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-25 16:20:34 -07:00
Linus Torvalds
5739844347 Merge tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "Some minor cleanups and fixes of some theoretical bugs, as well as a
  fix of a bug introduced in 5.15-rc1"

* tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/x86: fix PV trap handling on secondary processors
  xen/balloon: fix balloon kthread freezing
  swiotlb-xen: this is PV-only on x86
  xen/pci-swiotlb: reduce visibility of symbols
  PCI: only build xen-pcifront in PV-enabled environments
  swiotlb-xen: ensure to issue well-formed XENMEM_exchange requests
  Xen/gntdev: don't ignore kernel unmapping error
  xen/x86: drop redundant zeroing from cpu_initialize_context()
2021-09-25 15:37:31 -07:00
Linus Torvalds
2c4e969c38 Merge tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
 "Here are some USB driver fixes and new device ids for 5.15-rc3.

  They include:

   - usb-storage quirk additions

   - usb-serial new device ids

   - usb-serial driver fixes

   - USB roothub registration bugfix to resolve a long-reported issue

   - usb gadget driver fixes for a large number of small things

   - dwc2 driver fixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
  USB: serial: option: add device id for Foxconn T99W265
  USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
  USB: serial: cp210x: add part-number debug printk
  USB: serial: cp210x: fix dropped characters with CP2102
  MAINTAINERS: usb, update Peter Korsgaard's entries
  usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
  usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
  Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
  USB: serial: option: remove duplicate USB device ID
  USB: serial: mos7840: remove duplicated 0xac24 device ID
  arm64: dts: qcom: ipq8074: remove USB tx-fifo-resize property
  usb: gadget: f_uac2: Populate SS descriptors' wBytesPerInterval
  usb: gadget: f_uac2: Add missing companion descriptor for feedback EP
  usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA
  usb: core: hcd: Modularize HCD stop configuration in usb_stop_hcd()
  xhci: Set HCD flag to defer primary roothub registration
  usb: core: hcd: Add support for deferring roothub registration
  usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave
  usb: dwc3: core: balance phy init and exit
  Revert "USB: bcma: Add a check for devm_gpiod_get"
  ...
2021-09-25 10:10:38 -07:00
Geert Uytterhoeven
e8e9f1e632 sh: pgtable-3level: fix cast to pointer from integer of different size
If X2TLB=y (CPU_SHX2=y or CPU_SHX3=y, e.g. migor_defconfig), pgd_t.pgd
is "unsigned long long", causing:

    In file included from arch/sh/include/asm/pgtable.h:13,
		     from include/linux/pgtable.h:6,
		     from include/linux/mm.h:33,
		     from arch/sh/kernel/asm-offsets.c:14:
    arch/sh/include/asm/pgtable-3level.h: In function `pud_pgtable':
    arch/sh/include/asm/pgtable-3level.h:37:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
       37 |  return (pmd_t *)pud_val(pud);
	  |         ^

Fix this by adding an intermediate cast to "unsigned long", which is
basically what the old code did before.

Link: https://lkml.kernel.org/r/2c2eef3c9a2f57e5609100a4864715ccf253d30f.1631713483.git.geert+renesas@glider.be
Fixes: 9cf6fa2458 ("mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Daniel Palmer <daniel@thingy.jp>
Acked-by: Rob Landley <rob@landley.net>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Rich Felker <dalias@libc.org>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24 16:13:35 -07:00
Linus Torvalds
ea1f9163ac Merge tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "Revert a recent commit related to memory management that turned out to
  be problematic (Jia He)"

* tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
2021-09-24 11:17:32 -07:00
Linus Torvalds
1b7eaf5701 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - It turns out that the optimised string routines merged in 5.14 are
   not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
   the end of a string (strcmp, strncmp). Such reading may go across a
   16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
   is enabled, use the generic strcmp/strncmp C implementation.

 - An errata workaround for ThunderX relied on the CPU capabilities
   being enabled in a specific order. This disappeared with the
   automatic generation of the cpucaps.h file (sorted alphabetically).
   Fix it by checking the current CPU only rather than the system-wide
   capability.

 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.

 - kselftests: skip arm64 tests if the required features are missing.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Restore forced disabling of KPTI on ThunderX
  kselftest/arm64: signal: Skip tests if required features are missing
  arm64: Mitigate MTE issues with str{n}cmp()
  arm64: add MTE supported check to thread switching and syscall entry/exit
2021-09-24 11:12:17 -07:00
Randy Dunlap
adfc8f9d2f NIOS2: fix kconfig unmet dependency warning for SERIAL_CORE_CONSOLE
SERIAL_CORE_CONSOLE depends on TTY so EARLY_PRINTK should also
depend on TTY so that it does not select SERIAL_CORE_CONSOLE
inadvertently.

WARNING: unmet direct dependencies detected for SERIAL_CORE_CONSOLE
  Depends on [n]: TTY [=n] && HAS_IOMEM [=y]
  Selected by [y]:
  - EARLY_PRINTK [=y]

Fixes: e8bf5bc776 ("nios2: add early printk support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2021-09-24 08:49:39 -05:00
Christoph Hellwig
9fde034864 m68k: Remove set_fs()
Add a m68k-only set_fc helper to set the SFC and DFC registers for the
few places that need to override it for special MM operations, but
disconnect that from the deprecated kernel-wide set_fs() API.

Note that the SFC/DFC registers are context switched, so there is no need
to disable preemption.

Partially based on an earlier patch from
Linus Torvalds <torvalds@linux-foundation.org>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-7-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:07 +02:00
Christoph Hellwig
8ade833909 m68k: Provide __{get,put}_kernel_nofault
Allow non-faulting access to kernel addresses without overriding the
address space.  Implemented by passing the instruction name to the
low-level assembly macros as an argument, and force the use of the
normal move instructions for kernel access.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-6-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:07 +02:00
Christoph Hellwig
01eec1af5e m68k: Factor the 8-byte lowlevel {get,put}_user code into helpers
Add new helpers for doing the grunt work of the 8-byte {get,put}_user
routines to allow for better reuse.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-5-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Christoph Hellwig
25d2cae4a5 m68k: Use BUILD_BUG for passing invalid sizes to get_user/put_user
Simplify the handling a bit by using the common helper instead of
referencing undefined symbols.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-4-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Christoph Hellwig
c4f607c312 m68k: Remove the 030 case in virt_to_phys_slow
The 030 case in virt_to_phys_slow can't ever be reached, so remove it.

Suggested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-3-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Christoph Hellwig
1dc4027bc8 m68k: Document that access_ok is broken for !CONFIG_CPU_HAS_ADDRESS_SPACES
Document that access_ok is completely broken for coldfire and friends at
the moment.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-2-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Al Viro
0d20abde98 m68k: Leave stack mangling to asm wrapper of sigreturn()
sigreturn has to deal with an unpleasant problem - exception stack frames
have different sizes, depending upon the exception (and processor model, as
well) and variable-sized part of exception frame may contain information
needed for instruction restart.  So when signal handler terminates and calls
sigreturn to resume the execution at the place where we'd been when we caught
the signal, it has to rearrange the frame at the bottom of kernel stack.
Worse, it might need to open a gap in the kernel stack, shifting pt_regs
towards lower addresses.

Doing that from C is insane - we'd need to shift stack frames (return addresses,
local variables, etc.) of C call chain, right under the nose of compiler and
hope it won't fall apart horribly.  What had been actually done is only slightly
less insane - an inline asm in mangle_kernel_stack() moved the stuff around,
then reset stack pointer and jumped to label in asm glue.

However, we can avoid all that mess if the asm wrapper we have to use anyway
would reserve some space on the stack between switch_stack and the C stack
frame of do_{rt_,}sigreturn().   Then C part can simply memmove() pt_regs +
switch_stack, memcpy() the variable part of exception frame into the opened
gap - all of that without inline asm, buggering C call chain, magical jumps
to asm labels, etc.

Asm wrapper would need to know where the moved switch_stack has ended up -
it might have been shifted into the gap we'd reserved before do_rt_sigreturn()
call.  That's where it needs to set the stack pointer to.  So let the C part
return just that and be done with that.

While we are at it, the call of berr_040cleanup() we need to do when
returning via 68040 bus error exception frame can be moved into C part
as well.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/YP2dTQPm1wGPWFgD@zeniv-ca.linux.org.uk
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:03 +02:00
Al Viro
50e43a5733 m68k: Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
We get there when sigreturn has performed obscene acts on kernel stack;
in particular, the location of pt_regs has shifted.  We are about to call
syscall_trace(), which might stop for tracer.  If that happens, we'd better
have task_pt_regs() returning correct result...

Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: bd6f56a75b ("m68k: Missing syscall_trace() on sigreturn")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/YP2dMWeV1LkHiOpr@zeniv-ca.linux.org.uk
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:01 +02:00
Al Viro
4bb0bd81ce m68k: Handle arrivals of multiple signals correctly
When we have several pending signals, have entered with the kernel
with large exception frame *and* have already built at least one
sigframe, regs->stkadj is going to be non-zero and regs->format/sr/pc
are going to be junk - the real values are in shifted exception stack
frame we'd built when putting together the first sigframe.

If that happens, subsequent sigframes are going to be garbage.
Not hard to fix - just need to find the "adjusted" frame first
and look for format/vector/sr/pc in it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/YP2dBIAPTaVvHiZ6@zeniv-ca.linux.org.uk
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:34:46 +02:00
Numfor Mbiziwo-Tiapo
5ba1071f75 x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:

"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."

(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)

These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.

 [ bp: Massage commit message. ]

Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
2021-09-24 12:37:38 +02:00
Paolo Bonzini
50b0781846 Merge tag 'kvmarm-fixes-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/arm64 fixes for 5.15, take #1

- Add missing FORCE target when building the EL2 object
- Fix a PMU probe regression on some platforms
2021-09-24 06:04:42 -04:00
Josh Poimboeuf
0e14ef3866 crypto: x86/sm4 - Fix frame pointer stack corruption
sm4_aesni_avx_crypt8() sets up the frame pointer (which includes pushing
RBP) before doing a conditional sibling call to sm4_aesni_avx_crypt4(),
which sets up an additional frame pointer.  Things will not go well when
sm4_aesni_avx_crypt4() pops only the innermost single frame pointer and
then tries to return to the outermost frame pointer.

Sibling calls need to occur with an empty stack frame.  Do the
conditional sibling call *before* setting up the stack pointer.

This fixes the following warning:

  arch/x86/crypto/sm4-aesni-avx-asm_64.o: warning: objtool: sm4_aesni_avx_crypt8()+0x8: sibling call from callable instruction with modified stack frame

Fixes: a7ee22ee14 ("crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-24 15:58:50 +08:00
Jia He
12064c1768 Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
This reverts commit 437b38c511.

The memory semantics added in commit 437b38c511 causes SystemMemory
Operation region, whose address range is not described in the EFI memory
map to be mapped as NormalNC memory on arm64 platforms (through
acpi_os_map_memory() in acpi_ex_system_memory_space_handler()).

This triggers the following abort on an ARM64 Ampere eMAG machine,
because presumably the physical address range area backing the Opregion
does not support NormalNC memory attributes driven on the bus.

 Internal error: synchronous external abort: 96000410 [#1] SMP
 Modules linked in:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0+ #462
 Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 0.14 02/22/2019
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[...snip...]
 Call trace:
  acpi_ex_system_memory_space_handler+0x26c/0x2c8
  acpi_ev_address_space_dispatch+0x228/0x2c4
  acpi_ex_access_region+0x114/0x268
  acpi_ex_field_datum_io+0x128/0x1b8
  acpi_ex_extract_from_field+0x14c/0x2ac
  acpi_ex_read_data_from_field+0x190/0x1b8
  acpi_ex_resolve_node_to_value+0x1ec/0x288
  acpi_ex_resolve_to_value+0x250/0x274
  acpi_ds_evaluate_name_path+0xac/0x124
  acpi_ds_exec_end_op+0x90/0x410
  acpi_ps_parse_loop+0x4ac/0x5d8
  acpi_ps_parse_aml+0xe0/0x2c8
  acpi_ps_execute_method+0x19c/0x1ac
  acpi_ns_evaluate+0x1f8/0x26c
  acpi_ns_init_one_device+0x104/0x140
  acpi_ns_walk_namespace+0x158/0x1d0
  acpi_ns_initialize_devices+0x194/0x218
  acpi_initialize_objects+0x48/0x50
  acpi_init+0xe0/0x498

If the Opregion address range is not present in the EFI memory map there
is no way for us to determine the memory attributes to use to map it -
defaulting to NormalNC does not work (and it is not correct on a memory
region that may have read side-effects) and therefore commit
437b38c511 should be reverted, which means reverting back to the
original behavior whereby address ranges that are mapped using
acpi_os_map_memory() default to the safe devicenGnRnE attributes on
ARM64 if the mapped address range is not defined in the EFI memory map.

Fixes: 437b38c511 ("ACPI: Add memory semantics to acpi_os_map_memory()")
Signed-off-by: Jia He <justin.he@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-09-23 20:39:36 +02:00
Linus Torvalds
f10f0481a5 Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull rseq fixes from Paolo Bonzini:
 "A fix for a bug with restartable sequences and KVM.

  KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears
  the flag without informing rseq and leads to stale data in userspace's
  rseq struct"

* tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Remove __NR_userfaultfd syscall fallback
  KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
  tools: Move x86 syscall number fallbacks to .../uapi/
  entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
  KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
2021-09-23 11:24:12 -07:00
Linus Torvalds
9bc62afe03 Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Current release - regressions:

   - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()

  Previous releases - regressions:

   - introduce a shutdown method to mdio device drivers, and make DSA
     switch drivers compatible with masters disappearing on shutdown;
     preventing infinite reference wait

   - fix issues in mdiobus users related to ->shutdown vs ->remove

   - virtio-net: fix pages leaking when building skb in big mode

   - xen-netback: correct success/error reporting for the
     SKB-with-fraglist

   - dsa: tear down devlink port regions when tearing down the devlink
     port on error

   - nexthop: fix division by zero while replacing a resilient group

   - hns3: check queue, vf, vlan ids range before using

  Previous releases - always broken:

   - napi: fix race against netpoll causing NAPI getting stuck

   - mlx4_en: ensure link operstate is updated even if link comes up
     before netdev registration

   - bnxt_en: fix TX timeout when TX ring size is set to the smallest

   - enetc: fix illegal access when reading affinity_hint; prevent oops
     on sysfs access

   - mtk_eth_soc: avoid creating duplicate offload entries

  Misc:

   - core: correct the sock::sk_lock.owned lockdep annotations"

* tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
  atlantic: Fix issue in the pm resume flow.
  net/mlx4_en: Don't allow aRFS for encapsulated packets
  net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
  net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
  nfc: st-nci: Add SPI ID matching DT compatible
  MAINTAINERS: remove Guvenc Gulce as net/smc maintainer
  nexthop: Fix memory leaks in nexthop notification chain listeners
  mptcp: ensure tx skbs always have the MPTCP ext
  qed: rdma - don't wait for resources under hw error recovery flow
  s390/qeth: fix deadlock during failing recovery
  s390/qeth: Fix deadlock in remove_discipline
  s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
  net: dsa: realtek: register the MDIO bus under devres
  net: dsa: don't allocate the slave_mii_bus using devres
  Doc: networking: Fox a typo in ice.rst
  net: dsa: fix dsa_tree_setup error path
  net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
  net/smc: add missing error check in smc_clc_prfx_set()
  net: hns3: fix a return value error in hclge_get_reset_status()
  net: hns3: check vlan id before using it
  ...
2021-09-23 10:30:31 -07:00
Lai Jiangshan
65855ed8b0 KVM: X86: Synchronize the shadow pagetable before link it
If gpte is changed from non-present to present, the guest doesn't need
to flush tlb per SDM.  So the host must synchronze sp before
link it.  Otherwise the guest might use a wrong mapping.

For example: the guest first changes a level-1 pagetable, and then
links its parent to a new place where the original gpte is non-present.
Finally the guest can access the remapped area without flushing
the tlb.  The guest's behavior should be allowed per SDM, but the host
kvm mmu makes it wrong.

Fixes: 4731d4c7a0 ("KVM: MMU: out of sync shadow core")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210918005636.3675-3-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 11:01:00 -04:00
dann frazier
22b70e6f2d arm64: Restore forced disabling of KPTI on ThunderX
A noted side-effect of commit 0c6c2d3615 ("arm64: Generate cpucaps.h")
is that cpucaps are now sorted, changing the enumeration order. This
assumed no dependencies between cpucaps, which turned out not to be true
in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
when unmap_kernel_at_el0() checks for it, so the kernel tries to
run w/ KPTI - and quickly falls over.

Because all ThunderX implementations have homogeneous CPUs, we can remove
this dependency by just checking the current CPU for the erratum.

Fixes: 0c6c2d3615 ("arm64: Generate cpucaps.h")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210923145002.3394558-1-dann.frazier@canonical.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-23 15:59:15 +01:00
Lai Jiangshan
f81602958c KVM: X86: Fix missed remote tlb flush in rmap_write_protect()
When kvm->tlbs_dirty > 0, some rmaps might have been deleted
without flushing tlb remotely after kvm_sync_page().  If @gfn
was writable before and it's rmaps was deleted in kvm_sync_page(),
and if the tlb entry is still in a remote running VCPU,  the @gfn
is not safely protected.

To fix the problem, kvm_sync_page() does the remote flush when
needed to avoid the problem.

Fixes: a4ee1ca4a3 ("KVM: MMU: delay flush all tlbs on sync_page path")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210918005636.3675-2-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:28:44 -04:00
Maxim Levitsky
faf6b75562 KVM: x86: nSVM: don't copy virt_ext from vmcb12
These field correspond to features that we don't expose yet to L2

While currently there are no CVE worthy features in this field,
if AMD adds more features to this field, that could allow guest
escapes similar to CVE-2021-3653 and CVE-2021-3656.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-6-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:06:46 -04:00
Maxim Levitsky
d1cba6c922 KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround
GP SVM errata workaround made the #GP handler always emulate
the SVM instructions.

However these instructions #GP in case the operand is not 4K aligned,
but the workaround code didn't check this and we ended up
emulating these instructions anyway.

This is only an emulation accuracy check bug as there is no harm for
KVM to read/write unaligned vmcb images.

Fixes: 82a11e9c6f ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:05:29 -04:00
Maxim Levitsky
aee77e1169 KVM: x86: nSVM: restore int_vector in svm_clear_vintr
In svm_clear_vintr we try to restore the virtual interrupt
injection that might be pending, but we fail to restore
the interrupt vector.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:04:40 -04:00
Kees Cook
d81ff5fe14 x86/asm: Fix SETZ size enqcmds() build failure
When building under GCC 4.9 and 5.5:

  arch/x86/include/asm/special_insns.h: Assembler messages:
  arch/x86/include/asm/special_insns.h:286: Error: operand size mismatch for `setz'

Change the type to "bool" for condition code arguments, as documented.

Fixes: 7f5933f81b ("x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction")
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210910223332.3224851-1-keescook@chromium.org
2021-09-22 19:45:48 +02:00
Fares Mehanna
e1fc1553cd kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
Intel PMU MSRs is in msrs_to_save_all[], so add AMD PMU MSRs to have a
consistent behavior between Intel and AMD when using KVM_GET_MSRS,
KVM_SET_MSRS or KVM_GET_MSR_INDEX_LIST.

We have to add legacy and new MSRs to handle guests running without
X86_FEATURE_PERFCTR_CORE.

Signed-off-by: Fares Mehanna <faresx@amazon.de>
Message-Id: <20210915133951.22389-1-faresx@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 11:30:14 -04:00
Maxim Levitsky
dbab610a5b KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
If L1 had invalid state on VM entry (can happen on SMM transactions
when we enter from real mode, straight to nested guest),

then after we load 'host' state from VMCS12, the state has to become
valid again, but since we load the segment registers with
__vmx_set_segment we weren't always updating emulation_required.

Update emulation_required explicitly at end of load_vmcs12_host_state.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:47:50 -04:00
Maxim Levitsky
c8607e4a08 KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
It is possible that when non root mode is entered via special entry
(!from_vmentry), that is from SMM or from loading the nested state,
the L2 state could be invalid in regard to non unrestricted guest mode,
but later it can become valid.

(for example when RSM emulation restores segment registers from SMRAM)

Thus delay the check to VM entry, where we will check this and fail.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-7-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:47:50 -04:00
Maxim Levitsky
c42dec148b KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state
Since no actual VM entry happened, the VM exit information is stale.
To avoid this, synthesize an invalid VM guest state VM exit.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:47:49 -04:00