Commit Graph

188899 Commits

Author SHA1 Message Date
Johan Almbladh
ebcbacfa50 mips, bpf: Remove old BPF JIT implementations
This patch removes the old 32-bit cBPF and 64-bit eBPF JIT implementations.
They are replaced by a new eBPF implementation that supports both 32-bit
and 64-bit MIPS CPUs.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-8-johan.almbladh@anyfinetworks.com
2021-10-06 12:28:34 -07:00
Johan Almbladh
01bdc58e94 mips, bpf: Enable eBPF JITs
This patch enables the new eBPF JITs for 32-bit and 64-bit MIPS. It also
disables the old cBPF JIT to so cBPF programs are converted to use the
new JIT.

Workarounds for R4000 CPU errata are not implemented by the JIT, so the
JIT is disabled if any of those workarounds are configured.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-7-johan.almbladh@anyfinetworks.com
2021-10-06 12:28:30 -07:00
Johan Almbladh
72570224bb mips, bpf: Add JIT workarounds for CPU errata
This patch adds workarounds for the following CPU errata to the MIPS
eBPF JIT, if enabled in the kernel configuration.

  - R10000 ll/sc weak ordering
  - Loongson-3 ll/sc weak ordering
  - Loongson-2F jump hang

The Loongson-2F nop errata is implemented in uasm, which the JIT uses,
so no additional mitigations are needed for that.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-6-johan.almbladh@anyfinetworks.com
2021-10-06 12:28:25 -07:00
Johan Almbladh
fbc802de6b mips, bpf: Add new eBPF JIT for 64-bit MIPS
This is an implementation on of an eBPF JIT for 64-bit MIPS III-V and
MIPS64r1-r6. It uses the same framework introduced by the 32-bit JIT.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-5-johan.almbladh@anyfinetworks.com
2021-10-06 12:28:20 -07:00
Johan Almbladh
eb63cfcd2e mips, bpf: Add eBPF JIT for 32-bit MIPS
This is an implementation of an eBPF JIT for 32-bit MIPS I-V and MIPS32.
The implementation supports all 32-bit and 64-bit ALU and JMP operations,
including the recently-added atomics. 64-bit div/mod and 64-bit atomics
are implemented using function calls to math64 and atomic64 functions,
respectively. All 32-bit operations are implemented natively by the JIT,
except if the CPU lacks ll/sc instructions.

Register mapping
================
All 64-bit eBPF registers are mapped to native 32-bit MIPS register pairs,
and does not use any stack scratch space for register swapping. This means
that all eBPF register data is kept in CPU registers all the time, and
this simplifies the register management a lot. It also reduces the JIT's
pressure on temporary registers since we do not have to move data around.

Native register pairs are ordered according to CPU endiannes, following
the O32 calling convention for passing 64-bit arguments and return values.
The eBPF return value, arguments and callee-saved registers are mapped to
their native MIPS equivalents.

Since the 32 highest bits in the eBPF FP (frame pointer) register are
always zero, only one general-purpose register is actually needed for the
mapping. The MIPS fp register is used for this purpose. The high bits are
mapped to MIPS register r0. This saves us one CPU register, which is much
needed for temporaries, while still allowing us to treat the R10 (FP)
register just like any other eBPF register in the JIT.

The MIPS gp (global pointer) and at (assembler temporary) registers are
used as internal temporary registers for constant blinding. CPU registers
t6-t9 are used internally by the JIT when constructing more complex 64-bit
operations. This is precisely what is needed - two registers to store an
operand value, and two more as scratch registers when performing the
operation.

The register mapping is shown below.

    R0 - $v1, $v0   return value
    R1 - $a1, $a0   argument 1, passed in registers
    R2 - $a3, $a2   argument 2, passed in registers
    R3 - $t1, $t0   argument 3, passed on stack
    R4 - $t3, $t2   argument 4, passed on stack
    R5 - $t4, $t3   argument 5, passed on stack
    R6 - $s1, $s0   callee-saved
    R7 - $s3, $s2   callee-saved
    R8 - $s5, $s4   callee-saved
    R9 - $s7, $s6   callee-saved
    FP - $r0, $fp   32-bit frame pointer
    AX - $gp, $at   constant-blinding
         $t6 - $t9  unallocated, JIT temporaries

Jump offsets
============
The JIT tries to map all conditional JMP operations to MIPS conditional
PC-relative branches. The MIPS branch offset field is 18 bits, in bytes,
which is equivalent to the eBPF 16-bit instruction offset. However, since
the JIT may emit more than one CPU instruction per eBPF instruction, the
field width may overflow. If that happens, the JIT converts the long
conditional jump to a short PC-relative branch with the condition
inverted, jumping over a long unconditional absolute jmp (j).

This conversion will change the instruction offset mapping used for jumps,
and may in turn result in more branch offset overflows. The JIT therefore
dry-runs the translation until no more branches are converted and the
offsets do not change anymore. There is an upper bound on this of course,
and if the JIT hits that limit, the last two iterations are run with all
branches being converted.

Tail call count
===============
The current tail call count is stored in the 16-byte area of the caller's
stack frame that is reserved for the callee in the o32 ABI. The value is
initialized in the prologue, and propagated to the tail-callee by skipping
the initialization instructions when emitting the tail call.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-4-johan.almbladh@anyfinetworks.com
2021-10-06 12:28:14 -07:00
Johan Almbladh
f7c036c15b mips, uasm: Add workaround for Loongson-2F nop CPU errata
This patch implements a workaround for the Loongson-2F nop in generated,
code, if the existing option CONFIG_CPU_NOP_WORKAROUND is set. Before,
the binutils option -mfix-loongson2f-nop was enabled, but no workaround
was done when emitting MIPS code. Now, the nop pseudo instruction is
emitted as "or ax,ax,zero" instead of the default "sll zero,zero,0". This
is consistent with the workaround implemented by binutils.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://sourceware.org/legacy-ml/binutils/2009-11/msg00387.html
Link: https://lore.kernel.org/bpf/20211005165408.2305108-3-johan.almbladh@anyfinetworks.com
2021-10-06 12:28:09 -07:00
Tony Ambardar
e737547eab mips, uasm: Enable muhu opcode for MIPS R6
Enable the 'muhu' instruction, complementing the existing 'mulu', needed
to implement a MIPS32 BPF JIT.

Also fix a typo in the existing definition of 'dmulu'.

Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-2-johan.almbladh@anyfinetworks.com
2021-10-06 12:28:02 -07:00
Jie Meng
57a610f1c5 bpf, x64: Save bytes for DIV by reducing reg copies
Instead of unconditionally performing push/pop on %rax/%rdx in case of
division/modulo, we can save a few bytes in case of destination register
being either BPF r0 (%rax) or r3 (%rdx) since the result is written in
there anyway.

Also, we do not need to copy the source to %r11 unless the source is either
%rax, %rdx or an immediate.

For example, before the patch:

  22:   push   %rax
  23:   push   %rdx
  24:   mov    %rsi,%r11
  27:   xor    %edx,%edx
  29:   div    %r11
  2c:   mov    %rax,%r11
  2f:   pop    %rdx
  30:   pop    %rax
  31:   mov    %r11,%rax

After:

  22:   push   %rdx
  23:   xor    %edx,%edx
  25:   div    %rsi
  28:   pop    %rdx

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211002035626.2041910-1-jmeng@fb.com
2021-10-06 15:24:36 +02:00
David S. Miller
fb8ece514d sparc: Fix typo.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05 11:34:17 +01:00
Jakub Kicinski
4e9b9de65c arch: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Convert misc arch drivers from memcpy(... ETH_ADDR) to eth_hw_addr_set():

  @@
  expression dev, np;
  @@
  - memcpy(dev->dev_addr, np, ETH_ALEN)
  + eth_hw_addr_set(dev, np)

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:18:25 +01:00
Stephen Rothwell
10d48705d5 fix up for "net: add new socket option SO_RESERVE_MEM"
Some architectures do not include uapi/asm/socket.h

Fixes: 2bb2f5fb21 ("net: add new socket option SO_RESERVE_MEM")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-01 15:00:21 +01:00
David S. Miller
bfaf03935f sparc: add SO_RESERVE_MEM definition.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-01 14:38:10 +01:00
Jakub Kicinski
dd9a887b35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/phy/bcm7xxx.c
  d88fd1b546 ("net: phy: bcm7xxx: Fixed indirect MMD operations")
  f68d08c437 ("net: phy: bcm7xxx: Add EPHY entry for 72165")

net/sched/sch_api.c
  b193e15ac6 ("net: prevent user from passing illegal stab size")
  69508d4333 ("net_sched: Use struct_size() and flex_array_size() helpers")

Both cases trivial - adjacent code additions.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-30 14:49:21 -07:00
Linus Torvalds
4de593fb96 Merge tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from mac80211, netfilter and bpf.

  Current release - regressions:

   - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from
     interrupt

   - mdio: revert mechanical patches which broke handling of optional
     resources

   - dev_addr_list: prevent address duplication

  Previous releases - regressions:

   - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
     (NULL deref)

   - Revert "mac80211: do not use low data rates for data frames with no
     ack flag", fixing broadcast transmissions

   - mac80211: fix use-after-free in CCMP/GCMP RX

   - netfilter: include zone id in tuple hash again, minimize collisions

   - netfilter: nf_tables: unlink table before deleting it (race -> UAF)

   - netfilter: log: work around missing softdep backend module

   - mptcp: don't return sockets in foreign netns

   - sched: flower: protect fl_walk() with rcu (race -> UAF)

   - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup

   - smsc95xx: fix stalled rx after link change

   - enetc: fix the incorrect clearing of IF_MODE bits

   - ipv4: fix rtnexthop len when RTA_FLOW is present

   - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this
     SKU

   - e100: fix length calculation & buffer overrun in ethtool::get_regs

  Previous releases - always broken:

   - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx

   - mac80211: drop frames from invalid MAC address in ad-hoc mode

   - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race
     -> UAF)

   - bpf, x86: Fix bpf mapping of atomic fetch implementation

   - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog

   - netfilter: ip6_tables: zero-initialize fragment offset

   - mhi: fix error path in mhi_net_newlink

   - af_unix: return errno instead of NULL in unix_create1() when over
     the fs.file-max limit

  Misc:

   - bpf: exempt CAP_BPF from checks against bpf_jit_limit

   - netfilter: conntrack: make max chain length random, prevent
     guessing buckets by attackers

   - netfilter: nf_nat_masquerade: make async masq_inet6_event handling
     generic, defer conntrack walk to work queue (prevent hogging RTNL
     lock)"

* tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
  net: stmmac: fix EEE init issue when paired with EEE capable PHYs
  net: dev_addr_list: handle first address in __hw_addr_add_ex
  net: sched: flower: protect fl_walk() with rcu
  net: introduce and use lock_sock_fast_nested()
  net: phy: bcm7xxx: Fixed indirect MMD operations
  net: hns3: disable firmware compatible features when uninstall PF
  net: hns3: fix always enable rx vlan filter problem after selftest
  net: hns3: PF enable promisc for VF when mac table is overflow
  net: hns3: fix show wrong state when add existing uc mac address
  net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
  net: hns3: don't rollback when destroy mqprio fail
  net: hns3: remove tc enable checking
  net: hns3: do not allow call hns3_nic_net_open repeatedly
  ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  net: bridge: mcast: Associate the seqcount with its protecting lock.
  net: mdio-ipq4019: Fix the error for an optional regs resource
  net: hns3: fix hclge_dbg_dump_tm_pg() stack usage
  net: mdio: mscc-miim: Fix the mdio controller
  af_unix: Return errno instead of NULL in unix_create1().
  ...
2021-09-30 14:28:05 -07:00
Linus Torvalds
6e439bbd43 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This contains fixes for a resource leak in ccp as well as stack
  corruption in x86/sm4"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/sm4 - Fix frame pointer stack corruption
  crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd()
2021-09-29 07:37:46 -07:00
Linus Torvalds
e7bd807e8c Merge tag 'm68k-for-v5.15-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull more m68k updates from Geert Uytterhoeven:

 - signal handling fixes

 - removal of set_fs()

[ The set_fs removal isn't strictly a fix, but it's been pending for a
  while and is very welcome. The signal handling fixes resolved an issue
  that was incorrectly attributed to the set_fs changes    - Linus ]

* tag 'm68k-for-v5.15-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Remove set_fs()
  m68k: Provide __{get,put}_kernel_nofault
  m68k: Factor the 8-byte lowlevel {get,put}_user code into helpers
  m68k: Use BUILD_BUG for passing invalid sizes to get_user/put_user
  m68k: Remove the 030 case in virt_to_phys_slow
  m68k: Document that access_ok is broken for !CONFIG_CPU_HAS_ADDRESS_SPACES
  m68k: Leave stack mangling to asm wrapper of sigreturn()
  m68k: Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
  m68k: Handle arrivals of multiple signals correctly
2021-09-28 13:24:43 -07:00
Linus Torvalds
dca50f08a0 Merge tag 'nios2_fixes_for_v5.15_part1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fixes from Dinh Nguyen:

 - Fix build warning for unmet dependency for EARLY_PRINTK

 - Remove unused dram_start() function

* tag 'nios2_fixes_for_v5.15_part1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  NIOS2: setup.c: drop unused variable 'dram_start'
  NIOS2: fix kconfig unmet dependency warning for SERIAL_CORE_CONSOLE
2021-09-28 13:16:52 -07:00
David S. Miller
4ccb9f03fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-28

The following pull-request contains BPF updates for your *net* tree.

We've added 10 non-merge commits during the last 14 day(s) which contain
a total of 11 files changed, 139 insertions(+), 53 deletions(-).

The main changes are:

1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk.

2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh.

3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann.

4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi.

5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer.

6) Fix return value handling for struct_ops BPF programs, from Hou Tao.

7) Various fixes to BPF selftests, from Jiri Benc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
,
2021-09-28 13:52:46 +01:00
Johan Almbladh
ced185824c bpf, x86: Fix bpf mapping of atomic fetch implementation
Fix the case where the dst register maps to %rax as otherwise this produces
an incorrect mapping with the implementation in 981f94c3e9 ("bpf: Add
bitwise atomic instructions") as %rax is clobbered given it's part of the
cmpxchg as operand.

The issue is similar to b29dd96b90 ("bpf, x86: Fix BPF_FETCH atomic and/or/
xor with r0 as src") just that the case of dst register was missed.

Before, dst=r0 (%rax) src=r2 (%rsi):

  [...]
  c5:   mov    %rax,%r10
  c8:   mov    0x0(%rax),%rax       <---+ (broken)
  cc:   mov    %rax,%r11                |
  cf:   and    %rsi,%r11                |
  d2:   lock cmpxchg %r11,0x0(%rax) <---+
  d8:   jne    0x00000000000000c8       |
  da:   mov    %rax,%rsi                |
  dd:   mov    %r10,%rax                |
  [...]                                 |
                                        |
After, dst=r0 (%rax) src=r2 (%rsi):     |
                                        |
  [...]                                 |
  da:	mov    %rax,%r10                |
  dd:	mov    0x0(%r10),%rax       <---+ (fixed)
  e1:	mov    %rax,%r11                |
  e4:	and    %rsi,%r11                |
  e7:	lock cmpxchg %r11,0x0(%r10) <---+
  ed:	jne    0x00000000000000dd
  ef:	mov    %rax,%rsi
  f2:	mov    %r10,%rax
  [...]

The remaining combinations were fine as-is though:

After, dst=r9 (%r15) src=r0 (%rax):

  [...]
  dc:	mov    %rax,%r10
  df:	mov    0x0(%r15),%rax
  e3:	mov    %rax,%r11
  e6:	and    %r10,%r11
  e9:	lock cmpxchg %r11,0x0(%r15)
  ef:	jne    0x00000000000000df      _
  f1:	mov    %rax,%r10                | (unneeded, but
  f4:	mov    %r10,%rax               _|  not a problem)
  [...]

After, dst=r9 (%r15) src=r4 (%rcx):

  [...]
  de:	mov    %rax,%r10
  e1:	mov    0x0(%r15),%rax
  e5:	mov    %rax,%r11
  e8:	and    %rcx,%r11
  eb:	lock cmpxchg %r11,0x0(%r15)
  f1:	jne    0x00000000000000e1
  f3:	mov    %rax,%rcx
  f6:	mov    %r10,%rax
  [...]

The case of dst == src register is rejected by the verifier and
therefore not supported, but x86 JIT also handles this case just
fine.

After, dst=r0 (%rax) src=r0 (%rax):

  [...]
  eb:	mov    %rax,%r10
  ee:	mov    0x0(%r10),%rax
  f2:	mov    %rax,%r11
  f5:	and    %r10,%r11
  f8:	lock cmpxchg %r11,0x0(%r10)
  fe:	jne    0x00000000000000ee
 100:	mov    %rax,%r10
 103:	mov    %r10,%rax
  [...]

Fixes: 981f94c3e9 ("bpf: Add bitwise atomic instructions")
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-09-28 12:10:29 +02:00
Linus Torvalds
9cccec2bf3 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "A bit late... I got sidetracked by back-from-vacation routines and
  conferences. But most of these patches are already a few weeks old and
  things look more calm on the mailing list than what this pull request
  would suggest.

  x86:

   - missing TLB flush

   - nested virtualization fixes for SMM (secure boot on nested
     hypervisor) and other nested SVM fixes

   - syscall fuzzing fixes

   - live migration fix for AMD SEV

   - mirror VMs now work for SEV-ES too

   - fixes for reset

   - possible out-of-bounds access in IOAPIC emulation

   - fix enlightened VMCS on Windows 2022

  ARM:

   - Add missing FORCE target when building the EL2 object

   - Fix a PMU probe regression on some platforms

  Generic:

   - KCSAN fixes

  selftests:

   - random fixes, mostly for clang compilation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  selftests: KVM: Explicitly use movq to read xmm registers
  selftests: KVM: Call ucall_init when setting up in rseq_test
  KVM: Remove tlbs_dirty
  KVM: X86: Synchronize the shadow pagetable before link it
  KVM: X86: Fix missed remote tlb flush in rmap_write_protect()
  KVM: x86: nSVM: don't copy virt_ext from vmcb12
  KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround
  KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
  KVM: x86: nSVM: restore int_vector in svm_clear_vintr
  kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
  KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
  KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
  KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state
  KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
  KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  KVM: x86: reset pdptrs_from_userspace when exiting smm
  KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit
  KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
  KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs
  KVM: Clean up benign vcpu->cpu data races when kicking vCPUs
  ...
2021-09-27 13:58:23 -07:00
Randy Dunlap
9523b33cc3 NIOS2: setup.c: drop unused variable 'dram_start'
This is a nuisance when CONFIG_WERROR is set, so drop the variable
declaration since the code that used it was removed.

../arch/nios2/kernel/setup.c: In function 'setup_arch':
../arch/nios2/kernel/setup.c:152:13: warning: unused variable 'dram_start' [-Wunused-variable]
  152 |         int dram_start;

Fixes: 7f7bc20bc4 ("nios2: Don't use _end for calculating min_low_pfn")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andreas Oetken <andreas.oetken@siemens.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2021-09-27 08:34:54 -05:00
Linus Torvalds
5bb7b2107f Merge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Prevent sending the wrong signal when protection keys are enabled
     and the kernel handles a fault in the vsyscall emulation.

   - Invoke early_reserve_memory() before invoking e820_memory_setup()
     which is required to make the Xen dom0 e820 hooks work correctly.

   - Use the correct data type for the SETZ operand in the EMQCMDS
     instruction wrapper.

   - Prevent undefined behaviour to the potential unaligned accesss in
     the instruction decoder library"

* tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
  x86/asm: Fix SETZ size enqcmds() build failure
  x86/setup: Call early_reserve_memory() earlier
  x86/fault: Fix wrong signal when vsyscall fails with pkey
2021-09-26 10:09:20 -07:00
Linus Torvalds
a3b397b4ff Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 patches.

  Subsystems affected by this patch series: xtensa, sh, ocfs2, scripts,
  lib, and mm (memory-failure, kasan, damon, shmem, tools, pagecache,
  debug, and pagemap)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix uninitialized use in overcommit_policy_handler
  mm/memory_failure: fix the missing pte_unmap() call
  kasan: always respect CONFIG_KASAN_STACK
  sh: pgtable-3level: fix cast to pointer from integer of different size
  mm/debug: sync up latest migrate_reason to migrate_reason_names
  mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN
  mm: fs: invalidate bh_lrus for only cold path
  lib/zlib_inflate/inffast: check config in C to avoid unused function warning
  tools/vm/page-types: remove dependency on opt_file for idle page tracking
  scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error
  ocfs2: drop acl cache for directories too
  mm/shmem.c: fix judgment error in shmem_is_huge()
  xtensa: increase size of gcc stack frame check
  mm/damon: don't use strnlen() with known-bogus source length
  kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
  mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-25 16:20:34 -07:00
Linus Torvalds
5739844347 Merge tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "Some minor cleanups and fixes of some theoretical bugs, as well as a
  fix of a bug introduced in 5.15-rc1"

* tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/x86: fix PV trap handling on secondary processors
  xen/balloon: fix balloon kthread freezing
  swiotlb-xen: this is PV-only on x86
  xen/pci-swiotlb: reduce visibility of symbols
  PCI: only build xen-pcifront in PV-enabled environments
  swiotlb-xen: ensure to issue well-formed XENMEM_exchange requests
  Xen/gntdev: don't ignore kernel unmapping error
  xen/x86: drop redundant zeroing from cpu_initialize_context()
2021-09-25 15:37:31 -07:00
Linus Torvalds
2c4e969c38 Merge tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
 "Here are some USB driver fixes and new device ids for 5.15-rc3.

  They include:

   - usb-storage quirk additions

   - usb-serial new device ids

   - usb-serial driver fixes

   - USB roothub registration bugfix to resolve a long-reported issue

   - usb gadget driver fixes for a large number of small things

   - dwc2 driver fixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
  USB: serial: option: add device id for Foxconn T99W265
  USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
  USB: serial: cp210x: add part-number debug printk
  USB: serial: cp210x: fix dropped characters with CP2102
  MAINTAINERS: usb, update Peter Korsgaard's entries
  usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
  usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
  Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
  USB: serial: option: remove duplicate USB device ID
  USB: serial: mos7840: remove duplicated 0xac24 device ID
  arm64: dts: qcom: ipq8074: remove USB tx-fifo-resize property
  usb: gadget: f_uac2: Populate SS descriptors' wBytesPerInterval
  usb: gadget: f_uac2: Add missing companion descriptor for feedback EP
  usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA
  usb: core: hcd: Modularize HCD stop configuration in usb_stop_hcd()
  xhci: Set HCD flag to defer primary roothub registration
  usb: core: hcd: Add support for deferring roothub registration
  usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave
  usb: dwc3: core: balance phy init and exit
  Revert "USB: bcma: Add a check for devm_gpiod_get"
  ...
2021-09-25 10:10:38 -07:00
Geert Uytterhoeven
e8e9f1e632 sh: pgtable-3level: fix cast to pointer from integer of different size
If X2TLB=y (CPU_SHX2=y or CPU_SHX3=y, e.g. migor_defconfig), pgd_t.pgd
is "unsigned long long", causing:

    In file included from arch/sh/include/asm/pgtable.h:13,
		     from include/linux/pgtable.h:6,
		     from include/linux/mm.h:33,
		     from arch/sh/kernel/asm-offsets.c:14:
    arch/sh/include/asm/pgtable-3level.h: In function `pud_pgtable':
    arch/sh/include/asm/pgtable-3level.h:37:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
       37 |  return (pmd_t *)pud_val(pud);
	  |         ^

Fix this by adding an intermediate cast to "unsigned long", which is
basically what the old code did before.

Link: https://lkml.kernel.org/r/2c2eef3c9a2f57e5609100a4864715ccf253d30f.1631713483.git.geert+renesas@glider.be
Fixes: 9cf6fa2458 ("mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Daniel Palmer <daniel@thingy.jp>
Acked-by: Rob Landley <rob@landley.net>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Rich Felker <dalias@libc.org>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24 16:13:35 -07:00
Linus Torvalds
ea1f9163ac Merge tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "Revert a recent commit related to memory management that turned out to
  be problematic (Jia He)"

* tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
2021-09-24 11:17:32 -07:00
Linus Torvalds
1b7eaf5701 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - It turns out that the optimised string routines merged in 5.14 are
   not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
   the end of a string (strcmp, strncmp). Such reading may go across a
   16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
   is enabled, use the generic strcmp/strncmp C implementation.

 - An errata workaround for ThunderX relied on the CPU capabilities
   being enabled in a specific order. This disappeared with the
   automatic generation of the cpucaps.h file (sorted alphabetically).
   Fix it by checking the current CPU only rather than the system-wide
   capability.

 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.

 - kselftests: skip arm64 tests if the required features are missing.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Restore forced disabling of KPTI on ThunderX
  kselftest/arm64: signal: Skip tests if required features are missing
  arm64: Mitigate MTE issues with str{n}cmp()
  arm64: add MTE supported check to thread switching and syscall entry/exit
2021-09-24 11:12:17 -07:00
Randy Dunlap
adfc8f9d2f NIOS2: fix kconfig unmet dependency warning for SERIAL_CORE_CONSOLE
SERIAL_CORE_CONSOLE depends on TTY so EARLY_PRINTK should also
depend on TTY so that it does not select SERIAL_CORE_CONSOLE
inadvertently.

WARNING: unmet direct dependencies detected for SERIAL_CORE_CONSOLE
  Depends on [n]: TTY [=n] && HAS_IOMEM [=y]
  Selected by [y]:
  - EARLY_PRINTK [=y]

Fixes: e8bf5bc776 ("nios2: add early printk support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2021-09-24 08:49:39 -05:00
Christoph Hellwig
9fde034864 m68k: Remove set_fs()
Add a m68k-only set_fc helper to set the SFC and DFC registers for the
few places that need to override it for special MM operations, but
disconnect that from the deprecated kernel-wide set_fs() API.

Note that the SFC/DFC registers are context switched, so there is no need
to disable preemption.

Partially based on an earlier patch from
Linus Torvalds <torvalds@linux-foundation.org>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-7-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:07 +02:00
Christoph Hellwig
8ade833909 m68k: Provide __{get,put}_kernel_nofault
Allow non-faulting access to kernel addresses without overriding the
address space.  Implemented by passing the instruction name to the
low-level assembly macros as an argument, and force the use of the
normal move instructions for kernel access.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-6-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:07 +02:00
Christoph Hellwig
01eec1af5e m68k: Factor the 8-byte lowlevel {get,put}_user code into helpers
Add new helpers for doing the grunt work of the 8-byte {get,put}_user
routines to allow for better reuse.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-5-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Christoph Hellwig
25d2cae4a5 m68k: Use BUILD_BUG for passing invalid sizes to get_user/put_user
Simplify the handling a bit by using the common helper instead of
referencing undefined symbols.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-4-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Christoph Hellwig
c4f607c312 m68k: Remove the 030 case in virt_to_phys_slow
The 030 case in virt_to_phys_slow can't ever be reached, so remove it.

Suggested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-3-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Christoph Hellwig
1dc4027bc8 m68k: Document that access_ok is broken for !CONFIG_CPU_HAS_ADDRESS_SPACES
Document that access_ok is completely broken for coldfire and friends at
the moment.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20210916070405.52750-2-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:06 +02:00
Al Viro
0d20abde98 m68k: Leave stack mangling to asm wrapper of sigreturn()
sigreturn has to deal with an unpleasant problem - exception stack frames
have different sizes, depending upon the exception (and processor model, as
well) and variable-sized part of exception frame may contain information
needed for instruction restart.  So when signal handler terminates and calls
sigreturn to resume the execution at the place where we'd been when we caught
the signal, it has to rearrange the frame at the bottom of kernel stack.
Worse, it might need to open a gap in the kernel stack, shifting pt_regs
towards lower addresses.

Doing that from C is insane - we'd need to shift stack frames (return addresses,
local variables, etc.) of C call chain, right under the nose of compiler and
hope it won't fall apart horribly.  What had been actually done is only slightly
less insane - an inline asm in mangle_kernel_stack() moved the stuff around,
then reset stack pointer and jumped to label in asm glue.

However, we can avoid all that mess if the asm wrapper we have to use anyway
would reserve some space on the stack between switch_stack and the C stack
frame of do_{rt_,}sigreturn().   Then C part can simply memmove() pt_regs +
switch_stack, memcpy() the variable part of exception frame into the opened
gap - all of that without inline asm, buggering C call chain, magical jumps
to asm labels, etc.

Asm wrapper would need to know where the moved switch_stack has ended up -
it might have been shifted into the gap we'd reserved before do_rt_sigreturn()
call.  That's where it needs to set the stack pointer to.  So let the C part
return just that and be done with that.

While we are at it, the call of berr_040cleanup() we need to do when
returning via 68040 bus error exception frame can be moved into C part
as well.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/YP2dTQPm1wGPWFgD@zeniv-ca.linux.org.uk
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:03 +02:00
Al Viro
50e43a5733 m68k: Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
We get there when sigreturn has performed obscene acts on kernel stack;
in particular, the location of pt_regs has shifted.  We are about to call
syscall_trace(), which might stop for tracer.  If that happens, we'd better
have task_pt_regs() returning correct result...

Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: bd6f56a75b ("m68k: Missing syscall_trace() on sigreturn")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/YP2dMWeV1LkHiOpr@zeniv-ca.linux.org.uk
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:35:01 +02:00
Al Viro
4bb0bd81ce m68k: Handle arrivals of multiple signals correctly
When we have several pending signals, have entered with the kernel
with large exception frame *and* have already built at least one
sigframe, regs->stkadj is going to be non-zero and regs->format/sr/pc
are going to be junk - the real values are in shifted exception stack
frame we'd built when putting together the first sigframe.

If that happens, subsequent sigframes are going to be garbage.
Not hard to fix - just need to find the "adjusted" frame first
and look for format/vector/sr/pc in it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/YP2dBIAPTaVvHiZ6@zeniv-ca.linux.org.uk
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-09-24 13:34:46 +02:00
Numfor Mbiziwo-Tiapo
5ba1071f75 x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:

"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."

(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)

These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.

 [ bp: Massage commit message. ]

Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
2021-09-24 12:37:38 +02:00
Paolo Bonzini
50b0781846 Merge tag 'kvmarm-fixes-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/arm64 fixes for 5.15, take #1

- Add missing FORCE target when building the EL2 object
- Fix a PMU probe regression on some platforms
2021-09-24 06:04:42 -04:00
Josh Poimboeuf
0e14ef3866 crypto: x86/sm4 - Fix frame pointer stack corruption
sm4_aesni_avx_crypt8() sets up the frame pointer (which includes pushing
RBP) before doing a conditional sibling call to sm4_aesni_avx_crypt4(),
which sets up an additional frame pointer.  Things will not go well when
sm4_aesni_avx_crypt4() pops only the innermost single frame pointer and
then tries to return to the outermost frame pointer.

Sibling calls need to occur with an empty stack frame.  Do the
conditional sibling call *before* setting up the stack pointer.

This fixes the following warning:

  arch/x86/crypto/sm4-aesni-avx-asm_64.o: warning: objtool: sm4_aesni_avx_crypt8()+0x8: sibling call from callable instruction with modified stack frame

Fixes: a7ee22ee14 ("crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-24 15:58:50 +08:00
Jia He
12064c1768 Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
This reverts commit 437b38c511.

The memory semantics added in commit 437b38c511 causes SystemMemory
Operation region, whose address range is not described in the EFI memory
map to be mapped as NormalNC memory on arm64 platforms (through
acpi_os_map_memory() in acpi_ex_system_memory_space_handler()).

This triggers the following abort on an ARM64 Ampere eMAG machine,
because presumably the physical address range area backing the Opregion
does not support NormalNC memory attributes driven on the bus.

 Internal error: synchronous external abort: 96000410 [#1] SMP
 Modules linked in:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0+ #462
 Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 0.14 02/22/2019
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[...snip...]
 Call trace:
  acpi_ex_system_memory_space_handler+0x26c/0x2c8
  acpi_ev_address_space_dispatch+0x228/0x2c4
  acpi_ex_access_region+0x114/0x268
  acpi_ex_field_datum_io+0x128/0x1b8
  acpi_ex_extract_from_field+0x14c/0x2ac
  acpi_ex_read_data_from_field+0x190/0x1b8
  acpi_ex_resolve_node_to_value+0x1ec/0x288
  acpi_ex_resolve_to_value+0x250/0x274
  acpi_ds_evaluate_name_path+0xac/0x124
  acpi_ds_exec_end_op+0x90/0x410
  acpi_ps_parse_loop+0x4ac/0x5d8
  acpi_ps_parse_aml+0xe0/0x2c8
  acpi_ps_execute_method+0x19c/0x1ac
  acpi_ns_evaluate+0x1f8/0x26c
  acpi_ns_init_one_device+0x104/0x140
  acpi_ns_walk_namespace+0x158/0x1d0
  acpi_ns_initialize_devices+0x194/0x218
  acpi_initialize_objects+0x48/0x50
  acpi_init+0xe0/0x498

If the Opregion address range is not present in the EFI memory map there
is no way for us to determine the memory attributes to use to map it -
defaulting to NormalNC does not work (and it is not correct on a memory
region that may have read side-effects) and therefore commit
437b38c511 should be reverted, which means reverting back to the
original behavior whereby address ranges that are mapped using
acpi_os_map_memory() default to the safe devicenGnRnE attributes on
ARM64 if the mapped address range is not defined in the EFI memory map.

Fixes: 437b38c511 ("ACPI: Add memory semantics to acpi_os_map_memory()")
Signed-off-by: Jia He <justin.he@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-09-23 20:39:36 +02:00
Linus Torvalds
f10f0481a5 Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull rseq fixes from Paolo Bonzini:
 "A fix for a bug with restartable sequences and KVM.

  KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears
  the flag without informing rseq and leads to stale data in userspace's
  rseq struct"

* tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Remove __NR_userfaultfd syscall fallback
  KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
  tools: Move x86 syscall number fallbacks to .../uapi/
  entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
  KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
2021-09-23 11:24:12 -07:00
Jakub Kicinski
2fcd14d0f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/mptcp/protocol.c
  977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
  efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")

same patch merged in both trees, keep net-next.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-23 11:19:49 -07:00
Linus Torvalds
9bc62afe03 Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Current release - regressions:

   - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()

  Previous releases - regressions:

   - introduce a shutdown method to mdio device drivers, and make DSA
     switch drivers compatible with masters disappearing on shutdown;
     preventing infinite reference wait

   - fix issues in mdiobus users related to ->shutdown vs ->remove

   - virtio-net: fix pages leaking when building skb in big mode

   - xen-netback: correct success/error reporting for the
     SKB-with-fraglist

   - dsa: tear down devlink port regions when tearing down the devlink
     port on error

   - nexthop: fix division by zero while replacing a resilient group

   - hns3: check queue, vf, vlan ids range before using

  Previous releases - always broken:

   - napi: fix race against netpoll causing NAPI getting stuck

   - mlx4_en: ensure link operstate is updated even if link comes up
     before netdev registration

   - bnxt_en: fix TX timeout when TX ring size is set to the smallest

   - enetc: fix illegal access when reading affinity_hint; prevent oops
     on sysfs access

   - mtk_eth_soc: avoid creating duplicate offload entries

  Misc:

   - core: correct the sock::sk_lock.owned lockdep annotations"

* tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
  atlantic: Fix issue in the pm resume flow.
  net/mlx4_en: Don't allow aRFS for encapsulated packets
  net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
  net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
  nfc: st-nci: Add SPI ID matching DT compatible
  MAINTAINERS: remove Guvenc Gulce as net/smc maintainer
  nexthop: Fix memory leaks in nexthop notification chain listeners
  mptcp: ensure tx skbs always have the MPTCP ext
  qed: rdma - don't wait for resources under hw error recovery flow
  s390/qeth: fix deadlock during failing recovery
  s390/qeth: Fix deadlock in remove_discipline
  s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
  net: dsa: realtek: register the MDIO bus under devres
  net: dsa: don't allocate the slave_mii_bus using devres
  Doc: networking: Fox a typo in ice.rst
  net: dsa: fix dsa_tree_setup error path
  net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
  net/smc: add missing error check in smc_clc_prfx_set()
  net: hns3: fix a return value error in hclge_get_reset_status()
  net: hns3: check vlan id before using it
  ...
2021-09-23 10:30:31 -07:00
Lai Jiangshan
65855ed8b0 KVM: X86: Synchronize the shadow pagetable before link it
If gpte is changed from non-present to present, the guest doesn't need
to flush tlb per SDM.  So the host must synchronze sp before
link it.  Otherwise the guest might use a wrong mapping.

For example: the guest first changes a level-1 pagetable, and then
links its parent to a new place where the original gpte is non-present.
Finally the guest can access the remapped area without flushing
the tlb.  The guest's behavior should be allowed per SDM, but the host
kvm mmu makes it wrong.

Fixes: 4731d4c7a0 ("KVM: MMU: out of sync shadow core")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210918005636.3675-3-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 11:01:00 -04:00
dann frazier
22b70e6f2d arm64: Restore forced disabling of KPTI on ThunderX
A noted side-effect of commit 0c6c2d3615 ("arm64: Generate cpucaps.h")
is that cpucaps are now sorted, changing the enumeration order. This
assumed no dependencies between cpucaps, which turned out not to be true
in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
when unmap_kernel_at_el0() checks for it, so the kernel tries to
run w/ KPTI - and quickly falls over.

Because all ThunderX implementations have homogeneous CPUs, we can remove
this dependency by just checking the current CPU for the erratum.

Fixes: 0c6c2d3615 ("arm64: Generate cpucaps.h")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210923145002.3394558-1-dann.frazier@canonical.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-23 15:59:15 +01:00
Lai Jiangshan
f81602958c KVM: X86: Fix missed remote tlb flush in rmap_write_protect()
When kvm->tlbs_dirty > 0, some rmaps might have been deleted
without flushing tlb remotely after kvm_sync_page().  If @gfn
was writable before and it's rmaps was deleted in kvm_sync_page(),
and if the tlb entry is still in a remote running VCPU,  the @gfn
is not safely protected.

To fix the problem, kvm_sync_page() does the remote flush when
needed to avoid the problem.

Fixes: a4ee1ca4a3 ("KVM: MMU: delay flush all tlbs on sync_page path")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210918005636.3675-2-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:28:44 -04:00
Maxim Levitsky
faf6b75562 KVM: x86: nSVM: don't copy virt_ext from vmcb12
These field correspond to features that we don't expose yet to L2

While currently there are no CVE worthy features in this field,
if AMD adds more features to this field, that could allow guest
escapes similar to CVE-2021-3653 and CVE-2021-3656.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-6-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:06:46 -04:00
Maxim Levitsky
d1cba6c922 KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround
GP SVM errata workaround made the #GP handler always emulate
the SVM instructions.

However these instructions #GP in case the operand is not 4K aligned,
but the workaround code didn't check this and we ended up
emulating these instructions anyway.

This is only an emulation accuracy check bug as there is no harm for
KVM to read/write unaligned vmcb images.

Fixes: 82a11e9c6f ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:05:29 -04:00