Commit Graph

205068 Commits

Author SHA1 Message Date
Linus Torvalds
9e482602c5 Merge tag 'x86_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:

 - Prevent the compiler from reordering accesses to debug regs which
   could cause a #VC exception in SEV-ES guests at the wrong place in
   the NMI handling path

* tag 'x86_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/debug: Fix stack recursion caused by wrongly ordered DR7 accesses
2023-02-05 11:28:42 -08:00
Linus Torvalds
837c07cf68 Merge tag 'powerpc-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "It's a bit of a big batch for rc6, but just because I didn't send any
  fixes the last week or two while I was on vacation, next week should
  be quieter:

   - Fix a few objtool warnings since we recently enabled objtool.

   - Fix a deadlock with the hash MMU vs perf record.

   - Fix perf profiling of asynchronous interrupt handlers.

   - Revert the IMC PMU nest_init_lock to being a mutex.

   - Two commits fixing problems with the kexec_file FDT size
     estimation.

   - Two commits fixing problems with strict RWX vs kernels running at
     non-zero.

   - Reconnect tlb_flush() to hash__tlb_flush()

  Thanks to Kajol Jain, Nicholas Piggin, Sachin Sant Sathvika Vasireddy,
  and Sourabh Jain"

* tag 'powerpc-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()
  powerpc/kexec_file: Count hot-pluggable memory in FDT estimate
  powerpc/64s/radix: Fix RWX mapping with relocated kernel
  powerpc/64s/radix: Fix crash with unaligned relocated kernel
  powerpc/kexec_file: Fix division by zero in extra size estimation
  powerpc/imc-pmu: Revert nest_init_lock to being a mutex
  powerpc/64: Fix perf profiling asynchronous interrupt handlers
  powerpc/64s: Fix local irq disable when PMIs are disabled
  powerpc/kvm: Fix unannotated intra-function call warning
  powerpc/85xx: Fix unannotated intra-function call warning
2023-02-04 18:40:51 -08:00
Linus Torvalds
c00f4ddde0 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Yet another fix for non-CPU accesses to the memory backing the
     VGICv3 subsystem

   - A set of fixes for the setlftest checking for the S1PTW behaviour
     after the fix that went in ealier in the cycle"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: aarch64: Test read-only PT memory regions
  KVM: selftests: aarch64: Fix check of dirty log PT write
  KVM: selftests: aarch64: Do not default to dirty PTE pages on all S1PTWs
  KVM: selftests: aarch64: Relax userfaultfd read vs. write checks
  KVM: arm64: Allow no running vcpu on saving vgic3 pending table
  KVM: arm64: Allow no running vcpu on restoring vgic3 LPI pending status
  KVM: arm64: Add helper vgic_write_guest_lock()
2023-02-04 11:21:27 -08:00
Linus Torvalds
2ab2ba494d Merge tag 'parisc-for-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:

 - Fix PTRACE_GETREGS/PTRACE_SETREGS for 32-bit userspace on a 64-bit
   kernel

 - pdc_iodc_print() dropped chars for newline in strings

 - Drop constants in favour of PRIV_USER

 - use safer strscpy() function in pdc_stable driver

* tag 'parisc-for-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Wire up PTRACE_GETREGS/PTRACE_SETREGS for compat case
  parisc: Replace hardcoded value with PRIV_USER constant in ptrace.c
  parisc: Fix return code of pdc_iodc_print()
  parisc: pdc_stable: use strscpy() to instead of strncpy()
2023-02-04 11:15:00 -08:00
Paolo Bonzini
25b72cf7da Merge tag 'kvmarm-fixes-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.2, take #3

- Yet another fix for non-CPU accesses to the memory backing
  the VGICv3 subsystem

- A set of fixes for the setlftest checking for the S1PTW
  behaviour after the fix that went in ealier in the cycle
2023-02-04 08:57:43 -05:00
Linus Torvalds
a30df1ea94 Merge tag 'riscv-for-linus-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - A build fix to avoid static branches in cpu_relax(), which greatly
   inflates the jump tables and breaks at least
   CONFIG_CC_OPTIMIZE_FOR_SIZE=y.

 - A fix for a kernel panic when probing impossible instruction
   positions.

 - A fix to disable unwind tables, which are enabled by default for
   GCC-13 and result in unhandled relocations in modules.

* tag 'riscv-for-linus-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: disable generation of unwind tables
  riscv: kprobe: Fixup kernel panic when probing an illegal position
  riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y
2023-02-03 10:18:39 -08:00
Linus Torvalds
0c272a1d33 Merge tag 'mm-hotfixes-stable-2023-02-02-19-24-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "25 hotfixes, mainly for MM.  13 are cc:stable"

* tag 'mm-hotfixes-stable-2023-02-02-19-24-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits)
  mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
  Kconfig.debug: fix the help description in SCHED_DEBUG
  mm/swapfile: add cond_resched() in get_swap_pages()
  mm: use stack_depot_early_init for kmemleak
  Squashfs: fix handling and sanity checking of xattr_ids count
  sh: define RUNTIME_DISCARD_EXIT
  highmem: round down the address passed to kunmap_flush_on_unmap()
  migrate: hugetlb: check for hugetlb shared PMD in node migration
  mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
  mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
  Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
  freevxfs: Kconfig: fix spelling
  maple_tree: should get pivots boundary by type
  .mailmap: update e-mail address for Eugen Hristev
  mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
  squashfs: harden sanity check in squashfs_read_xattr_id_table
  ia64: fix build error due to switch case label appearing next to declaration
  mm: multi-gen LRU: fix crash during cgroup migration
  Revert "mm: add nodes= arg to memory.reclaim"
  zsmalloc: fix a race with deferred_handles storing
  ...
2023-02-03 10:01:57 -08:00
Linus Torvalds
42c78a5b29 Merge tag 'soc-fixes-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "The majority of bugfixes is once more for the NXP i.MX platform,
  addressing issue with i.MX8M (UART, watchdog and ethernet) as well as
  imx8dxl power button and the USB modem on an imx7 board.

  The reason that i.MX always shows up here is obviously not that they
  are more buggy than the others, but they have the most boards and are
  good about getting fixes in quickly.

  The other DT fixes are for the Nuvoton wpcm450 flash controller and
  the i2c mux on an ASpeed board.

  Lastly, there are updates to the MAINTAINERS entries for Mediatek,
  AMD/Seattle and NXP SoCs, as well as a lone code fix for error
  handling in the allwinner 'rsb' bus driver"

* tag 'soc-fixes-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: dts: wpcm450: Add nuvoton,shm = <&shm> to FIU node
  MAINTAINERS: Update entry for MediaTek SoC support
  MAINTAINERS: amd: drop inactive Brijesh Singh
  ARM: dts: imx7d-smegw01: Fix USB host over-current polarity
  arm64: dts: imx8mm-verdin: Do not power down eth-phy
  MAINTAINERS: match freescale ARM64 DT directory in i.MX entry
  arm64: dts: imx8mm: Fix pad control for UART1_DTE_RX
  ARM: dts: aspeed: Fix pca9849 compatible
  arm64: dts: freescale: imx8dxl: fix sc_pwrkey's property name linux,keycode
  arm64: dts: imx8m-venice: Remove incorrect 'uart-has-rtscts'
  arm64: dts: imx8mm: Reinstate GPIO watchdog always-running property on eDM SBC
  bus: sunxi-rsb: Fix error handling in sunxi_rsb_init()
2023-02-02 13:02:45 -08:00
Linus Torvalds
addfba11b3 Merge tag 's390-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:

 - With CONFIG_VMAP_STACK enabled it is not possible to load the s390
   specific diag288_wdt watchdog module. The reason is that a pointer to
   a string is passed to an inline assembly; this string however is
   located on the stack, while the instruction within the inline
   assembly expects a physicial address. Fix this by copying the string
   to a kmalloc'ed buffer.

 - The diag288_wdt watchdog module does not indicate that it accesses
   memory from an inline assembly, which it does. Add "memory" to the
   clobber list to prevent the compiler from optimizing code incorrectly
   away.

 - Pass size of the uncompressed kernel image to __decompress() call.
   Otherwise the kernel image decompressor may corrupt/overwrite an
   initrd. This was reported to happen on s390 after commit 2aa14b1ab2
   ("zstd: import usptream v1.5.2").

* tag 's390-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/decompressor: specify __decompress() buf len to avoid overflow
  watchdog: diag288_wdt: fix __diag288() inline assembly
  watchdog: diag288_wdt: do not use stack buffers for hardware data
2023-02-02 12:52:47 -08:00
Andreas Schwab
2f394c0e7d riscv: disable generation of unwind tables
GCC 13 will enable -fasynchronous-unwind-tables by default on riscv.  In
the kernel, we don't have any use for unwind tables yet, so disable them.
More importantly, the .eh_frame section brings relocations
(R_RISC_32_PCREL, R_RISCV_SET{6,8,16}, R_RISCV_SUB{6,8,16}) into modules
that we are not prepared to handle.

Signed-off-by: Andreas Schwab <schwab@suse.de>
Link: https://lore.kernel.org/r/mvmzg9xybqu.fsf@suse.de
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-01 20:51:57 -08:00
Guo Ren
87f48c7ccc riscv: kprobe: Fixup kernel panic when probing an illegal position
The kernel would panic when probed for an illegal position. eg:

(CONFIG_RISCV_ISA_C=n)

echo 'p:hello kernel_clone+0x16 a0=%a0' >> kprobe_events
echo 1 > events/kprobes/hello/enable
cat trace

Kernel panic - not syncing: stack-protector: Kernel stack
is corrupted in: __do_sys_newfstatat+0xb8/0xb8
CPU: 0 PID: 111 Comm: sh Not tainted
6.2.0-rc1-00027-g2d398fe49a4d #490
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff80007268>] dump_backtrace+0x38/0x48
[<ffffffff80c5e83c>] show_stack+0x50/0x68
[<ffffffff80c6da28>] dump_stack_lvl+0x60/0x84
[<ffffffff80c6da6c>] dump_stack+0x20/0x30
[<ffffffff80c5ecf4>] panic+0x160/0x374
[<ffffffff80c6db94>] generic_handle_arch_irq+0x0/0xa8
[<ffffffff802deeb0>] sys_newstat+0x0/0x30
[<ffffffff800158c0>] sys_clone+0x20/0x30
[<ffffffff800039e8>] ret_from_syscall+0x0/0x4
---[ end Kernel panic - not syncing: stack-protector:
Kernel stack is corrupted in: __do_sys_newfstatat+0xb8/0xb8 ]---

That is because the kprobe's ebreak instruction broke the kernel's
original code. The user should guarantee the correction of the probe
position, but it couldn't make the kernel panic.

This patch adds arch_check_kprobe in arch_prepare_kprobe to prevent an
illegal position (Such as the middle of an instruction).

Fixes: c22b0bcb1d ("riscv: Add kprobes supported")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230201040604.3390509-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-01 20:49:55 -08:00
Michael Ellerman
1665c027af powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()
Commit baf1ed24b2 ("powerpc/mm: Remove empty hash__ functions")
removed some empty hash MMU flushing routines, but got a bit overeager
and also removed the call to hash__tlb_flush() from tlb_flush().

In regular use this doesn't lead to any noticable breakage, which is a
little concerning. Presumably there are flushes happening via other
paths such as arch_leave_lazy_mmu_mode(), and/or a bit of luck.

Fix it by reinstating the call to hash__tlb_flush().

Fixes: baf1ed24b2 ("powerpc/mm: Remove empty hash__ functions")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230131111407.806770-1-mpe@ellerman.id.au
2023-02-02 13:25:47 +11:00
Helge Deller
316f1f42b5 parisc: Wire up PTRACE_GETREGS/PTRACE_SETREGS for compat case
Wire up the missing ptrace requests PTRACE_GETREGS, PTRACE_SETREGS,
PTRACE_GETFPREGS and PTRACE_SETFPREGS when running 32-bit applications
on 64-bit kernels.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.7+
2023-02-01 21:42:37 +01:00
Helge Deller
3f0c17809a parisc: Replace hardcoded value with PRIV_USER constant in ptrace.c
Prefer usage of the PRIV_USER constant over the hard-coded value to set
the lowest 2 bits for the userspace privilege.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 5.16+
2023-02-01 21:42:36 +01:00
Jonathan Neuschäfer
5efb648042 ARM: dts: wpcm450: Add nuvoton,shm = <&shm> to FIU node
The Flash Interface Unit (FIU) should have a reference to the Shared
Memory controller (SHM) so that flash access from the host (x86 computer
managed by the WPCM450 BMC) can be blocked during flash access by the
FIU driver.

Fixes: 38abcb0d68 ("ARM: dts: wpcm450: Add FIU SPI controller node")
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Link: https://lore.kernel.org/r/20230129112611.1176517-1-j.neuschaefer@gmx.net
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230201044158.962417-1-joel@jms.id.au
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-02-01 17:10:45 +01:00
Palmer Dabbelt
3c349eacc5 Merge patch "riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y"
This is a single fix, but it conflicts with some recent features.  I'm
merging it on top of the commit it fixes to ease backporting.

* b4-shazam-merge:
  riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y

Link: https://lore.kernel.org/r/20220922060958.44203-1-samuel@sholland.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31 21:55:43 -08:00
Samuel Holland
0b1d60d6dd riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y
commit 8eb060e101 ("arch/riscv: add Zihintpause support") broke
building with CONFIG_CC_OPTIMIZE_FOR_SIZE enabled (gcc 11.1.0):

  CC      arch/riscv/kernel/vdso/vgettimeofday.o
In file included from <command-line>:
./arch/riscv/include/asm/jump_label.h: In function 'cpu_relax':
././include/linux/compiler_types.h:285:33: warning: 'asm' operand 0 probably does not match constraints
  285 | #define asm_volatile_goto(x...) asm goto(x)
      |                                 ^~~
./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto'
   41 |         asm_volatile_goto(
      |         ^~~~~~~~~~~~~~~~~
././include/linux/compiler_types.h:285:33: error: impossible constraint in 'asm'
  285 | #define asm_volatile_goto(x...) asm goto(x)
      |                                 ^~~
./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto'
   41 |         asm_volatile_goto(
      |         ^~~~~~~~~~~~~~~~~
make[1]: *** [scripts/Makefile.build:249: arch/riscv/kernel/vdso/vgettimeofday.o] Error 1
make: *** [arch/riscv/Makefile:128: vdso_prepare] Error 2

Having a static branch in cpu_relax() is problematic because that
function is widely inlined, including in some quite complex functions
like in the VDSO. A quick measurement shows this static branch is
responsible by itself for around 40% of the jump table.

Drop the static branch, which ends up being the same number of
instructions anyway. If Zihintpause is supported, we trade the nop from
the static branch for a div. If Zihintpause is unsupported, we trade the
jump from the static branch for (what gets interpreted as) a nop.

Fixes: 8eb060e101 ("arch/riscv: add Zihintpause support")
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31 21:55:27 -08:00
Sourabh Jain
fc546faa55 powerpc/kexec_file: Count hot-pluggable memory in FDT estimate
On Systems where online memory is lesser compared to max memory, the
kexec_file_load system call may fail to load the kdump kernel with the
below errors:

    "Failed to update fdt with linux,drconf-usable-memory property"
    "Error setting up usable-memory property for kdump kernel"

This happens because the size estimation for usable memory properties
for the kdump kernel's FDT is based on the online memory whereas the
usable memory properties include max memory. In short, the hot-pluggable
memory is not accounted for while estimating the size of the usable
memory properties.

The issue is addressed by calculating usable memory property size using
max hotplug address instead of the last online memory address.

Fixes: 2377c92e37 ("powerpc/kexec_file: fix FDT size estimation for kdump kernel")
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230131030615.729894-1-sourabhjain@linux.ibm.com
2023-02-01 13:04:40 +11:00
Tom Saeger
c1c551bebf sh: define RUNTIME_DISCARD_EXIT
sh vmlinux fails to link with GNU ld < 2.40 (likely < 2.36) since
commit 99cb0d917f ("arch: fix broken BuildID for arm64 and riscv").

This is similar to fixes for powerpc and s390:
commit 4b9880dbf3 ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT").
commit a494398bde ("s390: define RUNTIME_DISCARD_EXIT to fix link error
with GNU ld < 2.36").

  $ sh4-linux-gnu-ld --version | head -n1
  GNU ld (GNU Binutils for Debian) 2.35.2

  $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- microdev_defconfig
  $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu-

  `.exit.text' referenced in section `__bug_table' of crypto/algboss.o:
  defined in discarded section `.exit.text' of crypto/algboss.o
  `.exit.text' referenced in section `__bug_table' of
  drivers/char/hw_random/core.o: defined in discarded section
  `.exit.text' of drivers/char/hw_random/core.o
  make[2]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1
  make[1]: *** [Makefile:1252: vmlinux] Error 2

arch/sh/kernel/vmlinux.lds.S keeps EXIT_TEXT:

	/*
	 * .exit.text is discarded at runtime, not link time, to deal with
	 * references from __bug_table
	 */
	.exit.text : AT(ADDR(.exit.text)) { EXIT_TEXT }

However, EXIT_TEXT is thrown away by
DISCARD(include/asm-generic/vmlinux.lds.h) because
sh does not define RUNTIME_DISCARD_EXIT.

GNU ld 2.40 does not have this issue and builds fine.
This corresponds with Masahiro's comments in a494398bde:
"Nathan [Chancellor] also found that binutils
commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this
issue, so we cannot reproduce it with binutils 2.36+, but it is better
to not rely on it."

Link: https://lkml.kernel.org/r/9166a8abdc0f979e50377e61780a4bba1dfa2f52.1674518464.git.tom.saeger@oracle.com
Fixes: 99cb0d917f ("arch: fix broken BuildID for arm64 and riscv")
Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/
Link: https://lore.kernel.org/all/20230123194218.47ssfzhrpnv3xfez@oracle.com/
Signed-off-by: Tom Saeger <tom.saeger@oracle.com>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dennis Gilmore <dennis@ausil.us>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31 16:44:09 -08:00
James Morse
6f28a26134 ia64: fix build error due to switch case label appearing next to declaration
Since commit aa06a9bd85 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to
report ITC frequency"), gcc 10.1.0 fails to build ia64 with the gnomic:
| ../arch/ia64/kernel/sys_ia64.c: In function 'ia64_clock_getres':
| ../arch/ia64/kernel/sys_ia64.c:189:3: error: a label can only be part of a statement and a declaration is not a statement
|   189 |   s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);

This line appears immediately after a case label in a switch.

Move the declarations out of the case, to the top of the function.

Link: https://lkml.kernel.org/r/20230117151632.393836-1-james.morse@arm.com
Fixes: aa06a9bd85 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency")
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Sergei Trofimovich <slyich@gmail.com>
Cc: Émeric Maschino <emeric.maschino@gmail.com>
Cc: matoro <matoro_mailinglist_kernel@matoro.tk>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31 16:44:08 -08:00
Vasily Gorbik
7ab41c2c08 s390/decompressor: specify __decompress() buf len to avoid overflow
Historically calls to __decompress() didn't specify "out_len" parameter
on many architectures including s390, expecting that no writes beyond
uncompressed kernel image are performed. This has changed since commit
2aa14b1ab2 ("zstd: import usptream v1.5.2") which includes zstd library
commit 6a7ede3dfccb ("Reduce size of dctx by reutilizing dst buffer
(#2751)"). Now zstd decompression code might store literal buffer in
the unwritten portion of the destination buffer. Since "out_len" is
not set, it is considered to be unlimited and hence free to use for
optimization needs. On s390 this might corrupt initrd or ipl report
which are often placed right after the decompressor buffer. Luckily the
size of uncompressed kernel image is already known to the decompressor,
so to avoid the problem simply specify it in the "out_len" parameter.

Link: https://github.com/facebook/zstd/commit/6a7ede3dfccb
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Link: https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-here.call-01675030179-ext-9637@work.hours
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31 18:54:21 +01:00
Joerg Roedel
9d2c7203ff x86/debug: Fix stack recursion caused by wrongly ordered DR7 accesses
In kernels compiled with CONFIG_PARAVIRT=n, the compiler re-orders the
DR7 read in exc_nmi() to happen before the call to sev_es_ist_enter().

This is problematic when running as an SEV-ES guest because in this
environment the DR7 read might cause a #VC exception, and taking #VC
exceptions is not safe in exc_nmi() before sev_es_ist_enter() has run.

The result is stack recursion if the NMI was caused on the #VC IST
stack, because a subsequent #VC exception in the NMI handler will
overwrite the stack frame of the interrupted #VC handler.

As there are no compiler barriers affecting the ordering of DR7
reads/writes, make the accesses to this register volatile, forbidding
the compiler to re-order them.

  [ bp: Massage text, make them volatile too, to make sure some
  aggressive compiler optimization pass doesn't discard them. ]

Fixes: 315562c9af ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler")
Reported-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230127035616.508966-1-aik@amd.com
2023-01-31 12:51:19 +01:00
Michael Ellerman
111bcb3738 powerpc/64s/radix: Fix RWX mapping with relocated kernel
If a relocatable kernel is loaded at a non-zero address and told not to
relocate to zero (kdump or RELOCATABLE_TEST), the mapping of the
interrupt code at zero is left with RWX permissions.

That is a security weakness, and leads to a warning at boot if
CONFIG_DEBUG_WX is enabled:

  powerpc/mm: Found insecure W+X mapping at address 00000000056435bc/0xc000000000000000
  WARNING: CPU: 1 PID: 1 at arch/powerpc/mm/ptdump/ptdump.c:193 note_page+0x484/0x4c0
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc1-00001-g8ae8e98aea82-dirty #175
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-dd0dca hv:linux,kvm pSeries
  NIP:  c0000000004a1c34 LR: c0000000004a1c30 CTR: 0000000000000000
  REGS: c000000003503770 TRAP: 0700   Not tainted  (6.2.0-rc1-00001-g8ae8e98aea82-dirty)
  MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24000220  XER: 00000000
  CFAR: c000000000545a58 IRQMASK: 0
  ...
  NIP note_page+0x484/0x4c0
  LR  note_page+0x480/0x4c0
  Call Trace:
    note_page+0x480/0x4c0 (unreliable)
    ptdump_pmd_entry+0xc8/0x100
    walk_pgd_range+0x618/0xab0
    walk_page_range_novma+0x74/0xc0
    ptdump_walk_pgd+0x98/0x170
    ptdump_check_wx+0x94/0x100
    mark_rodata_ro+0x30/0x70
    kernel_init+0x78/0x1a0
    ret_from_kernel_thread+0x5c/0x64

The fix has two parts. Firstly the pages from zero up to the end of
interrupts need to be marked read-only, so that they are left with R-X
permissions. Secondly the mapping logic needs to be taught to ensure
there is a page boundary at the end of the interrupt region, so that the
permission change only applies to the interrupt text, and not the region
following it.

Fixes: c55d7b5e64 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230110124753.1325426-2-mpe@ellerman.id.au
2023-01-31 21:37:39 +11:00
Michael Ellerman
98d0219e04 powerpc/64s/radix: Fix crash with unaligned relocated kernel
If a relocatable kernel is loaded at an address that is not 2MB aligned
and told not to relocate to zero, the kernel can crash due to
mark_rodata_ro() incorrectly changing some read-write data to read-only.

Scenarios where the misalignment can occur are when the kernel is
loaded by kdump or using the RELOCATABLE_TEST config option.

Example crash with the kernel loaded at 5MB:

  Run /sbin/init as init process
  BUG: Unable to handle kernel data access on write at 0xc000000000452000
  Faulting instruction address: 0xc0000000005b6730
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
  NIP:  c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380
  REGS: c000000004503250 TRAP: 0300   Not tainted  (6.2.0-rc1-00011-g349188be4841)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 44288480  XER: 00000000
  CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0
  ...
  NIP memset+0x68/0x104
  LR  zero_user_segments.constprop.0+0xa8/0xf0
  Call Trace:
    ext4_mpage_readpages+0x7f8/0x830
    ext4_readahead+0x48/0x60
    read_pages+0xb8/0x380
    page_cache_ra_unbounded+0x19c/0x250
    filemap_fault+0x58c/0xae0
    __do_fault+0x60/0x100
    __handle_mm_fault+0x1230/0x1a40
    handle_mm_fault+0x120/0x300
    ___do_page_fault+0x20c/0xa80
    do_page_fault+0x30/0xc0
    data_access_common_virt+0x210/0x220

This happens because mark_rodata_ro() tries to change permissions on the
range _stext..__end_rodata, but _stext sits in the middle of the 2MB
page from 4MB to 6MB:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec)

The logic that changes the permissions assumes the linear mapping was
split correctly at boot, so it marks the entire 2MB page read-only. That
leads to the write fault above.

To fix it, the boot time mapping logic needs to consider that if the
kernel is running at a non-zero address then _stext is a boundary where
it must split the mapping.

That leads to the mapping being split correctly, allowing the rodata
permission change to take happen correctly, with no spillover:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages
  radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec)
  radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec)

If the kernel is loaded at a 2MB aligned address, the mapping continues
to use 2MB pages as before:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages

Fixes: c55d7b5e64 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230110124753.1325426-1-mpe@ellerman.id.au
2023-01-31 21:37:39 +11:00
Michael Ellerman
7294194b47 powerpc/kexec_file: Fix division by zero in extra size estimation
In kexec_extra_fdt_size_ppc64() there's logic to estimate how much
extra space will be needed in the device tree for some memory related
properties.

That logic uses the size of RAM divided by drmem_lmb_size() to do the
estimation. However drmem_lmb_size() can be zero if the machine has no
hotpluggable memory configured, which is the case when booting with qemu
and no maxmem=x parameter is passed (the default).

The division by zero is reported by UBSAN, and can also lead to an
overflow and a warning from kvmalloc, and kdump kernel loading fails:

  WARNING: CPU: 0 PID: 133 at mm/util.c:596 kvmalloc_node+0x15c/0x160
  Modules linked in:
  CPU: 0 PID: 133 Comm: kexec Not tainted 6.2.0-rc5-03455-g07358bd97810 #223
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf000005 of:SLOF,git-dd0dca pSeries
  NIP:  c00000000041ff4c LR: c00000000041fe58 CTR: 0000000000000000
  REGS: c0000000096ef750 TRAP: 0700   Not tainted  (6.2.0-rc5-03455-g07358bd97810)
  MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24248242  XER: 2004011e
  CFAR: c00000000041fed0 IRQMASK: 0
  ...
  NIP kvmalloc_node+0x15c/0x160
  LR  kvmalloc_node+0x68/0x160
  Call Trace:
    kvmalloc_node+0x68/0x160 (unreliable)
    of_kexec_alloc_and_setup_fdt+0xb8/0x7d0
    elf64_load+0x25c/0x4a0
    kexec_image_load_default+0x58/0x80
    sys_kexec_file_load+0x5c0/0x920
    system_call_exception+0x128/0x330
    system_call_vectored_common+0x15c/0x2ec

To fix it, skip the calculation if drmem_lmb_size() is zero.

Fixes: 2377c92e37 ("powerpc/kexec_file: fix FDT size estimation for kdump kernel")
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230130014707.541110-1-mpe@ellerman.id.au
2023-01-31 21:37:36 +11:00
Michael Ellerman
ad53db4acb powerpc/imc-pmu: Revert nest_init_lock to being a mutex
The recent commit 76d588dddc ("powerpc/imc-pmu: Fix use of mutex in
IRQs disabled section") fixed warnings (and possible deadlocks) in the
IMC PMU driver by converting the locking to use spinlocks.

It also converted the init-time nest_init_lock to a spinlock, even
though it's not used at runtime in IRQ disabled sections or while
holding other spinlocks.

This leads to warnings such as:

  BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
  preempt_count: 1, expected: 0
  CPU: 7 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc2-14719-gf12cd06109f4-dirty #1
  Hardware name: Mambo,Simulated-System POWER9 0x4e1203 opal:v6.6.6 PowerNV
  Call Trace:
    dump_stack_lvl+0x74/0xa8 (unreliable)
    __might_resched+0x178/0x1a0
    __cpuhp_setup_state+0x64/0x1e0
    init_imc_pmu+0xe48/0x1250
    opal_imc_counters_probe+0x30c/0x6a0
    platform_probe+0x78/0x110
    really_probe+0x104/0x420
    __driver_probe_device+0xb0/0x170
    driver_probe_device+0x58/0x180
    __driver_attach+0xd8/0x250
    bus_for_each_dev+0xb4/0x140
    driver_attach+0x34/0x50
    bus_add_driver+0x1e8/0x2d0
    driver_register+0xb4/0x1c0
    __platform_driver_register+0x38/0x50
    opal_imc_driver_init+0x2c/0x40
    do_one_initcall+0x80/0x360
    kernel_init_freeable+0x310/0x3b8
    kernel_init+0x30/0x1a0
    ret_from_kernel_thread+0x5c/0x64

Fix it by converting nest_init_lock back to a mutex, so that we can call
sleeping functions while holding it. There is no interaction between
nest_init_lock and the runtime spinlocks used by the actual PMU routines.

Fixes: 76d588dddc ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section")
Tested-by: Kajol Jain<kjain@linux.ibm.com>
Reviewed-by: Kajol Jain<kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230130014401.540543-1-mpe@ellerman.id.au
2023-01-31 11:24:17 +11:00
Arnd Bergmann
8a74191c89 Merge tag 'imx-fixes-6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.2, round 2:

- Update MAINTAINERS i.MX entry to match arm64 freescale DTS.
- Drop misused 'uart-has-rtscts' from imx8m-venice boards.
- Fix USB host over-current polarity for imx7d-smegw01 board.
- Fix a typo in i.MX8DXL sc_pwrkey property name.
- Fix GPIO watchdog property for i.MX8MM eDM SBC board.
- Keep Ethernet PHY powered on imx8mm-verdin to avoid kernel crash.
- Fix configuration of i.MX8MM pad UART1_DTE_RX.

* tag 'imx-fixes-6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx7d-smegw01: Fix USB host over-current polarity
  arm64: dts: imx8mm-verdin: Do not power down eth-phy
  MAINTAINERS: match freescale ARM64 DT directory in i.MX entry
  arm64: dts: imx8mm: Fix pad control for UART1_DTE_RX
  arm64: dts: freescale: imx8dxl: fix sc_pwrkey's property name linux,keycode
  arm64: dts: imx8m-venice: Remove incorrect 'uart-has-rtscts'
  arm64: dts: imx8mm: Reinstate GPIO watchdog always-running property on eDM SBC

Link: https://lore.kernel.org/r/20230130003614.GP20713@T480
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-30 17:44:27 +01:00
Nicholas Piggin
c28548012e powerpc/64: Fix perf profiling asynchronous interrupt handlers
Interrupt entry sets the soft mask to IRQS_ALL_DISABLED to match the
hard irq disabled state. So when should_hard_irq_enable() returns true
because we want PMI interrupts in irq handlers, MSR[EE] is enabled but
PMIs just get soft-masked. Fix this by clearing IRQS_PMI_DISABLED before
enabling MSR[EE].

This also tidies some of the warnings, no need to duplicate them in
both should_hard_irq_enable() and do_hard_irq_enable().

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230121100156.2824054-1-npiggin@gmail.com
2023-01-30 20:07:42 +11:00
Nicholas Piggin
bc88ef6632 powerpc/64s: Fix local irq disable when PMIs are disabled
When PMI interrupts are soft-masked, local_irq_save() will clear the PMI
mask bit, allowing PMIs in and causing a race condition. This causes a
deadlock in native_hpte_insert via hash_preload, which depends on PMIs
being disabled since commit 8b91cee5ea ("powerpc/64s/hash: Make hash
faults work in NMI context"). native_hpte_insert calls local_irq_save().
It's possible the lpar hash code is also affected when tracing is
enabled because __trace_hcall_entry() calls local_irq_save().

Fix this by making arch_local_irq_save() _or_ the IRQS_DISABLED bit into
the mask.

This was found with the stress_hpt option with a kbuild workload running
together with `perf record -g`.

Fixes: f442d00480 ("powerpc/64s: Add support to mask perf interrupts and replay them")
Fixes: 8b91cee5ea ("powerpc/64s/hash: Make hash faults work in NMI context")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Just take the fix without the new warning]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230121095352.2823517-1-npiggin@gmail.com
2023-01-30 20:07:18 +11:00
Sathvika Vasireddy
fe6de81b61 powerpc/kvm: Fix unannotated intra-function call warning
objtool throws the following warning:
  arch/powerpc/kvm/booke.o: warning: objtool: kvmppc_fill_pt_regs+0x30:
  unannotated intra-function call

Fix the warning by setting the value of 'nip' using the _THIS_IP_ macro,
without using an assembly bl/mflr sequence to save the instruction
pointer.

Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230128124158.1066251-1-sv@linux.ibm.com
2023-01-30 16:01:04 +11:00
Sathvika Vasireddy
8afffce6aa powerpc/85xx: Fix unannotated intra-function call warning
objtool throws the following warning:
  arch/powerpc/kernel/head_85xx.o: warning: objtool: .head.text+0x1a6c:
  unannotated intra-function call

Fix the warning by annotating KernelSPE symbol with SYM_FUNC_START_LOCAL
and SYM_FUNC_END macros.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230128124138.1066176-1-sv@linux.ibm.com
2023-01-30 15:59:59 +11:00
Linus Torvalds
bc6bc34b10 Merge tag 'x86_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Start checking for -mindirect-branch-cs-prefix clang support too now
   that LLVM 16 will support it

 - Fix a NULL ptr deref when suspending with Xen PV

 - Have a SEV-SNP guest check explicitly for features enabled by the
   hypervisor and fail gracefully if some are unsupported by the guest
   instead of failing in a non-obvious and hard-to-debug way

 - Fix a MSI descriptor leakage under Xen

 - Mark Xen's MSI domain as supporting MSI-X

 - Prevent legacy PIC interrupts from being resent in software by
   marking them level triggered, as they should be, which lead to a NULL
   ptr deref

* tag 'x86_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/build: Move '-mindirect-branch-cs-prefix' out of GCC-only block
  acpi: Fix suspend with Xen PV
  x86/sev: Add SEV-SNP guest feature negotiation support
  x86/pci/xen: Fixup fallout from the PCI/MSI overhaul
  x86/pci/xen: Set MSI_FLAG_PCI_MSIX support in Xen MSI domain
  x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL
2023-01-29 11:17:34 -08:00
Gavin Shan
6028acbe3a KVM: arm64: Allow no running vcpu on saving vgic3 pending table
We don't have a running VCPU context to save vgic3 pending table due
to KVM_DEV_ARM_VGIC_{GRP_CTRL, SAVE_PENDING_TABLES} command on KVM
device "kvm-arm-vgic-v3". The unknown case is caught by kvm-unit-tests.

   # ./kvm-unit-tests/tests/its-pending-migration
   WARNING: CPU: 120 PID: 7973 at arch/arm64/kvm/../../../virt/kvm/kvm_main.c:3325 \
   mark_page_dirty_in_slot+0x60/0xe0
    :
   mark_page_dirty_in_slot+0x60/0xe0
   __kvm_write_guest_page+0xcc/0x100
   kvm_write_guest+0x7c/0xb0
   vgic_v3_save_pending_tables+0x148/0x2a0
   vgic_set_common_attr+0x158/0x240
   vgic_v3_set_attr+0x4c/0x5c
   kvm_device_ioctl+0x100/0x160
   __arm64_sys_ioctl+0xa8/0xf0
   invoke_syscall.constprop.0+0x7c/0xd0
   el0_svc_common.constprop.0+0x144/0x160
   do_el0_svc+0x34/0x60
   el0_svc+0x3c/0x1a0
   el0t_64_sync_handler+0xb4/0x130
   el0t_64_sync+0x178/0x17c

Use vgic_write_guest_lock() to save vgic3 pending table.

Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230126235451.469087-5-gshan@redhat.com
2023-01-29 18:46:11 +00:00
Gavin Shan
2f8b1ad222 KVM: arm64: Allow no running vcpu on restoring vgic3 LPI pending status
We don't have a running VCPU context to restore vgic3 LPI pending status
due to command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} on KVM
device "kvm-arm-vgic-its".

Use vgic_write_guest_lock() to restore vgic3 LPI pending status.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230126235451.469087-4-gshan@redhat.com
2023-01-29 18:46:11 +00:00
Gavin Shan
a23eaf9368 KVM: arm64: Add helper vgic_write_guest_lock()
Currently, the unknown no-running-vcpu sites are reported when a
dirty page is tracked by mark_page_dirty_in_slot(). Until now, the
only known no-running-vcpu site is saving vgic/its tables through
KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on KVM device
"kvm-arm-vgic-its". Unfortunately, there are more unknown sites to
be handled and no-running-vcpu context will be allowed in these
sites: (1) KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} command
on KVM device "kvm-arm-vgic-its" to restore vgic/its tables. The
vgic3 LPI pending status could be restored. (2) Save vgic3 pending
table through KVM_DEV_ARM_{VGIC_GRP_CTRL, VGIC_SAVE_PENDING_TABLES}
command on KVM device "kvm-arm-vgic-v3".

In order to handle those unknown cases, we need a unified helper
vgic_write_guest_lock(). struct vgic_dist::save_its_tables_in_progress
is also renamed to struct vgic_dist::save_tables_in_progress.

No functional change intended.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230126235451.469087-3-gshan@redhat.com
2023-01-29 18:46:11 +00:00
Linus Torvalds
db7c4673bb Merge tag 'riscv-for-linus-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - A few DT bindings fixes to more closely align the ISA string
   requirements between the bindings and the ISA manual.

 - A handful of build error/warning fixes.

 - A fix to move init_cpu_topology() later in the boot flow, so it can
   allocate memory.

 - The IRC channel is now in the MAINTAINERS file, so it's easier to
   find.

* tag 'riscv-for-linus-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Move call to init_cpu_topology() to later initialization stage
  riscv/kprobe: Fix instruction simulation of JALR
  riscv: fix -Wundef warning for CONFIG_RISCV_BOOT_SPINWAIT
  MAINTAINERS: add an IRC entry for RISC-V
  RISC-V: fix compile error from deduplicated __ALTERNATIVE_CFG_2
  dt-bindings: riscv: fix single letter canonical order
  dt-bindings: riscv: fix underscore requirement for multi-letter extensions
2023-01-27 12:52:45 -08:00
Linus Torvalds
e5eb2b22f0 Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:

 - fix nommu assignment build warning

 - fix -Wundef preprocessor warning

 - reduce __thumb2__ definitions for crypto files that require it

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9287/1: Reduce __thumb2__ definition to crypto files that require it
  ARM: 9284/1: include <asm/pgtable.h> from proc-macros.S to fix -Wundef warnings
  ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment
2023-01-27 12:49:00 -08:00
Fabio Estevam
1febf88ef9 ARM: dts: imx7d-smegw01: Fix USB host over-current polarity
Currently, when resetting the USB modem via AT commands, the modem is
no longer re-connected.

This problem is caused by the incorrect description of the USB_OTG2_OC
pad. It should have pull-up enabled, hysteresis enabled and the
property 'over-current-active-low' should be passed.

With this change, the USB modem can be successfully re-connected
after a reset.

Cc: stable@vger.kernel.org
Fixes: 9ac0ae97e3 ("ARM: dts: imx7d-smegw01: Add support for i.MX7D SMEGW01 board")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-01-26 15:38:05 +08:00
Philippe Schenker
39c95d0c35 arm64: dts: imx8mm-verdin: Do not power down eth-phy
Currently if suspending using either freeze or memory state, the fec
driver tries to power down the phy which leads to crash of the kernel
and non-responsible kernel with the following call trace:

[   24.839889 ] Call trace:
[   24.839892 ]  phy_error+0x18/0x60
[   24.839898 ]  kszphy_handle_interrupt+0x6c/0x80
[   24.839903 ]  phy_interrupt+0x20/0x2c
[   24.839909 ]  irq_thread_fn+0x30/0xa0
[   24.839919 ]  irq_thread+0x178/0x2c0
[   24.839925 ]  kthread+0x154/0x160
[   24.839932 ]  ret_from_fork+0x10/0x20

Since there is currently no functionality in the phy subsystem to power
down phys let's just disable the feature of powering-down the ethernet
phy.

Fixes: 6a57f224f7 ("arm64: dts: freescale: add initial support for verdin imx8m mini")
Signed-off-by: Philippe Schenker <philippe.schenker@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-01-26 15:32:14 +08:00
Pierluigi Passaro
47123900f3 arm64: dts: imx8mm: Fix pad control for UART1_DTE_RX
According section
    8.2.5.313 Select Input Register (IOMUXC_UART1_RXD_SELECT_INPUT)
of 
    i.MX 8M Mini Applications Processor Reference Manual, Rev. 3, 11/2020
the required setting for this specific pin configuration is "1"

Signed-off-by: Pierluigi Passaro <pierluigi.p@variscite.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Fixes: c1c9d41319 ("dt-bindings: imx: Add pinctrl binding doc for imx8mm")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-01-26 09:07:19 +08:00
Ley Foon Tan
c1d6105869 riscv: Move call to init_cpu_topology() to later initialization stage
If "capacity-dmips-mhz" is present in a CPU DT node,
topology_parse_cpu_capacity() will fail to allocate memory.  arm64, with
which this code path is shared, does not call
topology_parse_cpu_capacity() until later in boot where memory
allocation is available.  While "capacity-dmips-mhz" is not yet a valid
property on RISC-V, invalid properties should be ignored rather than
cause issues.  Move init_cpu_topology(), which calls
topology_parse_cpu_capacity(), to a later initialization stage, to match
arm64.

As a side effect of this change, RISC-V is "protected" from changes to
core topology code that would work on arm64 where memory allocation is
safe but on RISC-V isn't.

Fixes: 03f11f03db ("RISC-V: Parse cpu topology during boot.")
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
Link: https://lore.kernel.org/r/20230105033705.3946130-1-leyfoon.tan@starfivetech.com
[Palmer: use Conor's commit text]
Link: https://lore.kernel.org/linux-riscv/20230104183033.755668-1-pierre.gondois@arm.com/T/#me592d4c8b9508642954839f0077288a353b0b9b2
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-25 07:20:00 -08:00
Liao Chang
ca0254998b riscv/kprobe: Fix instruction simulation of JALR
Set kprobe at 'jalr 1140(ra)' of vfs_write results in the following
crash:

[   32.092235] Unable to handle kernel access to user memory without uaccess routines at virtual address 00aaaaaad77b1170
[   32.093115] Oops [#1]
[   32.093251] Modules linked in:
[   32.093626] CPU: 0 PID: 135 Comm: ftracetest Not tainted 6.2.0-rc2-00013-gb0aa5e5df0cb-dirty #16
[   32.093985] Hardware name: riscv-virtio,qemu (DT)
[   32.094280] epc : ksys_read+0x88/0xd6
[   32.094855]  ra : ksys_read+0xc0/0xd6
[   32.095016] epc : ffffffff801cda80 ra : ffffffff801cdab8 sp : ff20000000d7bdc0
[   32.095227]  gp : ffffffff80f14000 tp : ff60000080f9cb40 t0 : ffffffff80f13e80
[   32.095500]  t1 : ffffffff8000c29c t2 : ffffffff800dbc54 s0 : ff20000000d7be60
[   32.095716]  s1 : 0000000000000000 a0 : ffffffff805a64ae a1 : ffffffff80a83708
[   32.095921]  a2 : ffffffff80f160a0 a3 : 0000000000000000 a4 : f229b0afdb165300
[   32.096171]  a5 : f229b0afdb165300 a6 : ffffffff80eeebd0 a7 : 00000000000003ff
[   32.096411]  s2 : ff6000007ff76800 s3 : fffffffffffffff7 s4 : 00aaaaaad77b1170
[   32.096638]  s5 : ffffffff80f160a0 s6 : ff6000007ff76800 s7 : 0000000000000030
[   32.096865]  s8 : 00ffffffc3d97be0 s9 : 0000000000000007 s10: 00aaaaaad77c9410
[   32.097092]  s11: 0000000000000000 t3 : ffffffff80f13e48 t4 : ffffffff8000c29c
[   32.097317]  t5 : ffffffff8000c29c t6 : ffffffff800dbc54
[   32.097505] status: 0000000200000120 badaddr: 00aaaaaad77b1170 cause: 000000000000000d
[   32.098011] [<ffffffff801cdb72>] ksys_write+0x6c/0xd6
[   32.098222] [<ffffffff801cdc06>] sys_write+0x2a/0x38
[   32.098405] [<ffffffff80003c76>] ret_from_syscall+0x0/0x2

Since the rs1 and rd might be the same one, such as 'jalr 1140(ra)',
hence it requires obtaining the target address from rs1 followed by
updating rd.

Fixes: c22b0bcb1d ("riscv: Add kprobes supported")
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20230116064342.2092136-1-liaochang1@huawei.com
[Palmer: Pick Guo's cleanup]
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-24 21:38:19 -08:00
Linus Torvalds
b2f317173e Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Pass the correct address to mte_clear_page_tags() on initialising a
     tagged page

   - Plug a race against a GICv4.1 doorbell interrupt while saving the
     vgic-v3 pending state.

  x86:

   - A command line parsing fix and a clang compilation fix for
     selftests

   - A fix for a longstanding VMX issue, that surprisingly was only
     found now to affect real world guests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Make reclaim_period_ms input always be positive
  KVM: x86/vmx: Do not skip segment attributes if unusable bit is set
  selftests: kvm: move declaration at the beginning of main()
  KVM: arm64: GICv4.1: Fix race with doorbell on VPE activation/deactivation
  KVM: arm64: Pass the actual page address to mte_clear_page_tags()
2023-01-24 17:48:09 -08:00
Eddie James
d9b6c322fd ARM: dts: aspeed: Fix pca9849 compatible
Missed a digit in the PCA9849 compatible string.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Fixes: 65b697e5de ("ARM: dts: aspeed: Add IBM Bonnell system BMC devicetree")
Link: https://lore.kernel.org/r/20220826194457.164492-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230118051736.246714-1-joel@jms.id.au
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-24 22:07:58 +01:00
Linus Torvalds
9946f0981f Merge tag 'efi-fixes-for-v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
 "Another couple of EFI fixes, of which the first two were already in
  -next when I sent out the previous PR, but they caused some issues on
  non-EFI boots so I let them simmer for a bit longer.

   - ensure the EFI ResetSystem and ACPI PRM calls are recognized as
     users of the EFI runtime, and therefore protected against
     exceptions

   - account for the EFI runtime stack in the stacktrace code

   - remove Matthew Garrett's MAINTAINERS entry for efivarfs"

* tag 'efi-fixes-for-v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: Remove Matthew Garrett as efivarfs maintainer
  arm64: efi: Account for the EFI runtime stack in stack unwinder
  arm64: efi: Avoid workqueue to check whether EFI runtime is live
2023-01-23 11:46:19 -08:00
Nathan Chancellor
2f62847cf6 ARM: 9287/1: Reduce __thumb2__ definition to crypto files that require it
Commit 1d2e9b67b0 ("ARM: 9265/1: pass -march= only to compiler") added
a __thumb2__ define to ASFLAGS to avoid build errors in the crypto code,
which relies on __thumb2__ for preprocessing. Commit 59e2cf8d21 ("ARM:
9275/1: Drop '-mthumb' from AFLAGS_ISA") followed up on this by removing
-mthumb from AFLAGS so that __thumb2__ would not be defined when the
default target was ARMv7 or newer.

Unfortunately, the second commit's fix assumes that the toolchain
defaults to -mno-thumb / -marm, which is not the case for Debian's
arm-linux-gnueabihf target, which defaults to -mthumb:

  $ echo | arm-linux-gnueabihf-gcc -dM -E - | grep __thumb
  #define __thumb2__ 1
  #define __thumb__ 1

This target is used by several CI systems, which will still see
redefined macro warnings, despite '-mthumb' not being present in the
flags:

  <command-line>: warning: "__thumb2__" redefined
  <built-in>: note: this is the location of the previous definition

Remove the global AFLAGS __thumb2__ define and move it to the crypto
folder where it is required by the imported OpenSSL algorithms; the rest
of the kernel should use the internal CONFIG_THUMB2_KERNEL symbol to
know whether or not Thumb2 is being used or not. Be sure that __thumb2__
is undefined first so that there are no macro redefinition warnings.

Link: https://github.com/ClangBuiltLinux/linux/issues/1772

Reported-by: "kernelci.org bot" <bot@kernelci.org>
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Fixes: 59e2cf8d21 ("ARM: 9275/1: Drop '-mthumb' from AFLAGS_ISA")
Fixes: 1d2e9b67b0 ("ARM: 9265/1: pass -march= only to compiler")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-01-23 14:31:18 +00:00
Linus Torvalds
2475bf0250 Merge tag 'sched_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:

 - Make sure the scheduler doesn't use stale frequency scaling values
   when latter get disabled due to a value error

 - Fix a NULL pointer access on UP configs

 - Use the proper locking when updating CPU capacity

* tag 'sched_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/aperfmperf: Erase stale arch_freq_scale values when disabling frequency invariance readings
  sched/core: Fix NULL pointer access fault in sched_setaffinity() with non-SMP configs
  sched/fair: Fixes for capacity inversion detection
  sched/uclamp: Fix a uninitialized variable warnings
2023-01-22 12:14:58 -08:00
Linus Torvalds
2b299a1cd4 Merge tag 'perf_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:

 - Add Emerald Rapids model support to more perf machinery

* tag 'perf_urgent_for_v6.2_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/cstate: Add Emerald Rapids
  perf/x86/intel: Add Emerald Rapids
2023-01-22 12:06:18 -08:00
Nathan Chancellor
27b5de622e x86/build: Move '-mindirect-branch-cs-prefix' out of GCC-only block
LLVM 16 will have support for this flag so move it out of the GCC-only
block to allow LLVM builds to take advantage of it.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1665
Link: 6f867f9102
Link: https://lore.kernel.org/r/20230120165826.2469302-1-nathan@kernel.org
2023-01-22 11:36:45 +01:00
Hendrik Borghorst
a44b331614 KVM: x86/vmx: Do not skip segment attributes if unusable bit is set
When serializing and deserializing kvm_sregs, attributes of the segment
descriptors are stored by user space. For unusable segments,
vmx_segment_access_rights skips all attributes and sets them to 0.

This means we zero out the DPL (Descriptor Privilege Level) for unusable
entries.

Unusable segments are - contrary to their name - usable in 64bit mode and
are used by guests to for example create a linear map through the
NULL selector.

VMENTER checks if SS.DPL is correct depending on the CS segment type.
For types 9 (Execute Only) and 11 (Execute Read), CS.DPL must be equal to
SS.DPL [1].

We have seen real world guests setting CS to a usable segment with DPL=3
and SS to an unusable segment with DPL=3. Once we go through an sregs
get/set cycle, SS.DPL turns to 0. This causes the virtual machine to crash
reproducibly.

This commit changes the attribute logic to always preserve attributes for
unusable segments. According to [2] SS.DPL is always saved on VM exits,
regardless of the unusable bit so user space applications should have saved
the information on serialization correctly.

[3] specifies that besides SS.DPL the rest of the attributes of the
descriptors are undefined after VM entry if unusable bit is set. So, there
should be no harm in setting them all to the previous state.

[1] Intel SDM Vol 3C 26.3.1.2 Checks on Guest Segment Registers
[2] Intel SDM Vol 3C 27.3.2 Saving Segment Registers and Descriptor-Table
Registers
[3] Intel SDM Vol 3C 26.3.2.2 Loading Guest Segment Registers and
Descriptor-Table Registers

Cc: Alexander Graf <graf@amazon.de>
Cc: stable@vger.kernel.org
Signed-off-by: Hendrik Borghorst <hborghor@amazon.de>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Message-Id: <20221114164823.69555-1-hborghor@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-01-22 04:09:28 -05:00