Pull networking fixes from Paolo Abeni:
"Including fixes from bpf.
Relatively calm week, likely due to public holiday in most places. No
known outstanding regressions.
Current release - regressions:
- rxrpc: fix wrong alignmask in __page_frag_alloc_align()
- eth: e1000e: change usleep_range to udelay in PHY mdic access
Previous releases - regressions:
- gro: fix udp bad offset in socket lookup
- bpf: fix incorrect runtime stat for arm64
- tipc: fix UAF in error path
- netfs: fix a potential infinite loop in extract_user_to_sg()
- eth: ice: ensure the copied buf is NUL terminated
- eth: qeth: fix kernel panic after setting hsuid
Previous releases - always broken:
- bpf:
- verifier: prevent userspace memory access
- xdp: use flags field to disambiguate broadcast redirect
- bridge: fix multicast-to-unicast with fraglist GSO
- mptcp: ensure snd_nxt is properly initialized on connect
- nsh: fix outer header access in nsh_gso_segment().
- eth: bcmgenet: fix racing registers access
- eth: vxlan: fix stats counters.
Misc:
- a bunch of MAINTAINERS file updates"
* tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
MAINTAINERS: mark MYRICOM MYRI-10G as Orphan
MAINTAINERS: remove Ariel Elior
net: gro: add flush check in udp_gro_receive_segment
net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
ipv4: Fix uninit-value access in __ip_make_skb()
s390/qeth: Fix kernel panic after setting hsuid
vxlan: Pull inner IP header in vxlan_rcv().
tipc: fix a possible memleak in tipc_buf_append
tipc: fix UAF in error path
rxrpc: Clients must accept conn from any address
net: core: reject skb_copy(_expand) for fraglist GSO skbs
net: bridge: fix multicast-to-unicast with fraglist GSO
mptcp: ensure snd_nxt is properly initialized on connect
e1000e: change usleep_range to udelay in PHY mdic access
net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
cxgb4: Properly lock TX queue for the selftest.
rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
vxlan: Add missing VNI filter counter update in arp_reduce().
vxlan: Fix racy device stats updates.
net: qede: use return from qede_parse_actions()
...
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-04-29
We've added 147 non-merge commits during the last 32 day(s) which contain
a total of 158 files changed, 9400 insertions(+), 2213 deletions(-).
The main changes are:
1) Add an internal-only BPF per-CPU instruction for resolving per-CPU
memory addresses and implement support in x86 BPF JIT. This allows
inlining per-CPU array and hashmap lookups
and the bpf_get_smp_processor_id() helper, from Andrii Nakryiko.
2) Add BPF link support for sk_msg and sk_skb programs, from Yonghong Song.
3) Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
atomics in bpf_arena which can be JITed as a single x86 instruction,
from Alexei Starovoitov.
4) Add support for passing mark with bpf_fib_lookup helper,
from Anton Protopopov.
5) Add a new bpf_wq API for deferring events and refactor sleepable
bpf_timer code to keep common code where possible,
from Benjamin Tissoires.
6) Fix BPF_PROG_TEST_RUN infra with regards to bpf_dummy_struct_ops programs
to check when NULL is passed for non-NULLable parameters,
from Eduard Zingerman.
7) Harden the BPF verifier's and/or/xor value tracking,
from Harishankar Vishwanathan.
8) Introduce crypto kfuncs to make BPF programs able to utilize the kernel
crypto subsystem, from Vadim Fedorenko.
9) Various improvements to the BPF instruction set standardization doc,
from Dave Thaler.
10) Extend libbpf APIs to partially consume items from the BPF ringbuffer,
from Andrea Righi.
11) Bigger batch of BPF selftests refactoring to use common network helpers
and to drop duplicate code, from Geliang Tang.
12) Support bpf_tail_call_static() helper for BPF programs with GCC 13,
from Jose E. Marchesi.
13) Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
program to have code sections where preemption is disabled,
from Kumar Kartikeya Dwivedi.
14) Allow invoking BPF kfuncs from BPF_PROG_TYPE_SYSCALL programs,
from David Vernet.
15) Extend the BPF verifier to allow different input maps for a given
bpf_for_each_map_elem() helper call in a BPF program, from Philo Lu.
16) Add support for PROBE_MEM32 and bpf_addr_space_cast instructions
for riscv64 and arm64 JITs to enable BPF Arena, from Puranjay Mohan.
17) Shut up a false-positive KMSAN splat in interpreter mode by unpoison
the stack memory, from Martin KaFai Lau.
18) Improve xsk selftest coverage with new tests on maximum and minimum
hardware ring size configurations, from Tushar Vyavahare.
19) Various ReST man pages fixes as well as documentation and bash completion
improvements for bpftool, from Rameez Rehman & Quentin Monnet.
20) Fix libbpf with regards to dumping subsequent char arrays,
from Quentin Deslandes.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (147 commits)
bpf, docs: Clarify PC use in instruction-set.rst
bpf_helpers.h: Define bpf_tail_call_static when building with GCC
bpf, docs: Add introduction for use in the ISA Internet Draft
selftests/bpf: extend BPF_SOCK_OPS_RTT_CB test for srtt and mrtt_us
bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args
selftests/bpf: dummy_st_ops should reject 0 for non-nullable params
bpf: check bpf_dummy_struct_ops program params for test runs
selftests/bpf: do not pass NULL for non-nullable params in dummy_st_ops
selftests/bpf: adjust dummy_st_ops_success to detect additional error
bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable
selftests/bpf: Add ring_buffer__consume_n test.
bpf: Add bpf_guard_preempt() convenience macro
selftests: bpf: crypto: add benchmark for crypto functions
selftests: bpf: crypto skcipher algo selftests
bpf: crypto: add skcipher to bpf crypto
bpf: make common crypto API for TC/XDP programs
bpf: update the comment for BTF_FIELDS_MAX
selftests/bpf: Fix wq test.
selftests/bpf: Use make_sockaddr in test_sock_addr
selftests/bpf: Use connect_to_addr in test_sock_addr
...
====================
Link: https://lore.kernel.org/r/20240429131657.19423-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull x86 fixes from Ingo Molnar:
- Make the CPU_MITIGATIONS=n interaction with conflicting
mitigation-enabling boot parameters a bit saner.
- Re-enable CPU mitigations by default on non-x86
- Fix TDX shared bit propagation on mprotect()
- Fix potential show_regs() system hang when PKE initialization
is not fully finished yet.
- Add the 0x10-0x1f model IDs to the Zen5 range
- Harden #VC instruction emulation some more
* tag 'x86-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n
cpu: Re-enable CPU mitigations by default for !X86 architectures
x86/tdx: Preserve shared bit on mprotect()
x86/cpu: Fix check for RDPKRU in __show_regs()
x86/CPU/AMD: Add models 0x10-0x1f to the Zen5 range
x86/sev: Check for MWAITX and MONITORX opcodes in the #VC handler
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
separation
- A fix to avoid loading rv64/NOMMU kernel past the start of RAM
- A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
overflow in the bitmask
- The sud_test kselftest has been fixed to properly swizzle the syscall
number into the return register, which are not the same on RISC-V
- A fix for a build warning in the perf tools on rv32
- A fix for the CBO selftests, to avoid non-constants leaking into the
inline asm
- A pair of fixes for T-Head PBMT errata probing, which has been
renamed MAE by the vendor
* tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
perf riscv: Fix the warning due to the incompatible type
riscv: T-Head: Test availability bit before enabling MAE errata
riscv: thead: Rename T-Head PBMT to MAE
selftests: sud_test: return correct emulated syscall value on RISC-V
riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
riscv: Fix TASK_SIZE on 64-bit NOMMU
Daniel Borkmann says:
====================
pull-request: bpf 2024-04-26
We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 14 files changed, 168 insertions(+), 72 deletions(-).
The main changes are:
1) Fix BPF_PROBE_MEM in verifier and JIT to skip loads from vsyscall page,
from Puranjay Mohan.
2) Fix a crash in XDP with devmap broadcast redirect when the latter map
is in process of being torn down, from Toke Høiland-Jørgensen.
3) Fix arm64 and riscv64 BPF JITs to properly clear start time for BPF
program runtime stats, from Xu Kuohai.
4) Fix a sockmap KCSAN-reported data race in sk_psock_skb_ingress_enqueue,
from Jason Xing.
5) Fix BPF verifier error message in resolve_pseudo_ldimm64,
from Anton Protopopov.
6) Fix missing DEBUG_INFO_BTF_MODULES Kconfig menu item,
from Andrii Nakryiko.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Test PROBE_MEM of VSYSCALL_ADDR on x86-64
bpf, x86: Fix PROBE_MEM runtime load check
bpf: verifier: prevent userspace memory access
xdp: use flags field to disambiguate broadcast redirect
arm32, bpf: Reimplement sign-extension mov instruction
riscv, bpf: Fix incorrect runtime stats
bpf, arm64: Fix incorrect runtime stats
bpf: Fix a verifier verbose message
bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue
MAINTAINERS: bpf: Add Lehui and Puranjay as riscv64 reviewers
MAINTAINERS: Update email address for Puranjay Mohan
bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition
====================
Link: https://lore.kernel.org/r/20240426224248.26197-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull ARM SoC fixes from Arnd Bergmann:
"There are a lot of minor DT fixes for Mediatek, Rockchip, Qualcomm and
Microchip and NXP, addressing both build-time warnings and bugs found
during runtime testing.
Most of these changes are machine specific fixups, but there are a few
notable regressions that affect an entire SoC:
- The Qualcomm MSI support that was improved for 6.9 ended up being
wrong on some chips and now gets fixed.
- The i.MX8MP camera interface broke due to a typo and gets updated
again.
The main driver fix is also for Qualcomm platforms, rewriting an
interface in the QSEECOM firmware support that could lead to crashing
the kernel from a trusted application.
The only other code changes are minor fixes for Mediatek SoC drivers"
* tag 'soc-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (50 commits)
ARM: dts: imx6ull-tarragon: fix USB over-current polarity
soc: mediatek: mtk-socinfo: depends on CONFIG_SOC_BUS
soc: mediatek: mtk-svs: Append "-thermal" to thermal zone names
arm64: dts: imx8mp: Fix assigned-clocks for second CSI2
ARM: dts: microchip: at91-sama7g54_curiosity: Replace regulator-suspend-voltage with the valid property
ARM: dts: microchip: at91-sama7g5ek: Replace regulator-suspend-voltage with the valid property
arm64: dts: rockchip: Fix USB interface compatible string on kobol-helios64
arm64: dts: qcom: sc8180x: Fix ss_phy_irq for secondary USB controller
arm64: dts: qcom: sm8650: Fix the msi-map entries
arm64: dts: qcom: sm8550: Fix the msi-map entries
arm64: dts: qcom: sm8450: Fix the msi-map entries
arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
arm64: dts: qcom: x1e80100: Fix the compatible for cluster idle states
arm64: dts: qcom: Fix type of "wdog" IRQs for remoteprocs
arm64: dts: rockchip: regulator for sd needs to be always on for BPI-R2Pro
dt-bindings: rockchip: grf: Add missing type to 'pcie-phy' node
arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 2
arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 1
arm64: dts: rockchip: drop redundant pcie-reset-suspend in Scarlet Dumo
arm64: dts: rockchip: mark system power controller and fix typo on orangepi-5-plus
...
When a load is marked PROBE_MEM - e.g. due to PTR_UNTRUSTED access - the
address being loaded from is not necessarily valid. The BPF jit sets up
exception handlers for each such load which catch page faults and 0 out
the destination register.
If the address for the load is outside kernel address space, the load
will escape the exception handling and crash the kernel. To prevent this
from happening, the emits some instruction to verify that addr is > end
of userspace addresses.
x86 has a legacy vsyscall ABI where a page at address 0xffffffffff600000
is mapped with user accessible permissions. The addresses in this page
are considered userspace addresses by the fault handler. Therefore, a
BPF program accessing this page will crash the kernel.
This patch fixes the runtime checks to also check that the PROBE_MEM
address is below VSYSCALL_ADDR.
Example BPF program:
SEC("fentry/tcp_v4_connect")
int BPF_PROG(fentry_tcp_v4_connect, struct sock *sk)
{
*(volatile unsigned long *)&sk->sk_tsq_flags;
return 0;
}
BPF Assembly:
0: (79) r1 = *(u64 *)(r1 +0)
1: (79) r1 = *(u64 *)(r1 +344)
2: (b7) r0 = 0
3: (95) exit
x86-64 JIT
==========
BEFORE AFTER
------ -----
0: nopl 0x0(%rax,%rax,1) 0: nopl 0x0(%rax,%rax,1)
5: xchg %ax,%ax 5: xchg %ax,%ax
7: push %rbp 7: push %rbp
8: mov %rsp,%rbp 8: mov %rsp,%rbp
b: mov 0x0(%rdi),%rdi b: mov 0x0(%rdi),%rdi
-------------------------------------------------------------------------------
f: movabs $0x100000000000000,%r11 f: movabs $0xffffffffff600000,%r10
19: add $0x2a0,%rdi 19: mov %rdi,%r11
20: cmp %r11,%rdi 1c: add $0x2a0,%r11
23: jae 0x0000000000000029 23: sub %r10,%r11
25: xor %edi,%edi 26: movabs $0x100000000a00000,%r10
27: jmp 0x000000000000002d 30: cmp %r10,%r11
29: mov 0x0(%rdi),%rdi 33: ja 0x0000000000000039
--------------------------------\ 35: xor %edi,%edi
2d: xor %eax,%eax \ 37: jmp 0x0000000000000040
2f: leave \ 39: mov 0x2a0(%rdi),%rdi
30: ret \--------------------------------------------
40: xor %eax,%eax
42: leave
43: ret
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20240424100210.11982-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
i.MX fixes for 6.9, round 2:
- Fix i.MX8MP the second CSI2 assigned-clock property which got wrong by
commit f78835d1e6 ("arm64: dts: imx8mp: reparent MEDIA_MIPI_PHY1_REF
to CLK_24M")
- Correct USB over-current polarity for imx6ull-tarragon board
* tag 'imx-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6ull-tarragon: fix USB over-current polarity
arm64: dts: imx8mp: Fix assigned-clocks for second CSI2
Link: https://lore.kernel.org/r/ZioopqscxwUOwQkf@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
MediaTek ARM64 DTS fixes for v6.9
This fixes some dts validation issues against bindings for multiple SoCs,
GPU voltage constraints for Chromebook devices, missing gce-client-reg
on various nodes (performance issues) on MT8183/92/95, and also fixes
boot issues on MT8195 when SPMI is built as module.
* tag 'mtk-dts64-fixes-for-v6.9' of https://git.kernel.org/pub/scm/linux/kernel/git/mediatek/linux:
arm64: dts: mediatek: mt2712: fix validation errors
arm64: dts: mediatek: mt7986: prefix BPI-R3 cooling maps with "map-"
arm64: dts: mediatek: mt7986: drop invalid thermal block clock
arm64: dts: mediatek: mt7986: drop "#reset-cells" from Ethernet controller
arm64: dts: mediatek: mt7986: drop invalid properties from ethsys
arm64: dts: mediatek: mt7622: drop "reset-names" from thermal block
arm64: dts: mediatek: mt7622: fix ethernet controller "compatible"
arm64: dts: mediatek: mt7622: fix IR nodename
arm64: dts: mediatek: mt7622: fix clock controllers
arm64: dts: mediatek: mt8186-corsola: Update min voltage constraint for Vgpu
arm64: dts: mediatek: mt8183-kukui: Use default min voltage for MT6358
arm64: dts: mediatek: mt8195-cherry: Update min voltage constraint for MT6315
arm64: dts: mediatek: mt8192-asurada: Update min voltage constraint for MT6315
arm64: dts: mediatek: cherry: Describe CPU supplies
arm64: dts: mediatek: mt8195: Add missing gce-client-reg to mutex1
arm64: dts: mediatek: mt8195: Add missing gce-client-reg to mutex
arm64: dts: mediatek: mt8195: Add missing gce-client-reg to vpp/vdosys
arm64: dts: mediatek: mt8192: Add missing gce-client-reg to mutex
arm64: dts: mediatek: mt8183: Add power-domains properity to mfgcfg
Qualcomm Arm64 DeviceTree fixes for v6.9
This corrects the watchdog IRQ flags for a number of remoteproc
instances, which otherwise prevents the driver from probe in the face of
a probe deferral.
Improvements in other areas, such as USB, have made it possible for CX
rail voltage on SC8280XP to be lowered, no longer meeting requirements
of active PCIe controllers. Necessary votes are added to these
controllers.
The MSI definitions for PCIe controllers in SM8450, SM8550, and SM8650
was incorrect, due to a bug in the driver. As this has now been fixed
the definition needs to be corrected.
Lastly, the SuperSpeed PHY irq of the second USB controller in SC8180x,
and the compatible string for X1 Elite domain idle states are corrected.
* tag 'qcom-arm64-fixes-for-6.9' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: sc8180x: Fix ss_phy_irq for secondary USB controller
arm64: dts: qcom: sm8650: Fix the msi-map entries
arm64: dts: qcom: sm8550: Fix the msi-map entries
arm64: dts: qcom: sm8450: Fix the msi-map entries
arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
arm64: dts: qcom: x1e80100: Fix the compatible for cluster idle states
arm64: dts: qcom: Fix type of "wdog" IRQs for remoteprocs
Link: https://lore.kernel.org/r/20240420161002.1132240-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* 'v6.9-armsoc/dtsfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
arm64: dts: rockchip: Fix USB interface compatible string on kobol-helios64
arm64: dts: rockchip: regulator for sd needs to be always on for BPI-R2Pro
dt-bindings: rockchip: grf: Add missing type to 'pcie-phy' node
arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 2
arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 1
arm64: dts: rockchip: drop redundant pcie-reset-suspend in Scarlet Dumo
arm64: dts: rockchip: mark system power controller and fix typo on orangepi-5-plus
arm64: dts: rockchip: Designate the system power controller on QuartzPro64
arm64: dts: rockchip: drop panel port unit address in GRU Scarlet
arm64: dts: rockchip: Remove unsupported node from the Pinebook Pro dts
arm64: dts: rockchip: Fix the i2c address of es8316 on Cool Pi CM5
arm64: dts: rockchip: add regulators for PCIe on RK3399 Puma Haikou
arm64: dts: rockchip: enable internal pull-up on PCIE_WAKE# for RK3399 Puma
arm64: dts: rockchip: enable internal pull-up on Q7_USB_ID for RK3399 Puma
arm64: dts: rockchip: fix alphabetical ordering RK3399 puma
arm64: dts: rockchip: enable internal pull-up for Q7_THRM# on RK3399 Puma
arm64: dts: rockchip: set PHY address of MT7531 switch to 0x1f
Link: https://lore.kernel.org/r/3413596.CbtlEUcBR6@phil
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
T-Head's memory attribute extension (XTheadMae) (non-compatible
equivalent of RVI's Svpbmt) is currently assumed for all T-Head harts.
However, QEMU recently decided to drop acceptance of guests that write
reserved bits in PTEs.
As XTheadMae uses reserved bits in PTEs and Linux applies the MAE errata
for all T-Head harts, this broke the Linux startup on QEMU emulations
of the C906 emulation.
This patch attempts to address this issue by testing the MAE-enable bit
in the th.sxstatus CSR. This CSR is available in HW and can be
emulated in QEMU.
This patch also makes the XTheadMae probing mechanism reliable, because
a test for the right combination of mvendorid, marchid, and mimpid
is not sufficient to enable MAE.
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Link: https://lore.kernel.org/r/20240407213236.2121592-3-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
There is an smp function call named reset_counters() to init PMU
registers of every CPU in PMU initialization state. It requires that all
CPUs are online. However there is an early_initcall() wrapper for the
PMU init funciton init_hw_perf_events(), so that pmu init funciton is
called in do_pre_smp_initcalls() which before function smp_init().
Function reset_counters() cannot work on other CPUs since they haven't
boot up still.
Here replace the wrapper early_initcall() with pure_initcall(), so that
the PMU init function is called after every cpu is online.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Explicitly disallow enabling mitigations at runtime for kernels that were
built with CONFIG_CPU_MITIGATIONS=n, as some architectures may omit code
entirely if mitigations are disabled at compile time.
E.g. on x86, a large pile of Kconfigs are buried behind CPU_MITIGATIONS,
and trying to provide sane behavior for retroactively enabling mitigations
is extremely difficult, bordering on impossible. E.g. page table isolation
and call depth tracking require build-time support, BHI mitigations will
still be off without additional kernel parameters, etc.
[ bp: Touchups. ]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240420000556.2645001-3-seanjc@google.com
Rename x86's to CPU_MITIGATIONS, define it in generic code, and force it
on for all architectures exception x86. A recent commit to turn
mitigations off by default if SPECULATION_MITIGATIONS=n kinda sorta
missed that "cpu_mitigations" is completely generic, whereas
SPECULATION_MITIGATIONS is x86-specific.
Rename x86's SPECULATIVE_MITIGATIONS instead of keeping both and have it
select CPU_MITIGATIONS, as having two configs for the same thing is
unnecessary and confusing. This will also allow x86 to use the knob to
manage mitigations that aren't strictly related to speculative
execution.
Use another Kconfig to communicate to common code that CPU_MITIGATIONS
is already defined instead of having x86's menu depend on the common
CPU_MITIGATIONS. This allows keeping a single point of contact for all
of x86's mitigations, and it's not clear that other architectures *want*
to allow disabling mitigations at compile-time.
Fixes: f337a6a21e ("x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n")
Closes: https://lkml.kernel.org/r/20240413115324.53303a68%40canb.auug.org.au
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240420000556.2645001-2-seanjc@google.com
Our Tarragon platform uses a active-low signal to inform
the i.MX6ULL about the over-current detection.
Fixes: 5e4f393ccb ("ARM: dts: imx6ull: Add chargebyte Tarragon support")
Signed-off-by: Michael Heimpold <michael.heimpold@chargebyte.com>
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
vgic_v2_parse_attr() is responsible for finding the vCPU that matches
the user-provided CPUID, which (of course) may not be valid. If the ID
is invalid, kvm_get_vcpu_by_id() returns NULL, which isn't handled
gracefully.
Similar to the GICv3 uaccess flow, check that kvm_get_vcpu_by_id()
actually returns something and fail the ioctl if not.
Cc: stable@vger.kernel.org
Fixes: 7d450e2821 ("KVM: arm/arm64: vgic-new: Add userland access to VGIC dist registers")
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240424173959.3776798-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
cpu_feature_enabled(X86_FEATURE_OSPKE) does not necessarily reflect
whether CR4.PKE is set on the CPU. In particular, they may differ on
non-BSP CPUs before setup_pku() is executed. In this scenario, RDPKRU
will #UD causing the system to hang.
Fix by checking CR4 for PKE enablement which is always correct for the
current CPU.
The scenario happens by inserting a WARN* before setup_pku() in
identiy_cpu() or some other diagnostic which would lead to calling
__show_regs().
[ bp: Massage commit message. ]
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240421191728.32239-1-bp@kernel.org
As with most architectures, allow handling of read faults in VMAs that
have VM_WRITE but without VM_READ (WRITE implies READ).
Otherwise, reading before writing a write-only memory will error while
reading after writing everything is fine.
BTW, move the VM_EXEC judgement before VM_READ/VM_WRITE to make logic a
little clearer.
Cc: stable@vger.kernel.org
Fixes: 09cfefb7fa ("LoongArch: Add memory management")
Signed-off-by: Jiantao Shan <shanjiantao@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
In commit 85fcde402d ("kexec: split crashkernel reservation code
out from crash_core.c"), crashkernel reservation code is split out from
crash_core.c, and add CRASH_RESERVE to control it. And also rename each
ARCH's <asm/crash_core.h> to <asm/crash_reserve.h> accordingly.
But the relevant part in LoongArch is missed. Do it now.
Fixes: 85fcde402d ("kexec: split crashkernel reservation code out from crash_core.c")
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The current implementation of the mov instruction with sign extension has the
following problems:
1. It clobbers the source register if it is not stacked because it
sign extends the source and then moves it to the destination.
2. If the dst_reg is stacked, the current code doesn't write the value
back in case of 64-bit mov.
3. There is room for improvement by emitting fewer instructions.
The steps for fixing this and the instructions emitted by the JIT are explained
below with examples in all combinations:
Case A: offset == 32:
=====================
Case A.1: src and dst are stacked registers:
--------------------------------------------
1. Load src_lo into tmp_lo
2. Store tmp_lo into dst_lo
3. Sign extend tmp_lo into tmp_hi
4. Store tmp_hi to dst_hi
Example: r3 = (s32)r3
r3 is a stacked register
ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
// str to dst_lo is not emitted because src_lo == dst_lo
asr r7, r6, #31 // Sign extend tmp_lo into tmp_hi
str r7, [r11, #-12] // Store tmp_hi into r3_hi
Case A.2: src is stacked but dst is not:
----------------------------------------
1. Load src_lo into dst_lo
2. Sign extend dst_lo into dst_hi
Example: r6 = (s32)r3
r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
ldr r4, [r11, #-16] // Load r3_lo into r6_lo
asr r5, r4, #31 // Sign extend r6_lo into r6_hi
Case A.3: src is not stacked but dst is stacked:
------------------------------------------------
1. Store src_lo into dst_lo
2. Sign extend src_lo into tmp_hi
3. Store tmp_hi to dst_hi
Example: r3 = (s32)r6
r3 is stacked and r6 maps to {ARM_R5, ARM_R4}
str r4, [r11, #-16] // Store r6_lo to r3_lo
asr r7, r4, #31 // Sign extend r6_lo into tmp_hi
str r7, [r11, #-12] // Store tmp_hi to dest_hi
Case A.4: Both src and dst are not stacked:
-------------------------------------------
1. Mov src_lo into dst_lo
2. Sign extend src_lo into dst_hi
Example: (bf) r6 = (s32)r6
r6 maps to {ARM_R5, ARM_R4}
// Mov not emitted because dst == src
asr r5, r4, #31 // Sign extend r6_lo into r6_hi
Case B: offset != 32:
=====================
Case B.1: src and dst are stacked registers:
--------------------------------------------
1. Load src_lo into tmp_lo
2. Sign extend tmp_lo according to offset.
3. Store tmp_lo into dst_lo
4. Sign extend tmp_lo into tmp_hi
5. Store tmp_hi to dst_hi
Example: r9 = (s8)r3
r9 and r3 are both stacked registers
ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
lsl r6, r6, #24 // Sign extend tmp_lo
asr r6, r6, #24 // ..
str r6, [r11, #-56] // Store tmp_lo to r9_lo
asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
str r7, [r11, #-52] // Store tmp_hi to r9_hi
Case B.2: src is stacked but dst is not:
----------------------------------------
1. Load src_lo into dst_lo
2. Sign extend dst_lo according to offset.
3. Sign extend tmp_lo into dst_hi
Example: r6 = (s8)r3
r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
ldr r4, [r11, #-16] // Load r3_lo to r6_lo
lsl r4, r4, #24 // Sign extend r6_lo
asr r4, r4, #24 // ..
asr r5, r4, #31 // Sign extend r6_lo into r6_hi
Case B.3: src is not stacked but dst is stacked:
------------------------------------------------
1. Sign extend src_lo into tmp_lo according to offset.
2. Store tmp_lo into dst_lo.
3. Sign extend src_lo into tmp_hi.
4. Store tmp_hi to dst_hi.
Example: r3 = (s8)r1
r3 is stacked and r1 maps to {ARM_R3, ARM_R2}
lsl r6, r2, #24 // Sign extend r1_lo to tmp_lo
asr r6, r6, #24 // ..
str r6, [r11, #-16] // Store tmp_lo to r3_lo
asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
str r7, [r11, #-12] // Store tmp_hi to r3_hi
Case B.4: Both src and dst are not stacked:
-------------------------------------------
1. Sign extend src_lo into dst_lo according to offset.
2. Sign extend dst_lo into dst_hi.
Example: r6 = (s8)r1
r6 maps to {ARM_R5, ARM_R4} and r1 maps to {ARM_R3, ARM_R2}
lsl r4, r2, #24 // Sign extend r1_lo to r6_lo
asr r4, r4, #24 // ..
asr r5, r4, #31 // Sign extend r6_lo to r6_hi
Fixes: fc832653fa ("arm32, bpf: add support for sign-extension mov instruction")
Reported-by: syzbot+186522670e6722692d86@syzkaller.appspotmail.com
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Closes: https://lore.kernel.org/all/000000000000e9a8d80615163f2a@google.com
Link: https://lore.kernel.org/bpf/20240419182832.27707-1-puranjay@kernel.org
The first CSI2 pixel clock are supplied from IMX8MP_CLK_MEDIA_CAM1_PIX_ROOT,
the second CSI2 pixel clock are supplied from IMX8MP_CLK_MEDIA_CAM2_PIX_ROOT,
both clock are supplied from SYS_PLL2 and configured using assigned-clock DT
properties. Each CSI2 DT node configures its IMX8MP_CLK_MEDIA_CAMn_PIX_ROOT
clock. This used to be the case until likely a copy-paste error in commit
f78835d1e6 ("arm64: dts: imx8mp: reparent MEDIA_MIPI_PHY1_REF to CLK_24M")
which changed the second CSI2 node to configure IMX8MP_CLK_MEDIA_CAM1_PIX_ROOT
using its assigned-clocks property.
Fix the second CSI2 assigned-clock property back to the original correct
IMX8MP_CLK_MEDIA_CAM2_PIX_ROOT .
Fixes: f78835d1e6 ("arm64: dts: imx8mp: reparent MEDIA_MIPI_PHY1_REF to CLK_24M")
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Pull scheduler fix from Borislav Petkov:
- Add a missing memory barrier in the concurrency ID mm switching
* tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Add missing memory barrier in switch_mm_cid
Pull x86 fixes from Borislav Petkov:
- Fix CPU feature dependencies of GFNI, VAES, and VPCLMULQDQ
- Print the correct error code when FRED reports a bad event type
- Add a FRED-specific INT80 handler without the special dances that
need to happen in the current one
- Enable the using-the-default-return-thunk-but-you-should-not warning
only on configs which actually enable those special return thunks
- Check the proper feature flags when selecting BHI retpoline
mitigation
* tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQ
x86/fred: Fix incorrect error code printout in fred_bad_type()
x86/fred: Fix INT80 emulation for FRED
x86/retpolines: Enable the default thunk warning only on relevant configs
x86/bugs: Fix BHI retpoline check
By checking the pmic node with microchip,mcp16502.yaml#
'regulator-suspend-voltage' does not match any of the
regexes 'pinctrl-[0-9]+' from schema microchip,mcp16502.yaml#
which inherits regulator.yaml#. So replace regulator-suspend-voltage
with regulator-suspend-microvolt to avoid the inconsitency.
Fixes: 85b1304b9d ("ARM: dts: at91: sama7g5ek: set regulator voltages for standby state")
Signed-off-by: Andrei Simion <andrei.simion@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Link: https://lore.kernel.org/r/20240404123824.19182-2-andrei.simion@microchip.com
[claudiu.beznea: added a dot before starting the last sentence in commit
description]
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Pull kvm fixes from Paolo Bonzini:
"This is a bit on the large side, mostly due to two changes:
- Changes to disable some broken PMU virtualization (see below for
details under "x86 PMU")
- Clean up SVM's enter/exit assembly code so that it can be compiled
without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched
return thunk in use. This should not happen!" when running KVM
selftests.
Everything else is small bugfixes and selftest changes:
- Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure
where KVM would allow userspace to refresh the cache with a bogus
GPA. The bug has existed for quite some time, but was exposed by a
new sanity check added in 6.9 (to ensure a cache is either
GPA-based or HVA-based).
- Drop an unused param from gfn_to_pfn_cache_invalidate_start() that
got left behind during a 6.9 cleanup.
- Fix a math goof in x86's hugepage logic for
KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow
(detected by KASAN).
- Fix a bug where KVM incorrectly clears root_role.direct when
userspace sets guest CPUID.
- Fix a dirty logging bug in the where KVM fails to write-protect
SPTEs used by a nested guest, if KVM is using Page-Modification
Logging and the nested hypervisor is NOT using EPT.
x86 PMU:
- Drop support for virtualizing adaptive PEBS, as KVM's
implementation is architecturally broken without an obvious/easy
path forward, and because exposing adaptive PEBS can leak host LBRs
to the guest, i.e. can leak host kernel addresses to the guest.
- Set the enable bits for general purpose counters in
PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD
processors.
- Disable LBR virtualization on CPUs that don't support LBR
callstacks, as KVM unconditionally uses
PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and
would fail on such CPUs.
Tests:
- Fix a flaw in the max_guest_memory selftest that results in it
exhausting the supply of ucall structures when run with more than
256 vCPUs.
- Mark KVM_MEM_READONLY as supported for RISC-V in
set_memory_region_test"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start()
KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test
KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting
KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}()
KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status
KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update
KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks
perf/x86/intel: Expose existence of callback support to KVM
KVM: VMX: Snapshot LBR capabilities during module initialization
KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms
KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible
KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD
KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run()
KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area
KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area
KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted
KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run()
KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV
KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding
KVM: SVM: Remove a useless zeroing of allocated memory
...
Pull powerpc fixes from Michael Ellerman:
- Fix wireguard loading failure on pre-Power10 due to Power10 crypto
routines
- Fix papr-vpd selftest failure due to missing variable initialization
- Avoid unnecessary get/put in spapr_tce_platform_iommu_attach_dev()
Thanks to Geetika Moolchandani, Jason Gunthorpe, Michal Suchánek, Nathan
Lynch, and Shivaprasad G Bhat.
* tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc/papr-vpd: Fix missing variable initialization
powerpc/crypto/chacha-p10: Fix failure on non Power10
powerpc/iommu: Refactor spapr_tce_platform_iommu_attach_dev()
Pull arm64 fixes from Catalin Marinas:
- Fix a kernel fault during page table walking in huge_pte_alloc() with
PTABLE_LEVELS=5 due to using p4d_offset() instead of p4d_alloc()
- head.S fix and cleanup to disable the MMU before toggling the
HCR_EL2.E2H bit when entering the kernel with the MMU on from the EFI
stub. Changing this bit (currently from VHE to nVHE) causes some
system registers as well as page table descriptors to be interpreted
differently, potentially resulting in spurious MMU faults
- Fix translation fault in swsusp_save() accessing MEMBLOCK_NOMAP
memory ranges due to kernel_page_present() returning true in most
configurations other than rodata_full == true,
CONFIG_DEBUG_PAGEALLOC=y or CONFIG_KFENCE=y
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hibernate: Fix level3 translation fault in swsusp_save()
arm64/head: Disable MMU at EL2 before clearing HCR_EL2.E2H
arm64/head: Drop unnecessary pre-disable-MMU workaround
arm64/hugetlb: Fix page table walk in huge_pte_alloc()
Pull s390 updates from Alexander Gordeev:
- Fix NULL pointer dereference in program check handler
- Fake IRBs are important events relevant for problem analysis. Add
traces when queueing and delivering
- Fix a race condition in ccw_device_set_online() that can cause the
online process to fail
- Deferred condition code 1 response indicates that I/O was not started
and should be retried. The current QDIO implementation handles a cc1
response as an error, resulting in a failed QDIO setup. Fix that by
retrying the setup when a cc1 response is received
* tag 's390-6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: Fix NULL pointer dereference
s390/cio: log fake IRB events
s390/cio: fix race condition during online processing
s390/qdio: handle deferred cc1
Even though the boot protocol stipulates otherwise, an exception has
been made for the EFI stub, and entering the core kernel with the MMU
enabled is permitted. This allows a substantial amount of cache
maintenance to be elided, wich is significant when fast boot times are
critical (e.g., for booting micro-VMs)
Once the initial ID map has been populated, the MMU is disabled as part
of the logic sequence that puts all system registers into a known state.
Any code that needs to execute within the window where the MMU is off is
cleaned to the PoC explicitly, which includes all of HYP text when
entering at EL2.
However, the current sequence of initializing the EL2 system registers
is not safe: HCR_EL2 is set to its nVHE initial state before SCTLR_EL2
is reprogrammed, and this means that a VHE-to-nVHE switch may occur
while the MMU is enabled. This switch causes some system registers as
well as page table descriptors to be interpreted in a different way,
potentially resulting in spurious exceptions relating to MMU
translation.
So disable the MMU explicitly first when entering in EL2 with the MMU
and caches enabled.
Fixes: 6178617038 ("efi: arm64: enter with MMU and caches enabled")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Cc: <stable@vger.kernel.org> # 6.3.x
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240415075412.2347624-6-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fix cpuid_deps[] to list the correct dependencies for GFNI, VAES, and
VPCLMULQDQ. These features don't depend on AVX512, and there exist CPUs
that support these features but not AVX512. GFNI actually doesn't even
depend on AVX.
This prevents GFNI from being unnecessarily disabled if AVX is disabled
to mitigate the GDS vulnerability.
This also prevents all three features from being unnecessarily disabled
if AVX512VL (or its dependency AVX512F) were to be disabled, but it
looks like there isn't any case where this happens anyway.
Fixes: c128dbfa0f ("x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20240417060434.47101-1-ebiggers@kernel.org