Commit Graph

1427891 Commits

Author SHA1 Message Date
Vincent Donnefort
680a04c333 KVM: arm64: Add tracing capability for the nVHE/pKVM hyp
There is currently no way to inspect or log what's happening at EL2
when the nVHE or pKVM hypervisor is used. With the growing set of
features for pKVM, the need for tooling is more pressing. And tracefs,
by its reliability, versatility and support for user-space is fit for
purpose.

Add support to write into a tracefs compatible ring-buffer. There's no
way the hypervisor could log events directly into the host tracefs
ring-buffers. So instead let's use our own, where the hypervisor is the
writer and the host the reader.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260309162516.2623589-24-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-11 08:51:16 +00:00
Vincent Donnefort
4cdf8dec8c KVM: arm64: Support unaligned fixmap in the pKVM hyp
Return the fixmap VA with the page offset, instead of the page base
address. This allows to use hyp_fixmap_map() seamlessly regardless of
the address alignment. While at it, do the same for hyp_fixblock_map().

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260309162516.2623589-23-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-11 08:51:16 +00:00
Vincent Donnefort
8bbeb4d169 KVM: arm64: Initialise hyp_nr_cpus for nVHE hyp
Knowing the number of CPUs is necessary for determining the boundaries
of per-cpu variables, which will be used for upcoming hypervisor
tracing. hyp_nr_cpus which stores this value, is only initialised for
the pKVM hypervisor. Make it accessible for the nVHE hypervisor as well.

With the kernel now responsible for initialising hyp_nr_cpus, the
nr_cpus parameter is no longer needed in __pkvm_init.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260309162516.2623589-22-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-11 08:51:16 +00:00
Vincent Donnefort
405df5b557 KVM: arm64: Add clock support to nVHE/pKVM hyp
In preparation for supporting tracing from the nVHE hyp, add support to
generate timestamps with a clock fed by the CNTCVT counter. The clock
can be kept in sync with the kernel's by updating the slope values. This
will be done later.

As current we do only create a trace clock, make the whole support
dependent on the upcoming CONFIG_NVHE_EL2_TRACING.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260309162516.2623589-21-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-11 08:51:16 +00:00
Vincent Donnefort
9019e82c7e KVM: arm64: Add PKVM_DISABLE_STAGE2_ON_PANIC
On NVHE_EL2_DEBUG, when using pKVM, the host stage-2 is relaxed to grant
the kernel access to the stacktrace, hypervisor bug table and text to
symbolize addresses. This is unsafe for production. In preparation for
adding more debug options to NVHE_EL2_DEBUG, decouple the stage-2
relaxation into a separate option.

While at it, rename PROTECTED_NVHE_STACKTRACE into PKVM_STACKTRACE,
following the same naming scheme as PKVM_DISABLE_STAGE2_ON_PANIC.

Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260309162516.2623589-20-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-11 08:51:16 +00:00
Vincent Donnefort
a717943d8e tracing: Check for undefined symbols in simple_ring_buffer
The simple_ring_buffer implementation must remain simple enough to be
used by the pKVM hypervisor. Prevent the object build if unresolved
symbols are found.

Link: https://patch.msgid.link/20260309162516.2623589-19-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Vincent Donnefort
635923081c tracing: load/unload page callbacks for simple_ring_buffer
Add load/unload callback used for each admitted page in the ring-buffer.
This will be later useful for the pKVM hypervisor which uses a different
VA space and need to dynamically map/unmap the ring-buffer pages.

Link: https://patch.msgid.link/20260309162516.2623589-18-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Vincent Donnefort
c88d510584 Documentation: tracing: Add tracing remotes
Add documentation about the newly introduced tracing remotes framework.

Link: https://patch.msgid.link/20260309162516.2623589-17-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Vincent Donnefort
0a1b03251d tracing: selftests: Add trace remote tests
Exercise the tracefs interface for trace remote with a set of tests to
check:

  * loading/unloading (unloading.tc)
  * reset (reset.tc)
  * size changes (buffer_size.tc)
  * consuming read (trace_pipe.tc)
  * non-consuming read (trace.tc)

Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: linux-kselftest@vger.kernel.org
Link: https://patch.msgid.link/20260309162516.2623589-16-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Vincent Donnefort
ea908a2b79 tracing: Add a trace remote module for testing
Add a module to help testing the tracefs support for trace remotes. This
module:

  * Use simple_ring_buffer to write into a ring-buffer.
  * Declare a single "selftest" event that can be triggered from
    user-space.
  * Register a "test" trace remote.

This is intended to be used by trace remote selftests.

Link: https://patch.msgid.link/20260309162516.2623589-15-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Vincent Donnefort
34e5b958bd tracing: Introduce simple_ring_buffer
Add a simple implementation of the kernel ring-buffer. This intends to
be used later by ring-buffer remotes such as the pKVM hypervisor, hence
the need for a cut down version (write only) without any dependency.

Link: https://patch.msgid.link/20260309162516.2623589-14-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Vincent Donnefort
93ae1b76ff ring-buffer: Export buffer_data_page and macros
In preparation for allowing the writing of ring-buffer compliant pages
outside of ring_buffer.c, move buffer_data_page and timestamps encoding
macros into the publicly available ring_buffer_types.h.

Link: https://patch.msgid.link/20260309162516.2623589-13-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Vincent Donnefort
5f3efd1dce tracing: Add helpers to create trace remote events
Declaring remote events can be cumbersome let's add a set of macros to
simplify developers life. The declaration of a remote event is very
similar to kernel's events:

 REMOTE_EVENT(name, id,
     RE_STRUCT(
        re_field(u64 foo)
     ),
     RE_PRINTK("foo=%llu", __entry->foo)
 )

Link: https://patch.msgid.link/20260309162516.2623589-12-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:54 -04:00
Vincent Donnefort
775cb093bc tracing: Add events/ root files to trace remotes
Just like for the kernel events directory, add 'enable', 'header_page'
and 'header_event' at the root of the trace remote events/ directory.

Link: https://patch.msgid.link/20260309162516.2623589-11-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:54 -04:00
Vincent Donnefort
072529158e tracing: Add events to trace remotes
An event is predefined point in the writer code that allows to log
data. Following the same scheme as kernel events, add remote events,
described to user-space within the events/ tracefs directory found in
the corresponding trace remote.

Remote events are expected to be described during the trace remote
registration.

Add also a .enable_event callback for trace_remote to toggle the event
logging, if supported.

Link: https://patch.msgid.link/20260309162516.2623589-10-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:54 -04:00
Vincent Donnefort
bf2ba0f8ca tracing: Add init callback to trace remotes
Add a .init call back so the trace remote callers can add entries to the
tracefs directory.

Link: https://patch.msgid.link/20260309162516.2623589-9-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:54 -04:00
Vincent Donnefort
330b0cceb3 tracing: Add non-consuming read to trace remotes
Allow reading the trace file for trace remotes. This performs a
non-consuming read of the trace buffer.

Link: https://patch.msgid.link/20260309162516.2623589-8-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:54 -04:00
Vincent Donnefort
9af4ab0e11 tracing: Add reset to trace remotes
Allow to reset the trace remote buffer by writing to the Tracefs "trace"
file. This is similar to the regular Tracefs interface.

Link: https://patch.msgid.link/20260309162516.2623589-7-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:54 -04:00
Vincent Donnefort
96e43537af tracing: Introduce trace remotes
A trace remote relies on ring-buffer remotes to read and control
compatible tracing buffers, written by entity such as firmware or
hypervisor.

Add a Tracefs directory remotes/ that contains all instances of trace
remotes. Each instance follows the same hierarchy as any other to ease
the support by existing user-space tools.

This currently does not provide any event support, which will come
later.

Link: https://patch.msgid.link/20260309162516.2623589-6-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:53 -04:00
Vincent Donnefort
fbd1743ecb ring-buffer: Add non-consuming read for ring-buffer remotes
Hopefully, the remote will only swap pages on the kernel instruction (via
the swap_reader_page() callback). This means we know at what point the
ring-buffer geometry has changed. It is therefore possible to rearrange
the kernel view of that ring-buffer to allow non-consuming read.

Link: https://patch.msgid.link/20260309162516.2623589-5-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:53 -04:00
Vincent Donnefort
2e67fabd8b ring-buffer: Introduce ring-buffer remotes
Add ring-buffer remotes to support entities outside of the kernel (such
as firmware or a hypervisor) that writes events into a ring-buffer using
the tracefs format

Require a description of the ring-buffer pages (struct
trace_buffer_desc) and callbacks (swap_reader_page and reset) to set up
the ring-buffer on the kernel side.

Expect the remote entity to maintain and update the meta-page.

Link: https://patch.msgid.link/20260309162516.2623589-4-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:53 -04:00
Vincent Donnefort
e682207bf7 ring-buffer: Store bpage pointers into subbuf_ids
The subbuf_ids field allows to point to a specific page from the
ring-buffer based on its ID. As a preparation or the upcoming
ring-buffer remote support, point this array to the buffer_page instead
of the buffer_data_page.

Link: https://patch.msgid.link/20260309162516.2623589-3-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:53 -04:00
Vincent Donnefort
7d776a3627 ring-buffer: Add page statistics to the meta-page
Add two fields pages_touched and pages_lost to the ring-buffer
meta-page. Those fields are useful to get the number of used pages in
the ring-buffer.

Link: https://patch.msgid.link/20260309162516.2623589-2-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:53 -04:00
Linus Torvalds
1f318b96cc Linux 7.0-rc3 v7.0-rc3 2026-03-08 16:56:54 -07:00
Linus Torvalds
fc9f248d8c Merge tag 'efi-fixes-for-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
 "Fix for the x86 EFI workaround keeping boot services code and data
  regions reserved until after SetVirtualAddressMap() completes:
  deferred struct page initialization may result in some of this memory
  being lost permanently"

* tag 'efi-fixes-for-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efi: defer freeing of boot services memory
2026-03-08 12:13:09 -07:00
Linus Torvalds
014441d1e4 Merge tag 'i2c-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
 "A revert for the i801 driver restoring old locking behaviour"

* tag 'i2c-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: i801: Revert "i2c: i801: replace acpi_lock with I2C bus lock"
2026-03-08 10:17:05 -07:00
Linus Torvalds
c23719abc3 Merge tag 'x86-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:

 - Fix SEV guest boot failures in certain circumstances, due to
   very early code relying on a BSS-zeroed variable that isn't
   actually zeroed yet an may contain non-zero bootup values

   Move the variable into the .data section go gain even earlier
   zeroing

 - Expose & allow the IBPB-on-Entry feature on SNP guests, which
   was not properly exposed to guests due to initial implementational
   caution

 - Fix O= build failure when CONFIG_EFI_SBAT_FILE is using relative
   file paths

 - Fix the various SNC (Sub-NUMA Clustering) topology enumeration
   bugs/artifacts (sched-domain build errors mostly).

   SNC enumeration data got more complicated with Granite Rapids X
   (GNR) and Clearwater Forest X (CWF), which exposed these bugs
   and made their effects more serious

 - Also use the now sane(r) SNC code to fix resctrl SNC detection bugs

 - Work around a historic libgcc unwinder bug in the vdso32 sigreturn
   code (again), which regressed during an overly aggressive recent
   cleanup of DWARF annotations

* tag 'x86-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry/vdso32: Work around libgcc unwinder bug
  x86/resctrl: Fix SNC detection
  x86/topo: Fix SNC topology mess
  x86/topo: Replace x86_has_numa_in_package
  x86/topo: Add topology_num_nodes_per_package()
  x86/numa: Store extra copy of numa_nodes_parsed
  x86/boot: Handle relative CONFIG_EFI_SBAT_FILE file paths
  x86/sev: Allow IBPB-on-Entry feature for SNP guests
  x86/boot/sev: Move SEV decompressor variables into the .data section
2026-03-07 17:12:06 -08:00
Linus Torvalds
6ff1020c2f Merge tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
 "Make clock_adjtime() syscall timex validation slightly more permissive
  for auxiliary clocks, to not reject syscalls based on the status field
  that do not try to modify the status field.

  This makes the ABI behavior in clock_adjtime() consistent with
  CLOCK_REALTIME"

* tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix timex status validation for auxiliary clocks
2026-03-07 17:09:15 -08:00
Linus Torvalds
b1b9a9d0b5 Merge tag 'sched-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "Fix a DL scheduler bug that may corrupt internal metrics during PI and
  setscheduler() syscalls, resulting in kernel warnings and misbehavior.

  Found during stress-testing"

* tag 'sched-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Fix missing ENQUEUE_REPLENISH during PI de-boosting
2026-03-07 17:07:13 -08:00
Eric Dumazet
1954c4f012 eventpoll: Convert epoll_put_uevent() to scoped user access
Saves two function calls, and one stac/clac pair.

stac/clac is rather expensive on older cpus like Zen 2.

A synthetic network stress test gives a ~1.5% increase of pps
on AMD Zen 2.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-03-07 15:03:14 -08:00
Linus Torvalds
3b5d535c63 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Two core changes and the rest in drivers, one core change to quirk the
  behaviour of the Iomega Zip drive and one to fix a hang caused by tag
  reallocation problems, which has mostly been seen by the iscsi client.

  Note the latter fixes the problem but still has a slight sysfs memory
  leak, so will be amended in the next pull request (once we've run the
  fix for the fix through our testing)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: Fix recursive locking in __configfs_open_file()
  scsi: devinfo: Add BLIST_SKIP_IO_HINTS for Iomega ZIP
  scsi: mpi3mr: Clear reset history on ready and recheck state after timeout
  scsi: core: Fix refcount leak for tagset_refcnt
2026-03-07 14:04:50 -08:00
Linus Torvalds
fb07430e6f Merge tag 'fbdev-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fix from Helge Deller:
 "Silence build error in au1100fb driver found by kernel test robot"

* tag 'fbdev-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbdev: au1100fb: Fix build on MIPS64
2026-03-07 13:21:43 -08:00
Linus Torvalds
6deccafcb4 Merge tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
 "While testing Sasha Levin's 'kallsyms: embed source file:line info in
  kernel stack traces' patch series, which increases the typical kernel
  image size, I found some issues with the parisc initial kernel mapping
  which may prevent the kernel to boot.

  The three small patches here fix this"

* tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix initial page table creation for boot
  parisc: Check kernel mapping earlier at bootup
  parisc: Increase initial mapping to 64 MB with KALLSYMS
2026-03-07 12:38:16 -08:00
Linus Torvalds
8b7f4cd3ac Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:

 - Fix u32/s32 bounds when ranges cross min/max boundary (Eduard
   Zingerman)

 - Fix precision backtracking with linked registers (Eduard Zingerman)

 - Fix linker flags detection for resolve_btfids (Ihor Solodrai)

 - Fix race in update_ftrace_direct_add/del (Jiri Olsa)

 - Fix UAF in bpf_trampoline_link_cgroup_shim (Lang Xu)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  resolve_btfids: Fix linker flags detection
  selftests/bpf: add reproducer for spurious precision propagation through calls
  bpf: collect only live registers in linked regs
  Revert "selftests/bpf: Update reg_bound range refinement logic"
  selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary
  bpf: Fix u32/s32 bounds when ranges cross min/max boundary
  bpf: Fix a UAF issue in bpf_trampoline_link_cgroup_shim
  ftrace: Add missing ftrace_lock to update_ftrace_direct_add/del
2026-03-07 12:20:37 -08:00
Linus Torvalds
03dcad79ee Merge tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU selftest fixes from Boqun Feng:
 "Fix a regression in RCU torture test pre-defined scenarios caused by
  commit 7dadeaa6e8 ("sched: Further restrict the preemption modes")
  which limits PREEMPT_NONE to architectures that do not support
  preemption at all and PREEMPT_VOLUNTARY to those architectures that do
  not yet have PREEMPT_LAZY support.

  Since major architectures (e.g. x86 and arm64) no longer support
  CONFIG_PREEMPT_NONE and CONFIG_PREEMPT_VOLUNTARY, using them in
  rcutorture, rcuscale, refscale, and scftorture pre-defined scenarios
  causes config checking errors.

  Switch these kconfigs to PREEMPT_LAZY"

* tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
  scftorture: Update due to x86 not supporting none/voluntary preemption
  refscale: Update due to x86 not supporting none/voluntary preemption
  rcuscale: Update due to x86 not supporting none/voluntary preemption
  rcutorture: Update due to x86 not supporting none/voluntary preemption
2026-03-07 11:56:55 -08:00
Linus Torvalds
aed0af05a8 Merge tag 'trace-v7.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Fix possible NULL pointer dereference in trace_data_alloc()

   On the trace_data_alloc() error path, it can call trigger_data_free()
   with a NULL pointer. This used to be a kfree() but was changed to
   trigger_data_free() to clean up any partial initialization. The issue
   is that trigger_data_free() does not expect a NULL pointer. Have
   trigger_data_free() return safely on NULL pointer.

 - Fix multiple events on the command line and bootconfig

   If multiple events are enabled on the command line separately and not
   grouped, only the last event gets enabled. That is:

      trace_event=sched_switch trace_event=sched_waking

   will only enable sched_waking whereas:

      trace_event=sched_switch,sched_waking

   will enable both.

   The bootconfig makes it even worse as the second way is the more
   common method.

   The issue is that a temporary buffer is used to store the events to
   enable later in boot. Each time the cmdline callback is called, it
   overwrites what was previously there.

   Have the callback append the next value (delimited by a comma) if the
   temporary buffer already has content.

 - Fix command line trace_buffer_size if >= 2G

   The logic to allocate the trace buffer uses "int" for the size
   parameter in the command line code causing overflow issues if more
   that 2G is specified.

* tag 'trace-v7.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix trace_buf_size= cmdline parameter with sizes >= 2G
  tracing: Fix enabling multiple events on the kernel command line and bootconfig
  tracing: Add NULL pointer check to trigger_data_free()
2026-03-07 09:50:54 -08:00
Ihor Solodrai
b0dcdcb9ae resolve_btfids: Fix linker flags detection
The "|| echo -lzstd" default makes zstd an unconditional link
dependency of resolve_btfids. On systems where libzstd-dev is not
installed and pkg-config fails, the linker fails:

  ld: cannot find -lzstd: No such file or directory

libzstd is a transitive dependency of libelf, so the -lzstd flag is
strictly necessary only for static builds [1].

Remove ZSTD_LIBS variable, and instead set LIBELF_LIBS depending on
whether the build is static or not. Use $(HOSTPKG_CONFIG) as primary
source of the flags list.

Also add a default value for HOSTPKG_CONFIG in case it's not built via
the toplevel Makefile. Pass it from selftests/bpf too.

[1] https://lore.kernel.org/bpf/4ff82800-2daa-4b9f-95a9-6f512859ee70@linux.dev/

Reported-by: BPF CI Bot (Claude Opus 4.6) <bot+bpf-ci@kernel.org>
Reported-by: Vitaly Chikunov <vt@altlinux.org>
Closes: https://lore.kernel.org/bpf/aaWqMcK-2AQw5dx8@altlinux.org/
Fixes: 4021848a90 ("selftests/bpf: Pass through build flags to bpftool and resolve_btfids")
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260305014730.3123382-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-07 08:51:51 -08:00
Linus Torvalds
7b6e48df88 Merge tag 'hwmon-for-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Fix initialization commands for AHT20

 - Correct a malformed email address (emc1403)

 - Check the it87_lock() return value

 - Fix inverted polarity (max6639)

 - Fix overflows, underflows, sign extension, and other problems in
   macsmc

 - Fix stack overflow in debugfs read (pmbus/q54sj108a2)

 - Drop support for SMARC-sAM67 (discontinued and never released to
   market)

* tag 'hwmon-for-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/q54sj108a2) fix stack overflow in debugfs read
  hwmon: (max6639) fix inverted polarity
  dt-bindings: hwmon: sl28cpld: Drop sa67mcu compatible
  hwmon: (it87) Check the it87_lock() return value
  Revert "hwmon: add SMARC-sAM67 support"
  hwmon: (aht10) Fix initialization commands for AHT20
  hwmon: (emc1403) correct a malformed email address
  hwmon: (macsmc) Fix overflows, underflows, and sign extension
  hwmon: (macsmc) Fix regressions in Apple Silicon SMC hwmon driver
2026-03-07 08:39:59 -08:00
Linus Torvalds
e33aafac04 Merge tag 'driver-core-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fix from Danilo Krummrich:

 - Revert "driver core: enforce device_lock for driver_match_device()":

   When a device is already present in the system and a driver is
   registered on the same bus, we iterate over all devices registered on
   this bus to see if one of them matches. If we come across an already
   bound one where the corresponding driver crashed while holding the
   device lock (e.g. in probe()) we can't make any progress anymore.

   Thus, revert and clarify that an implementer of struct bus_type must
   not expect match() to be called with the device lock held.

* tag 'driver-core-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
  Revert "driver core: enforce device_lock for driver_match_device()"
2026-03-07 08:16:48 -08:00
Linus Torvalds
0f912c8917 Merge tag 'for-linus-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:

 - a cleanup of arch/x86/kernel/head_64.S removing the pre-built page
   tables for Xen guests

 - a small comment update

 - another cleanup for Xen PVH guests mode

 - fix an issue with Xen PV-devices backed by driver domains

* tag 'for-linus-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: better handle backend crash
  xenbus: add xenbus_device parameter to xenbus_read_driver_state()
  x86/PVH: Use boot params to pass RSDP address in start_info page
  x86/xen: update outdated comment
  xen/acpi-processor: fix _CST detection using undersized evaluation buffer
  x86/xen: Build identity mapping page tables dynamically for XENPV
2026-03-07 07:44:32 -08:00
Alexei Starovoitov
325d1ba3ca Merge branch 'bpf-fix-precision-backtracking-bug-with-linked-registers'
Eduard Zingerman says:

====================
bpf: Fix precision backtracking bug with linked registers

Emil Tsalapatis reported a verifier bug hit by the scx_lavd sched_ext
scheduler. The essential part of the verifier log looks as follows:

  436: ...
  // checkpoint hit for 438: (1d) if r7 == r8 goto ...
  frame 3: propagating r2,r7,r8
  frame 2: propagating r6
  mark_precise: frame3: last_idx ...
  mark_precise: frame3: regs=r2,r7,r8 stack= before 436: ...
  mark_precise: frame3: regs=r2,r7 stack= before 435: ...
  mark_precise: frame3: regs=r2,r7 stack= before 434: (85) call bpf_trace_vprintk#177
  verifier bug: backtracking call unexpected regs 84

The log complains that registers r2 and r7 are tracked as precise
while processing the bpf_trace_vprintk() call in precision backtracking.
This can't be right, as r2 is reset by the call and there is nothing
to backtrack it to. The precision propagation is triggered when
a checkpoint is hit at instruction 438, r2 is dead at that instruction.

This happens because of the following sequence of events:
- Instruction 438 is first reached with registers r2 and r7 having
  the same id via a path that does not call bpf_trace_vprintk():
  - Checkpoint is created at 438.
  - The jump at 438 is predicted, hence r7 and registers linked to it
    (r2) are propagated as precise, marking r2 and r7 precise in the
    checkpoint.
- Instruction 438 is reached a second time with r2 undefined and via
  a path that calls bpf_trace_vprintk():
  - Checkpoint is hit.
  - propagate_precision() picks registers r2 and r7 and propagates
    precision marks for those up to the helper call.

The root cause is the fact that states_equal() and
propagate_precision() assume that the precision flag can't be set for a
dead register (as computed by compute_live_registers()).
However, this is not the case when linked registers are at play.
Fix this by accounting for live register flags in
collect_linked_regs().
---
====================

Link: https://patch.msgid.link/20260306-linked-regs-and-propagate-precision-v1-0-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 21:50:05 -08:00
Eduard Zingerman
223ffb6a3d selftests/bpf: add reproducer for spurious precision propagation through calls
Add a test for the scenario described in the previous commit:
an iterator loop with two paths where one ties r2/r7 via
shared scalar id and skips a call, while the other goes
through the call. Precision marks from the linked registers
get spuriously propagated to the call path via
propagate_precision(), hitting "backtracking call unexpected
regs" in backtrack_insn().

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-2-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 21:50:05 -08:00
Eduard Zingerman
2658a1720a bpf: collect only live registers in linked regs
Fix an inconsistency between func_states_equal() and
collect_linked_regs():
- regsafe() uses check_ids() to verify that cached and current states
  have identical register id mapping.
- func_states_equal() calls regsafe() only for registers computed as
  live by compute_live_registers().
- clean_live_states() is supposed to remove dead registers from cached
  states, but it can skip states belonging to an iterator-based loop.
- collect_linked_regs() collects all registers sharing the same id,
  ignoring the marks computed by compute_live_registers().
  Linked registers are stored in the state's jump history.
- backtrack_insn() marks all linked registers for an instruction
  as precise whenever one of the linked registers is precise.

The above might lead to a scenario:
- There is an instruction I with register rY known to be dead at I.
- Instruction I is reached via two paths: first A, then B.
- On path A:
  - There is an id link between registers rX and rY.
  - Checkpoint C is created at I.
  - Linked register set {rX, rY} is saved to the jump history.
  - rX is marked as precise at I, causing both rX and rY
    to be marked precise at C.
- On path B:
  - There is no id link between registers rX and rY,
    otherwise register states are sub-states of those in C.
  - Because rY is dead at I, check_ids() returns true.
  - Current state is considered equal to checkpoint C,
    propagate_precision() propagates spurious precision
    mark for register rY along the path B.
  - Depending on a program, this might hit verifier_bug()
    in the backtrack_insn(), e.g. if rY ∈  [r1..r5]
    and backtrack_insn() spots a function call.

The reproducer program is in the next patch.
This was hit by sched_ext scx_lavd scheduler code.

Changes in tests:
- verifier_scalar_ids.c selftests need modification to preserve
  some registers as live for __msg() checks.
- exceptions_assert.c adjusted to match changes in the verifier log,
  R0 is dead after conditional instruction and thus does not get
  range.
- precise.c adjusted to match changes in the verifier log, register r9
  is dead after comparison and it's range is not important for test.

Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Fixes: 0fb3cf6110 ("bpf: use register liveness information for func_states_equal")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-1-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 21:49:40 -08:00
Linus Torvalds
4ae12d8bd9 Merge tag 'kbuild-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:

 - Split out .modinfo section from ELF_DETAILS macro, as that macro may
   be used in other areas that expect to discard .modinfo, breaking
   certain image layouts

 - Adjust genksyms parser to handle optional attributes in certain
   declarations, necessary after commit 07919126ec ("netfilter:
   annotate NAT helper hook pointers with __rcu")

 - Include resolve_btfids in external module build created by
   scripts/package/install-extmod-build when it may be run on external
   modules

 - Avoid removing objtool binary with 'make clean', as it is required
   for external module builds

* tag 'kbuild-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
  kbuild: Leave objtool binary around with 'make clean'
  kbuild: install-extmod-build: Package resolve_btfids if necessary
  genksyms: Fix parsing a declarator with a preceding attribute
  kbuild: Split .modinfo out from ELF_DETAILS
2026-03-06 20:27:13 -08:00
Linus Torvalds
591d8796b2 Merge tag 's390-7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Fix stackleak and xor lib inline asm, constraints and clobbers to
   prevent miscompilations and incomplete stack poisoning

* tag 's390-7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/stackleak: Fix __stackleak_poison() inline assembly constraint
  s390/xor: Improve inline assembly constraints
  s390/xor: Fix xor_xc_2() inline assembly constraints
  s390/xor: Fix xor_xc_5() inline assembly
2026-03-06 20:20:17 -08:00
Linus Torvalds
4660e168c6 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "The main changes are a fix to the way in which we manage the access
  flag setting for mappings using the contiguous bit and a fix for a
  hang on the kexec/hibernation path.

  Summary:

   - Fix kexec/hibernation hang due to bogus read-only mappings

   - Fix sparse warnings in our cmpxchg() implementation

   - Prevent runtime-const being used in modules, just like x86

   - Fix broken elision of access flag modifications for contiguous
     entries on systems without support for hardware updates

   - Fix a broken SVE selftest that was testing the wrong instruction"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  selftest/arm64: Fix sve2p1_sigill() to hwcap test
  arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults
  arm64: make runtime const not usable by modules
  arm64: mm: Add PTE_DIRTY back to PAGE_KERNEL* to fix kexec/hibernation
  arm64: Silence sparse warnings caused by the type casting in (cmp)xchg
2026-03-06 19:57:03 -08:00
Calvin Owens
d008ba8be8 tracing: Fix trace_buf_size= cmdline parameter with sizes >= 2G
Some of the sizing logic through tracer_alloc_buffers() uses int
internally, causing unexpected behavior if the user passes a value that
does not fit in an int (on my x86 machine, the result is uselessly tiny
buffers).

Fix by plumbing the parameter's real type (unsigned long) through to the
ring buffer allocation functions, which already use unsigned long.

It has always been possible to create larger ring buffers via the sysfs
interface: this only affects the cmdline parameter.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/bff42a4288aada08bdf74da3f5b67a2c28b761f8.1772852067.git.calvin@wbinvd.org
Fixes: 73c5162aa3 ("tracing: keep ring buffer to minimum size till used")
Signed-off-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-06 22:25:53 -05:00
Alexei Starovoitov
6895e1d7c3 Merge branch 'bpf-fix-u32-s32-bounds-when-ranges-cross-min-max-boundary'
Eduard Zingerman says:

====================
bpf: Fix u32/s32 bounds when ranges cross min/max boundary

Cover the following cases in range refinement logic for 32-bit ranges:
- s32 range crosses U32_MAX/0 boundary, positive part of the s32 range
  overlaps with u32 range.
- s32 range crosses U32_MAX/0 boundary, negative part of the s32 range
  overlaps with u32 range.

These cases are already handled for 64-bit range refinement.

Without the fix the test in patch 2 is rejected by the verifier.
The test was reduced from sched-ext program.

Changelog:
- v2 -> v3:
  - Reverted da653de268 (Paul)
  - Removed !BPF_F_TEST_REG_INVARIANTS flag from
    crossing_32_bit_signed_boundary_2() (Paul)
- v1 -> v2:
  - Extended commit message and comments (Emil)
  - Targeting 'bpf' tree instead of bpf-next (Alexei)

v1: https://lore.kernel.org/bpf/9a23fbacdc6d33ec8fcb3f6988395b5129f75369.camel@gmail.com/T
v2: https://lore.kernel.org/bpf/20260305-bpf-32-bit-range-overflow-v2-0-7169206a3041@gmail.com/
---
====================

Link: https://patch.msgid.link/20260306-bpf-32-bit-range-overflow-v3-0-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 18:16:17 -08:00
Eduard Zingerman
d87c9305a8 Revert "selftests/bpf: Update reg_bound range refinement logic"
This reverts commit da653de268.
Removed logic is now covered by range_refine_in_halves()
which handles both 32-bit and 64-bit refinements.

Suggested-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-3-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 18:16:17 -08:00
Eduard Zingerman
f81fdfd167 selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary
Two test cases for signed/unsigned 32-bit bounds refinement
when s32 range crosses the sign boundary:
- s32 range [S32_MIN..1] overlapping with u32 range [3..U32_MAX],
  s32 range tail before sign boundary overlaps with u32 range.
- s32 range [-3..5] overlapping with u32 range [0..S32_MIN+3],
  s32 range head after the sign boundary overlaps with u32 range.

This covers both branches added in the __reg32_deduce_bounds().

Also, crossing_32_bit_signed_boundary_2() no longer triggers invariant
violations.

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-2-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 18:16:17 -08:00