Commit Graph

21684 Commits

Author SHA1 Message Date
Paolo Bonzini
1a14928e2e Merge tag 'kvm-x86-misc-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM x86 misc changes for 6.17

 - Prevert the host's DEBUGCTL.FREEZE_IN_SMM (Intel only) when running the
   guest.  Failure to honor FREEZE_IN_SMM can bleed host state into the guest.

 - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter (Intel only) to
   prevent L1 from running L2 with features that KVM doesn't support, e.g. BTF.

 - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the
   vCPU's CPUID model.

 - Rework the MSR interception code so that the SVM and VMX APIs are more or
   less identical.

 - Recalculate all MSR intercepts from the "source" on MSR filter changes, and
   drop the dedicated "shadow" bitmaps (and their awful "max" size defines).

 - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the
   nested SVM MSRPM offsets tracker can't handle an MSR.

 - Advertise support for LKGS (Load Kernel GS base), a new instruction that's
   loosely related to FRED, but is supported and enumerated independently.

 - Fix a user-triggerable WARN that syzkaller found by stuffing INIT_RECEIVED,
   a.k.a. WFS, and then putting the vCPU into VMX Root Mode (post-VMXON).  Use
   the same approach KVM uses for dealing with "impossible" emulation when
   running a !URG guest, and simply wait until KVM_RUN to detect that the vCPU
   has architecturally impossible state.

 - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of
   APERF/MPERF reads, so that a "properly" configured VM can "virtualize"
   APERF/MPERF (with many caveats).

 - Reject KVM_SET_TSC_KHZ if vCPUs have been created, as changing the "default"
   frequency is unsupported for VMs with a "secure" TSC, and there's no known
   use case for changing the default frequency for other VM types.
2025-07-29 08:36:43 -04:00
Paolo Bonzini
f02b1bcc73 Merge tag 'kvm-x86-irqs-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM IRQ changes for 6.17

 - Rework irqbypass to track/match producers and consumers via an xarray
   instead of a linked list.  Using a linked list leads to O(n^2) insertion
   times, which is hugely problematic for use cases that create large numbers
   of VMs.  Such use cases typically don't actually use irqbypass, but
   eliminating the pointless registration is a future problem to solve as it
   likely requires new uAPI.

 - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *",
   to avoid making a simple concept unnecessarily difficult to understand.

 - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC, PIC,
   and PIT emulation at compile time.

 - Drop x86's irq_comm.c, and move a pile of IRQ related code into irq.c.

 - Fix a variety of flaws and bugs in the AVIC device posted IRQ code.

 - Inhibited AVIC if a vCPU's ID is too big (relative to what hardware
   supports) instead of rejecting vCPU creation.

 - Extend enable_ipiv module param support to SVM, by simply leaving IsRunning
   clear in the vCPU's physical ID table entry.

 - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by
   erratum #1235, to allow (safely) enabling AVIC on such CPUs.

 - Dedup x86's device posted IRQ code, as the vast majority of functionality
   can be shared verbatime between SVM and VMX.

 - Harden the device posted IRQ code against bugs and runtime errors.

 - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1)
   instead of O(n).

 - Generate GA Log interrupts if and only if the target vCPU is blocking, i.e.
   only if KVM needs a notification in order to wake the vCPU.

 - Decouple device posted IRQs from VFIO device assignment, as binding a VM to
   a VFIO group is not a requirement for enabling device posted IRQs.

 - Clean up and document/comment the irqfd assignment code.

 - Disallow binding multiple irqfds to an eventfd with a priority waiter, i.e.
   ensure an eventfd is bound to at most one irqfd through the entire host,
   and add a selftest to verify eventfd:irqfd bindings are globally unique.
2025-07-29 08:35:46 -04:00
KaFai Wan
51d3750aba selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
Delete fexit_noreturns.c files and migrate the cases into
tracing_failure.c files.

The result:

 $ tools/testing/selftests/bpf/test_progs -t tracing_failure/fexit_noreturns
 #467/4   tracing_failure/fexit_noreturns:OK
 #467     tracing_failure:OK
 Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-5-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 19:39:30 -07:00
KaFai Wan
a32f6f17a7 selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
The result:

 $ tools/testing/selftests/bpf/test_progs -t tracing_failure/tracing_deny
 #468/3   tracing_failure/tracing_deny:OK
 #468     tracing_failure:OK
 Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-4-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 19:39:29 -07:00
KaFai Wan
a5a6b29a70 bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
With this change, we know the precise rejected function name when
attaching fexit/fmod_ret to __noreturn functions from log.

$ ./fexit
libbpf: prog 'fexit': BPF program load failed: -EINVAL
libbpf: prog 'fexit': -- BEGIN PROG LOAD LOG --
Attaching fexit/fmod_ret to __noreturn function 'do_exit' is rejected.

Suggested-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-2-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 19:39:29 -07:00
Linus Torvalds
ae388edd4a Merge tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock update from Mickaël Salaün:
 "Fix test issues, improve build compatibility, and add new tests"

* tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Fix cosmetic change
  samples/landlock: Fix building on musl libc
  landlock: Fix warning from KUnit tests
  selftests/landlock: Add test to check rule tied to covered mount point
  selftests/landlock: Fix build of audit_test
  selftests/landlock: Fix readlink check
2025-07-28 19:21:32 -07:00
Linus Torvalds
13150742b0 Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
 "This is the main crypto library pull request for 6.17. The main focus
  this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
  high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
  and establishing conventions for lib/crypto/ going forward:

   - Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
     most of the SHA-512 code) into lib/crypto/. This includes both the
     generic and architecture-optimized code. Greatly simplify how the
     architecture-optimized code is integrated. Add an easy-to-use
     library API for each SHA variant, including HMAC support. Finally,
     reimplement the crypto_shash support on top of the library API.

   - Apply the same reorganization to the SHA-256 code (and also SHA-224
     which shares most of the SHA-256 code). This is a somewhat smaller
     change, due to my earlier work on SHA-256. But this brings in all
     the same additional improvements that I made for SHA-1 and SHA-512.

  There are also some smaller changes:

   - Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
     from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
     these algorithms it's just a move, not a full reorganization yet.

   - Fix the MIPS chacha-core.S to build with the clang assembler.

   - Fix the Poly1305 functions to work in all contexts.

   - Fix a performance regression in the x86_64 Poly1305 code.

   - Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.

  Note that since the new organization of the SHA code is much simpler,
  the diffstat of this pull request is negative, despite the addition of
  new fully-documented library APIs for multiple SHA and HMAC-SHA
  variants.

  These APIs will allow further simplifications across the kernel as
  users start using them instead of the old-school crypto API. (I've
  already written a lot of such conversion patches, removing over 1000
  more lines of code. But most of those will target 6.18 or later)"

* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits)
  lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
  lib/crypto: x86/sha1-ni: Convert to use rounds macros
  lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
  crypto: sha1 - Remove sha1_base.h
  lib/crypto: x86/sha1: Migrate optimized code into library
  lib/crypto: sparc/sha1: Migrate optimized code into library
  lib/crypto: s390/sha1: Migrate optimized code into library
  lib/crypto: powerpc/sha1: Migrate optimized code into library
  lib/crypto: mips/sha1: Migrate optimized code into library
  lib/crypto: arm64/sha1: Migrate optimized code into library
  lib/crypto: arm/sha1: Migrate optimized code into library
  crypto: sha1 - Use same state format as legacy drivers
  crypto: sha1 - Wrap library and add HMAC support
  lib/crypto: sha1: Add HMAC support
  lib/crypto: sha1: Add SHA-1 library functions
  lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
  crypto: x86/sha1 - Rename conflicting symbol
  lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
  lib/crypto: arm/poly1305: Remove unneeded empty weak function
  lib/crypto: x86/poly1305: Fix performance regression on short messages
  ...
2025-07-28 17:58:52 -07:00
Linus Torvalds
8e736a2eea Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:

 - Introduce and start using TRAILING_OVERLAP() helper for fixing
   embedded flex array instances (Gustavo A. R. Silva)

 - mux: Convert mux_control_ops to a flex array member in mux_chip
   (Thorsten Blum)

 - string: Group str_has_prefix() and strstarts() (Andy Shevchenko)

 - Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
   Kees Cook)

 - Refactor and rename stackleak feature to support Clang

 - Add KUnit test for seq_buf API

 - Fix KUnit fortify test under LTO

* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  sched/task_stack: Add missing const qualifier to end_of_stack()
  kstack_erase: Support Clang stack depth tracking
  kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
  init.h: Disable sanitizer coverage for __init and __head
  kstack_erase: Disable kstack_erase for all of arm compressed boot code
  x86: Handle KCOV __init vs inline mismatches
  arm64: Handle KCOV __init vs inline mismatches
  s390: Handle KCOV __init vs inline mismatches
  arm: Handle KCOV __init vs inline mismatches
  mips: Handle KCOV __init vs inline mismatch
  powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
  configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
  configs/hardening: Enable CONFIG_KSTACK_ERASE
  stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
  stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
  stackleak: Rename STACKLEAK to KSTACK_ERASE
  seq_buf: Introduce KUnit tests
  string: Group str_has_prefix() and strstarts()
  kunit/fortify: Add back "volatile" for sizeof() constants
  acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
  ...
2025-07-28 17:16:12 -07:00
Linus Torvalds
6e11664f14 Merge tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:

 - MD pull request via Yu:
      - call del_gendisk synchronously (Xiao)
      - cleanup unused variable (John)
      - cleanup workqueue flags (Ryo)
      - fix faulty rdev can't be removed during resync (Qixing)

 - NVMe pull request via Christoph:
      - try PCIe function level reset on init failure (Keith Busch)
      - log TLS handshake failures at error level (Maurizio Lombardi)
      - pci-epf: do not complete commands twice if nvmet_req_init()
        fails (Rick Wertenbroek)
      - misc cleanups (Alok Tiwari)

 - Removal of the pktcdvd driver

   This has been more than a decade coming at this point, and some
   recently revealed breakages that had it causing issues even for cases
   where it isn't required made me re-pull the trigger on this one. It's
   known broken and nobody has stepped up to maintain the code

 - Series for ublk supporting batch commands, enabling the use of
   multishot where appropriate

 - Speed up ublk exit handling

 - Fix for the two-stage elevator fixing which could leak data

 - Convert NVMe to use the new IOVA based API

 - Increase default max transfer size to something more reasonable

 - Series fixing write operations on zoned DM devices

 - Add tracepoints for zoned block device operations

 - Prep series working towards improving blk-mq queue management in the
   presence of isolated CPUs

 - Don't allow updating of the block size of a loop device that is
   currently under exclusively ownership/open

 - Set chunk sectors from stacked device stripe size and use it for the
   atomic write size limit

 - Switch to folios in bcache read_super()

 - Fix for CD-ROM MRW exit flush handling

 - Various tweaks, fixes, and cleanups

* tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux: (94 commits)
  block: restore two stage elevator switch while running nr_hw_queue update
  cdrom: Call cdrom_mrw_exit from cdrom_release function
  sunvdc: Balance device refcount in vdc_port_mpgroup_check
  nvme-pci: try function level reset on init failure
  dm: split write BIOs on zone boundaries when zone append is not emulated
  block: use chunk_sectors when evaluating stacked atomic write limits
  dm-stripe: limit chunk_sectors to the stripe size
  md/raid10: set chunk_sectors limit
  md/raid0: set chunk_sectors limit
  block: sanitize chunk_sectors for atomic write limits
  ilog2: add max_pow_of_two_factor()
  nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails
  nvme-tcp: log TLS handshake failures at error level
  docs: nvme: fix grammar in nvme-pci-endpoint-target.rst
  nvme: fix typo in status code constant for self-test in progress
  nvmet: remove redundant assignment of error code in nvmet_ns_enable()
  nvme: fix incorrect variable in io cqes error message
  nvme: fix multiple spelling and grammar issues in host drivers
  block: fix blk_zone_append_update_request_bio() kernel-doc
  md/raid10: fix set but not used variable in sync_request_write()
  ...
2025-07-28 16:43:54 -07:00
Linus Torvalds
7e7bc8335b Merge tag 'vfs-6.17-rc1.bpf' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs bpf updates from Christian Brauner:
 "These changes allow bpf to read extended attributes from cgroupfs.

  This is useful in redirecting AF_UNIX socket connections based on
  cgroup membership of the socket. One use-case is the ability to
  implement log namespaces in systemd so services and containers are
  redirected to different journals"

* tag 'vfs-6.17-rc1.bpf' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/kernfs: test xattr retrieval
  selftests/bpf: Add tests for bpf_cgroup_read_xattr
  bpf: Mark cgroup_subsys_state->cgroup RCU safe
  bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
  kernfs: remove iattr_mutex
2025-07-28 14:42:31 -07:00
Linus Torvalds
672dcda246 Merge tag 'vfs-6.17-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull pidfs updates from Christian Brauner:

 - persistent info

   Persist exit and coredump information independent of whether anyone
   currently holds a pidfd for the struct pid.

   The current scheme allocated pidfs dentries on-demand repeatedly.
   This scheme is reaching it's limits as it makes it impossible to pin
   information that needs to be available after the task has exited or
   coredumped and that should not be lost simply because the pidfd got
   closed temporarily. The next opener should still see the stashed
   information.

   This is also a prerequisite for supporting extended attributes on
   pidfds to allow attaching meta information to them.

   If someone opens a pidfd for a struct pid a pidfs dentry is allocated
   and stashed in pid->stashed. Once the last pidfd for the struct pid
   is closed the pidfs dentry is released and removed from pid->stashed.

   So if 10 callers create a pidfs dentry for the same struct pid
   sequentially, i.e., each closing the pidfd before the other creates a
   new one then a new pidfs dentry is allocated every time.

   Because multiple tasks acquiring and releasing a pidfd for the same
   struct pid can race with each another a task may still find a valid
   pidfs entry from the previous task in pid->stashed and reuse it. Or
   it might find a dead dentry in there and fail to reuse it and so
   stashes a new pidfs dentry. Multiple tasks may race to stash a new
   pidfs dentry but only one will succeed, the other ones will put their
   dentry.

   The current scheme aims to ensure that a pidfs dentry for a struct
   pid can only be created if the task is still alive or if a pidfs
   dentry already existed before the task was reaped and so exit
   information has been was stashed in the pidfs inode.

   That's great except that it's buggy. If a pidfs dentry is stashed in
   pid->stashed after pidfs_exit() but before __unhash_process() is
   called we will return a pidfd for a reaped task without exit
   information being available.

   The pidfds_pid_valid() check does not guard against this race as it
   doens't sync at all with pidfs_exit(). The pid_has_task() check might
   be successful simply because we're before __unhash_process() but
   after pidfs_exit().

   Introduce a new scheme where the lifetime of information associated
   with a pidfs entry (coredump and exit information) isn't bound to the
   lifetime of the pidfs inode but the struct pid itself.

   The first time a pidfs dentry is allocated for a struct pid a struct
   pidfs_attr will be allocated which will be used to store exit and
   coredump information.

   If all pidfs for the pidfs dentry are closed the dentry and inode can
   be cleaned up but the struct pidfs_attr will stick until the struct
   pid itself is freed. This will ensure minimal memory usage while
   persisting relevant information.

   The new scheme has various advantages. First, it allows to close the
   race where we end up handing out a pidfd for a reaped task for which
   no exit information is available. Second, it minimizes memory usage.
   Third, it allows to remove complex lifetime tracking via dentries
   when registering a struct pid with pidfs. There's no need to get or
   put a reference. Instead, the lifetime of exit and coredump
   information associated with a struct pid is bound to the lifetime of
   struct pid itself.

 - extended attributes

   Now that we have a way to persist information for pidfs dentries we
   can start supporting extended attributes on pidfds. This will allow
   userspace to attach meta information to tasks.

   One natural extension would be to introduce a custom pidfs.* extended
   attribute space and allow for the inheritance of extended attributes
   across fork() and exec().

   The first simple scheme will allow privileged userspace to set
   trusted extended attributes on pidfs inodes.

 - Allow autonomous pidfs file handles

   Various filesystems such as pidfs and drm support opening file
   handles without having to require a file descriptor to identify the
   filesystem. The filesystem are global single instances and can be
   trivially identified solely on the information encoded in the file
   handle.

   This makes it possible to not have to keep or acquire a sentinal file
   descriptor just to pass it to open_by_handle_at() to identify the
   filesystem. That's especially useful when such sentinel file
   descriptor cannot or should not be acquired.

   For pidfs this means a file handle can function as full replacement
   for storing a pid in a file. Instead a file handle can be stored and
   reopened purely based on the file handle.

   Such autonomous file handles can be opened with or without specifying
   a a file descriptor. If no proper file descriptor is used the
   FD_PIDFS_ROOT sentinel must be passed. This allows us to define
   further special negative fd sentinels in the future.

   Userspace can trivially test for support by trying to open the file
   handle with an invalid file descriptor.

 - Allow pidfds for reaped tasks with SCM_PIDFD messages

   This is a logical continuation of the earlier work to create pidfds
   for reaped tasks through the SO_PEERPIDFD socket option merged in
   923ea4d448 ("Merge patch series "net, pidfs: enable handing out
   pidfds for reaped sk->sk_peer_pid"").

 - Two minor fixes:

    * Fold fs_struct->{lock,seq} into a seqlock

    * Don't bother with path_{get,put}() in unix_open_file()

* tag 'vfs-6.17-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (37 commits)
  don't bother with path_get()/path_put() in unix_open_file()
  fold fs_struct->{lock,seq} into a seqlock
  selftests: net: extend SCM_PIDFD test to cover stale pidfds
  af_unix: enable handing out pidfds for reaped tasks in SCM_PIDFD
  af_unix: stash pidfs dentry when needed
  af_unix/scm: fix whitespace errors
  af_unix: introduce and use scm_replace_pid() helper
  af_unix: introduce unix_skb_to_scm helper
  af_unix: rework unix_maybe_add_creds() to allow sleep
  selftests/pidfd: decode pidfd file handles withou having to specify an fd
  fhandle, pidfs: support open_by_handle_at() purely based on file handle
  uapi/fcntl: add FD_PIDFS_ROOT
  uapi/fcntl: add FD_INVALID
  fcntl/pidfd: redefine PIDFD_SELF_THREAD_GROUP
  uapi/fcntl: mark range as reserved
  fhandle: reflow get_path_anchor()
  pidfs: add pidfs_root_path() helper
  fhandle: rename to get_path_anchor()
  fhandle: hoist copy_from_user() above get_path_from_fd()
  fhandle: raise FILEID_IS_DIR in handle_type
  ...
2025-07-28 14:10:15 -07:00
Linus Torvalds
7031769e10 Merge tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull mmap_prepare updates from Christian Brauner:
 "Last cycle we introduce f_op->mmap_prepare() in c84bf6dd2b ("mm:
  introduce new .mmap_prepare() file callback").

  This is preferred to the existing f_op->mmap() hook as it does require
  a VMA to be established yet, thus allowing the mmap logic to invoke
  this hook far, far earlier, prior to inserting a VMA into the virtual
  address space, or performing any other heavy handed operations.

  This allows for much simpler unwinding on error, and for there to be a
  single attempt at merging a VMA rather than having to possibly
  reattempt a merge based on potentially altered VMA state.

  Far more importantly, it prevents inappropriate manipulation of
  incompletely initialised VMA state, which is something that has been
  the cause of bugs and complexity in the past.

  The intent is to gradually deprecate f_op->mmap, and in that vein this
  series coverts the majority of file systems to using f_op->mmap_prepare.

  Prerequisite steps are taken - firstly ensuring all checks for mmap
  capabilities use the file_has_valid_mmap_hooks() helper rather than
  directly checking for f_op->mmap (which is now not a valid check) and
  secondly updating daxdev_mapping_supported() to not require a VMA
  parameter to allow ext4 and xfs to be converted.

  Commit bb666b7c27 ("mm: add mmap_prepare() compatibility layer for
  nested file systems") handles the nasty edge-case of nested file
  systems like overlayfs, which introduces a compatibility shim to allow
  f_op->mmap_prepare() to be invoked from an f_op->mmap() callback.

  This allows for nested filesystems to continue to function correctly
  with all file systems regardless of which callback is used. Once we
  finally convert all file systems, this shim can be removed.

  As a result, ecryptfs, fuse, and overlayfs remain unaltered so they
  can nest all other file systems.

  We additionally do not update resctl - as this requires an update to
  remap_pfn_range() (or an alternative to it) which we defer to a later
  series, equally we do not update cramfs which needs a mixed mapping
  insertion with the same issue, nor do we update procfs, hugetlbfs,
  syfs or kernfs all of which require VMAs for internal state and hooks.
  We shall return to all of these later"

* tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  doc: update porting, vfs documentation to describe mmap_prepare()
  fs: replace mmap hook with .mmap_prepare for simple mappings
  fs: convert most other generic_file_*mmap() users to .mmap_prepare()
  fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
  mm/filemap: introduce generic_file_*_mmap_prepare() helpers
  fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
  fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
  fs/dax: make it possible to check dev dax support without a VMA
  fs: consistently use can_mmap_file() helper
  mm/nommu: use file_has_valid_mmap_hooks() helper
  mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
2025-07-28 13:43:25 -07:00
Linus Torvalds
117eab5c6e Merge tag 'vfs-6.17-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull coredump updates from Christian Brauner:
 "This contains an extension to the coredump socket and a proper rework
  of the coredump code.

   - This extends the coredump socket to allow the coredump server to
     tell the kernel how to process individual coredumps. This allows
     for fine-grained coredump management. Userspace can decide to just
     let the kernel write out the coredump, or generate the coredump
     itself, or just reject it.

     * COREDUMP_KERNEL
       The kernel will write the coredump data to the socket.

     * COREDUMP_USERSPACE
       The kernel will not write coredump data but will indicate to the
       parent that a coredump has been generated. This is used when
       userspace generates its own coredumps.

     * COREDUMP_REJECT
       The kernel will skip generating a coredump for this task.

     * COREDUMP_WAIT
       The kernel will prevent the task from exiting until the coredump
       server has shutdown the socket connection.

     The flexible coredump socket can be enabled by using the "@@"
     prefix instead of the single "@" prefix for the regular coredump
     socket:

       @@/run/systemd/coredump.socket

   - Cleanup the coredump code properly while we have to touch it
     anyway.

     Split out each coredump mode in a separate helper so it's easy to
     grasp what is going on and make the code easier to follow. The core
     coredump function should now be very trivial to follow"

* tag 'vfs-6.17-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits)
  cleanup: add a scoped version of CLASS()
  coredump: add coredump_skip() helper
  coredump: avoid pointless variable
  coredump: order auto cleanup variables at the top
  coredump: add coredump_cleanup()
  coredump: auto cleanup prepare_creds()
  cred: add auto cleanup method
  coredump: directly return
  coredump: auto cleanup argv
  coredump: add coredump_write()
  coredump: use a single helper for the socket
  coredump: move pipe specific file check into coredump_pipe()
  coredump: split pipe coredumping into coredump_pipe()
  coredump: move core_pipe_count to global variable
  coredump: prepare to simplify exit paths
  coredump: split file coredumping into coredump_file()
  coredump: rename do_coredump() to vfs_coredump()
  selftests/coredump: make sure invalid paths are rejected
  coredump: validate socket path in coredump_parse()
  coredump: don't allow ".." in coredump socket path
  ...
2025-07-28 11:50:36 -07:00
Paul Chaignon
5dbb19b16a bpf: Add third round of bounds deduction
Commit d7f0087381 ("bpf: try harder to deduce register bounds from
different numeric domains") added a second call to __reg_deduce_bounds
in reg_bounds_sync because a single call wasn't enough to converge to a
fixed point in terms of register bounds.

With patch "bpf: Improve bounds when s64 crosses sign boundary" from
this series, Eduard noticed that calling __reg_deduce_bounds twice isn't
enough anymore to converge. The first selftest added in "selftests/bpf:
Test cross-sign 64bits range refinement" highlights the need for a third
call to __reg_deduce_bounds. After instruction 7, reg_bounds_sync
performs the following bounds deduction:

  reg_bounds_sync entry:          scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146)
  __update_reg_bounds:            scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146)
  __reg_deduce_bounds:
      __reg32_deduce_bounds:      scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146,umin32=0xfffffcf1,umax32=0xffffff6e)
      __reg64_deduce_bounds:      scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146,umin32=0xfffffcf1,umax32=0xffffff6e)
      __reg_deduce_mixed_bounds:  scalar(smin=-655,smax=0xeffffeee,umin=umin32=0xfffffcf1,umax=0xffffffffffffff6e,smin32=-783,smax32=-146,umax32=0xffffff6e)
  __reg_deduce_bounds:
      __reg32_deduce_bounds:      scalar(smin=-655,smax=0xeffffeee,umin=umin32=0xfffffcf1,umax=0xffffffffffffff6e,smin32=-783,smax32=-146,umax32=0xffffff6e)
      __reg64_deduce_bounds:      scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e)
      __reg_deduce_mixed_bounds:  scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e)
  __reg_bound_offset:             scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e,var_off=(0xfffffffffffffc00; 0x3ff))
  __update_reg_bounds:            scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e,var_off=(0xfffffffffffffc00; 0x3ff))

In particular, notice how:
1. In the first call to __reg_deduce_bounds, __reg32_deduce_bounds
   learns new u32 bounds.
2. __reg64_deduce_bounds is unable to improve bounds at this point.
3. __reg_deduce_mixed_bounds derives new u64 bounds from the u32 bounds.
4. In the second call to __reg_deduce_bounds, __reg64_deduce_bounds
   improves the smax and umin bounds thanks to patch "bpf: Improve
   bounds when s64 crosses sign boundary" from this series.
5. Subsequent functions are unable to improve the ranges further (only
   tnums). Yet, a better smin32 bound could be learned from the smin
   bound.

__reg32_deduce_bounds is able to improve smin32 from smin, but for that
we need a third call to __reg_deduce_bounds.

As discussed in [1], there may be a better way to organize the deduction
rules to learn the same information with less calls to the same
functions. Such an optimization requires further analysis and is
orthogonal to the present patchset.

Link: https://lore.kernel.org/bpf/aIKtSK9LjQXB8FLY@mail.gmail.com/ [1]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Co-developed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/79619d3b42e5525e0e174ed534b75879a5ba15de.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:13 -07:00
Paul Chaignon
f96841bbf4 selftests/bpf: Test invariants on JSLT crossing sign
The improvement of the u64/s64 range refinement fixed the invariant
violation that was happening on this test for BPF_JSLT when crossing the
sign boundary.

After this patch, we have one test remaining with a known invariant
violation. It's the same test as fixed here but for 32 bits ranges.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/ad046fb0016428f1a33c3b81617aabf31b51183f.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:13 -07:00
Paul Chaignon
26e5e346a5 selftests/bpf: Test cross-sign 64bits range refinement
This patch adds coverage for the new cross-sign 64bits range refinement
logic. The three tests cover the cases when the u64 and s64 ranges
overlap (1) in the negative portion of s64, (2) in the positive portion
of s64, and (3) in both portions.

The first test is a simplified version of a BPF program generated by
syzkaller that caused an invariant violation [1]. It looks like
syzkaller could not extract the reproducer itself (and therefore didn't
report it to the mailing list), but I was able to extract it from the
console logs of a crash.

The principle is similar to the invariant violation described in
commit 6279846b9b ("bpf: Forget ranges when refining tnum after
JSET"): the verifier walks a dead branch, uses the condition to refine
ranges, and ends up with inconsistent ranges. In this case, the dead
branch is when we fallthrough on both jumps. The new refinement logic
improves the bounds such that the second jump is properly detected as
always-taken and the verifier doesn't end up walking a dead branch.

The second and third tests are inspired by the first, but rely on
condition jumps to prepare the bounds instead of ALU instructions. An
R10 write is used to trigger a verifier error when the bounds can't be
refined.

Link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf [1]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/a0e17b00dab8dabcfa6f8384e7e151186efedfdd.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:12 -07:00
Paul Chaignon
da653de268 selftests/bpf: Update reg_bound range refinement logic
This patch updates the range refinement logic in the reg_bound test to
match the new logic from the previous commit. Without this change, tests
would fail because we end with more precise ranges than the tests
expect.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/b7f6b1fbe03373cca4e1bb6a113035a6cd2b3ff7.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-28 10:02:12 -07:00
Oliver Upton
18ec25dd0e KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list
VDISR_EL2 and VSESR_EL2 are now visible to userspace for nested VMs. Add
them to get-reg-list.

Link: https://lore.kernel.org/r/20250728152603.2823699-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:28:05 -07:00
Oliver Upton
0d46e324c0 Merge branch 'kvm-arm64/vgic-v4-ctl' into kvmarm/next
* kvm-arm64/vgic-v4-ctl:
  : Userspace control of nASSGIcap, courtesy of Raghavendra Rao Ananta
  :
  : Allow userspace to decide if support for SGIs without an active state is
  : advertised to the guest, allowing VMs from GICv3-only hardware to be
  : migrated to to GICv4.1 capable machines.
  Documentation: KVM: arm64: Describe VGICv3 registers writable pre-init
  KVM: arm64: selftests: Add test for nASSGIcap attribute
  KVM: arm64: vgic-v3: Allow userspace to write GICD_TYPER2.nASSGIcap
  KVM: arm64: vgic-v3: Allow access to GICD_IIDR prior to initialization
  KVM: arm64: vgic-v3: Consolidate MAINT_IRQ handling
  KVM: arm64: Disambiguate support for vSGIs v. vLPIs

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:11:38 -07:00
Oliver Upton
a7f49a9bf4 Merge branch 'kvm-arm64/el2-reg-visibility' into kvmarm/next
* kvm-arm64/el2-reg-visibility:
  : Fixes to EL2 register visibility, courtesy of Marc Zyngier
  :
  :  - Expose EL2 VGICv3 registers via the VGIC attributes accessor, not the
  :    KVM_{GET,SET}_ONE_REG ioctls
  :
  :  - Condition visibility of FGT registers on the presence of FEAT_FGT in
  :    the VM
  KVM: arm64: selftest: vgic-v3: Add basic GICv3 sysreg userspace access test
  KVM: arm64: Enforce the sorting of the GICv3 system register table
  KVM: arm64: Clarify the check for reset callback in check_sysreg_table()
  KVM: arm64: vgic-v3: Fix ordering of ICH_HCR_EL2
  KVM: arm64: Document registers exposed via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: selftests: get-reg-list: Add base EL2 registers
  KVM: arm64: selftests: get-reg-list: Simplify feature dependency
  KVM: arm64: Advertise FGT2 registers to userspace
  KVM: arm64: Condition FGT registers on feature availability
  KVM: arm64: Expose GICv3 EL2 registers via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: Let GICv3 save/restore honor visibility attribute
  KVM: arm64: Define helper for ICH_VTR_EL2
  KVM: arm64: Define constant value for ICC_SRE_EL2
  KVM: arm64: Don't advertise ICH_*_EL2 registers through GET_ONE_REG
  KVM: arm64: Make RVBAR_EL2 accesses UNDEF

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:06:38 -07:00
Puranjay Mohan
cf2a6de32c powerpc64/bpf: Add jit support for load_acquire and store_release
Add JIT support for the load_acquire and store_release instructions. The
implementation is similar to the kernel where:

        load_acquire  => plain load -> lwsync
        store_release => lwsync -> plain store

To test the correctness of the implementation, following selftests were
run:

  [fedora@linux-kernel bpf]$ sudo ./test_progs -a \
  verifier_load_acquire,verifier_store_release,atomics
  #11/1    atomics/add:OK
  #11/2    atomics/sub:OK
  #11/3    atomics/and:OK
  #11/4    atomics/or:OK
  #11/5    atomics/xor:OK
  #11/6    atomics/cmpxchg:OK
  #11/7    atomics/xchg:OK
  #11      atomics:OK
  #519/1   verifier_load_acquire/load-acquire, 8-bit:OK
  #519/2   verifier_load_acquire/load-acquire, 8-bit @unpriv:OK
  #519/3   verifier_load_acquire/load-acquire, 16-bit:OK
  #519/4   verifier_load_acquire/load-acquire, 16-bit @unpriv:OK
  #519/5   verifier_load_acquire/load-acquire, 32-bit:OK
  #519/6   verifier_load_acquire/load-acquire, 32-bit @unpriv:OK
  #519/7   verifier_load_acquire/load-acquire, 64-bit:OK
  #519/8   verifier_load_acquire/load-acquire, 64-bit @unpriv:OK
  #519/9   verifier_load_acquire/load-acquire with uninitialized
  src_reg:OK
  #519/10  verifier_load_acquire/load-acquire with uninitialized src_reg
  @unpriv:OK
  #519/11  verifier_load_acquire/load-acquire with non-pointer src_reg:OK
  #519/12  verifier_load_acquire/load-acquire with non-pointer src_reg
  @unpriv:OK
  #519/13  verifier_load_acquire/misaligned load-acquire:OK
  #519/14  verifier_load_acquire/misaligned load-acquire @unpriv:OK
  #519/15  verifier_load_acquire/load-acquire from ctx pointer:OK
  #519/16  verifier_load_acquire/load-acquire from ctx pointer @unpriv:OK
  #519/17  verifier_load_acquire/load-acquire with invalid register R15:OK
  #519/18  verifier_load_acquire/load-acquire with invalid register R15
  @unpriv:OK
  #519/19  verifier_load_acquire/load-acquire from pkt pointer:OK
  #519/20  verifier_load_acquire/load-acquire from flow_keys pointer:OK
  #519/21  verifier_load_acquire/load-acquire from sock pointer:OK
  #519     verifier_load_acquire:OK
  #556/1   verifier_store_release/store-release, 8-bit:OK
  #556/2   verifier_store_release/store-release, 8-bit @unpriv:OK
  #556/3   verifier_store_release/store-release, 16-bit:OK
  #556/4   verifier_store_release/store-release, 16-bit @unpriv:OK
  #556/5   verifier_store_release/store-release, 32-bit:OK
  #556/6   verifier_store_release/store-release, 32-bit @unpriv:OK
  #556/7   verifier_store_release/store-release, 64-bit:OK
  #556/8   verifier_store_release/store-release, 64-bit @unpriv:OK
  #556/9   verifier_store_release/store-release with uninitialized
  src_reg:OK
  #556/10  verifier_store_release/store-release with uninitialized src_reg
  @unpriv:OK
  #556/11  verifier_store_release/store-release with uninitialized
  dst_reg:OK
  #556/12  verifier_store_release/store-release with uninitialized dst_reg
  @unpriv:OK
  #556/13  verifier_store_release/store-release with non-pointer
  dst_reg:OK
  #556/14  verifier_store_release/store-release with non-pointer dst_reg
  @unpriv:OK
  #556/15  verifier_store_release/misaligned store-release:OK
  #556/16  verifier_store_release/misaligned store-release @unpriv:OK
  #556/17  verifier_store_release/store-release to ctx pointer:OK
  #556/18  verifier_store_release/store-release to ctx pointer @unpriv:OK
  #556/19  verifier_store_release/store-release, leak pointer to stack:OK
  #556/20  verifier_store_release/store-release, leak pointer to stack
  @unpriv:OK
  #556/21  verifier_store_release/store-release, leak pointer to map:OK
  #556/22  verifier_store_release/store-release, leak pointer to map
  @unpriv:OK
  #556/23  verifier_store_release/store-release with invalid register
  R15:OK
  #556/24  verifier_store_release/store-release with invalid register R15
  @unpriv:OK
  #556/25  verifier_store_release/store-release to pkt pointer:OK
  #556/26  verifier_store_release/store-release to flow_keys pointer:OK
  #556/27  verifier_store_release/store-release to sock pointer:OK
  #556     verifier_store_release:OK
  Summary: 3/55 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250717202935.29018-2-puranjay@kernel.org
2025-07-28 08:13:35 +05:30
Enze Li
511914506d selftests/damon: introduce _common.sh to host shared function
The current test scripts contain duplicated root permission checks in
multiple locations.  This patch consolidates these checks into _common.sh
to eliminate code redundancy.

Link: https://lkml.kernel.org/r/20250718064217.299300-1-lienze@kylinos.cn
Signed-off-by: Enze Li <lienze@kylinos.cn>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park
da5973a0b8 selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
sysfs.py is testing if non-default additional parameters can be committed.
Add a test case for further reducing the parameters to the default set.

Link: https://lkml.kernel.org/r/20250720171652.92309-23-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park
62b7b1ffa2 selftests/damon/sysfs.py: test non-default parameters runtime commit
sysfs.py is testing only the default and minimum DAMON parameters.  Add
another test case for more non-default additional DAMON parameters
commitment on runtime.

Link: https://lkml.kernel.org/r/20250720171652.92309-22-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park
16797a55aa selftests/damon/sysfs.py: generalize DAMON context commit assertion
DAMON context commitment assertion is hard-coded for a specific test case.
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-21-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:21 -07:00
SeongJae Park
a4027b5f24 selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
DAMON monitoring attributes commitment assertion is hard-coded for a
specific test case.  Split it out into a general version that can be
reused for different test cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-20-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park
771d7754ab selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
DAMOS schemes commitment assertion is hard-coded for a specific test case.
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-19-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park
53f800581f selftests/damon/sysfs.py: test DAMOS filters commitment
Current DAMOS scheme commitment assertion is not testing DAMOS filters. 
Add the test.

Link: https://lkml.kernel.org/r/20250720171652.92309-18-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park
f22ff7b5a5 selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
DAMOS scheme commitment assertion is hard-coded for a specific test case. 
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-17-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park
bd0487a774 selftests/damon/sysfs.py: test DAMOS destinations commitment
Current DAMOS commitment assertion is not testing quota destinations
commitment.  Add the test.

Link: https://lkml.kernel.org/r/20250720171652.92309-16-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:20 -07:00
SeongJae Park
84dc442bd5 selftests/damon/sysfs.py: test quota goal commitment
Current DAMOS quota commitment assertion is not testing quota goal
commitment.  Add the test.

Link: https://lkml.kernel.org/r/20250720171652.92309-15-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park
f797e709f7 selftests/damon/sysfs.py: generalize DamosQuota commit assertion
DamosQuota commitment assertion is hard-coded for a specific test case. 
Split it out into a general version that can be reused for different test
cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park
b50c48de61 selftests/damon/sysfs.py: generalize DAMOS Watermarks commit assertion
DamosWatermarks commitment assertion is hard-coded for a specific test
case.  Split it out into a general version that can be reused for
different test cases.

Link: https://lkml.kernel.org/r/20250720171652.92309-13-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park
a1d52cd030 selftests/damon/drgn_dump_damon_status: dump DAMOS filters
drgn_dump_damon_status.py is a script for dumping DAMON internal status in
json format.  It is being used for seeing if DAMON parameters that are set
using _damon_sysfs.py are actually passed to DAMON in the kernel space. 
It is, however, not dumping full DAMON internal status, and it makes
increasing test coverage difficult.  Add damos filters dumping for more
tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:19 -07:00
SeongJae Park
eb413daaf2 selftests/damon/drgn_dump_damon_status: dump ctx->ops.id
drgn_dump_damon_status.py is a script for dumping DAMON internal status in
json format.  It is being used for seeing if DAMON parameters that are set
using _damon_sysfs.py are actually passed to DAMON in the kernel space. 
It is, however, not dumping full DAMON internal status, and it makes
increasing test coverage difficult.  Add ctx->ops.id dumping for more
tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00
SeongJae Park
c1a6958957 selftests/damon/drgn_dump_damon_status: dump damos->migrate_dests
drgn_dump_damon_status.py is a script for dumping DAMON internal status in
json format.  It is being used for seeing if DAMON parameters that are set
using _damon_sysfs.py are actually passed to DAMON in the kernel space. 
It is, however, not dumping full DAMON internal status, and it makes
increasing test coverage difficult.  Add damos->migrate_dests dumping for
more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00
SeongJae Park
80d4e38107 selftests/damon/_damon_sysfs: use 2**32 - 1 as max nr_accesses and age
nr_accesses and age are unsigned int.  Use the proper max value.

Link: https://lkml.kernel.org/r/20250720171652.92309-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00
SeongJae Park
86e541f0be selftests/damon/_damon_sysfs: support DAMOS target_nid setup
_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of DAMOS action destination target_nid setup for
more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:18 -07:00
SeongJae Park
fca6ddf44d selftests/damon/_damon_sysfs: support DAMOS action dests setup
_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of DAMOS action destinations setup for more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:17 -07:00
SeongJae Park
229b0af664 selftests/damon/_damon_sysfs: support DAMOS quota goal nid setup
_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of DAMOS quota goal nid setup for more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:17 -07:00
SeongJae Park
ff5aae307b selftests/damon/_damon_sysfs: support DAMOS quota weights setup
_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of DAMOS quotas prioritization weights setup for
more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:17 -07:00
SeongJae Park
b436dfaad2 selftests/damon/_damon_sysfs: support monitoring intervals goal setup
_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of the monitoring intervals auto-tune goal setup for
more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:17 -07:00
SeongJae Park
9d93c103ed selftests/damon/_damon_sysfs: support DAMOS filters setup
_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of DAMOS filters setup for more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:16 -07:00
SeongJae Park
6da5e2961f selftests/damon/_damon_sysfs: support DAMOS watermarks setup
Patch series "selftests/damon/sysfs.py: test all parameters".

sysfs.py tests if DAMON sysfs interface is passing the user-requested
parameters to DAMON as expected.  But only the default (minimum)
parameters are being tested.  This is partially because _damon_sysfs.py,
which is the library for making the parameter requests, is not supporting
the entire parameters.  The internal DAMON status dump script
(drgn_dump_damon_status.py) is also not dumping entire parameters.  Extend
the test coverage by updating parameters input and status dumping scripts
to support all parameters, and writing additional tests using those.

This increased test coverage actually found one real bug
(https://lore.kernel.org/20250719181932.72944-1-sj@kernel.org).

First seven patches (1-7) extend _damon_sysfs.py for all parameters setup.
The eight patch (8) fixes _damon_sysfs.py to use correct max nr_acceses
and age values for their type.  Following three patches (9-11) extend
drgn_dump_damon_status.py to dump full DAMON parameters.  Following nine
patches (12-20) refactor sysfs.py for general testing code reuse, and
extend it for full parameters check.  Finally, two patches (21 and 22) add
test cases in sysfs.py for full parameters testing.


This patch (of 22):

_damon_sysfs.py contains code for test-purpose DAMON sysfs interface
control.  Add support of DAMOS watermarks setup for more tests.

Link: https://lkml.kernel.org/r/20250720171652.92309-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250720171652.92309-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:16 -07:00
SeongJae Park
cf20cb9ad1 selftests/damon/sysfs.py: stop DAMON for dumping failures
Commit 4ece018976 ("selftests/damon: add python and drgn-based DAMON
sysfs test") in mm-stable tree introduced sysfs.py that runs drgn for
dumping DAMON status.  When the DAMON status dumping fails for reasons
including drgn uninstalled environment, the test fails without stopping
DAMON.  Following DAMON selftests that assumes DAMON is not running when
they executed therefore fail.  Catch dumping failures and stop DAMON for
that case.

Link: https://lkml.kernel.org/r/20250722060330.56068-1-sj@kernel.org
Fixes: 4ece018976 ("selftests/damon: add python and drgn-based DAMON sysfs test")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507220707.9c5d6247-lkp@intel.com
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-26 15:08:16 -07:00
Puranjay Mohan
e9f545d0d3 selftests/bpf: Enable private stack tests for arm64
As arm64 JIT now supports private stack, make sure all relevant tests
run on arm64 architecture.

Relevant tests:

 #415/1   struct_ops_private_stack/private_stack:OK
 #415/2   struct_ops_private_stack/private_stack_fail:OK
 #415/3   struct_ops_private_stack/private_stack_recur:OK
 #415     struct_ops_private_stack:OK
 #549/1   verifier_private_stack/Private stack, single prog:OK
 #549/2   verifier_private_stack/Private stack, subtree > MAX_BPF_STACK:OK
 #549/3   verifier_private_stack/No private stack:OK
 #549/4   verifier_private_stack/Private stack, callback:OK
 #549/5   verifier_private_stack/Private stack, exception in mainprog:OK
 #549/6   verifier_private_stack/Private stack, exception in subprog:OK
 #549/7   verifier_private_stack/Private stack, async callback, not nested:OK
 #549/8   verifier_private_stack/Private stack, async callback, potential nesting:OK
 #549     verifier_private_stack:OK
 Summary: 2/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20250724120257.7299-4-puranjay@kernel.org
2025-07-26 21:27:15 +02:00
Jakub Kicinski
38b74b212a selftests: bpf: fix legacy netfilter options
Recent commit to add NETFILTER_XTABLES_LEGACY missed setting
a couple of configs to y. They are still enabled but as modules
which appears to have upset BPF CI, e.g.:

   test_bpf_nf_ct:FAIL:iptables-legacy -t raw -A PREROUTING -j CONNMARK --set-mark 42/0 unexpected error: 768 (errno 0)

Fixes: 3c3ab65f00 ("selftests: net: Enable legacy netfilter legacy options.")
Link: https://patch.msgid.link/20250726155349.1161845-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 12:05:09 -07:00
Jakub Kicinski
c58c18be88 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in late fixes to prepare for the 6.17 net-next PR.

Conflicts:

net/core/neighbour.c
  1bbb76a899 ("neighbour: Fix null-ptr-deref in neigh_flush_dev().")
  13a936bb99 ("neighbour: Protect tbl->phash_buckets[] with a dedicated mutex.")
  03dc03fa04 ("neighbor: Add NTF_EXT_VALIDATED flag for externally validated entries")

Adjacent changes:

drivers/net/usb/usbnet.c
  0d9cfc9b8c ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event")
  2c04d279e8 ("net: usb: Convert tasklet API to new bottom half workqueue mechanism")

net/ipv6/route.c
  31d7d67ba1 ("ipv6: annotate data-races around rt->fib6_nsiblings")
  1caf272972 ("ipv6: adopt dst_dev() helper")
  3b3ccf9ed0 ("net: Remove unnecessary NULL check for lwtunnel_fill_encap()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 11:49:45 -07:00
Xiumei Mu
5b32321fda selftests: rtnetlink.sh: remove esp4_offload after test
The esp4_offload module, loaded during IPsec offload tests, should
be reset to its default settings after testing.
Otherwise, leaving it enabled could unintentionally affect subsequence
test cases by keeping offload active.

Without this fix:
$ lsmod | grep offload; ./rtnetlink.sh -t kci_test_ipsec_offload ; lsmod | grep offload;
PASS: ipsec_offload
esp4_offload           12288  0
esp4                   32768  1 esp4_offload

With this fix:
$ lsmod | grep offload; ./rtnetlink.sh -t kci_test_ipsec_offload ; lsmod | grep offload;
PASS: ipsec_offload

Fixes: 2766a11161 ("selftests: rtnetlink: add ipsec offload API test")
Signed-off-by: Xiumei Mu <xmu@redhat.com>
Reviewed-by: Shannon Nelson <sln@onemain.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/6d3a1d777c4de4eb0ca94ced9e77be8d48c5b12f.1753415428.git.xmu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-26 11:26:47 -07:00
Raghavendra Rao Ananta
15b5964a41 KVM: arm64: selftests: Add test for nASSGIcap attribute
Extend vgic_init to test the nASSGIcap attribute, asserting that it is
configurable (within reason) prior to initializing the VGIC.
Additionally, check that userspace cannot set the attribute after the
VGIC has been initialized.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250724062805.2658919-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-26 08:45:52 -07:00