Commit Graph

1426929 Commits

Author SHA1 Message Date
Michael Guralnik
d2ea675e86 RDMA/core: Add netlink command to modify FRMR aging
Allow users to set the FRMR pools aging timer through netlink.
This lets users control how long handles reside in the kernel before
being destroyed, and so tune the tradeoff between memory and HW object
consumption on one side and memory registration optimization on the other.
Since FRMR pools are highly beneficial for application restart scenarios,
this command allows users to match the aging timer to their application
restart time, making sure the FRMR handles deregistered on application
teardown are kept in the pools long enough to be reused at application
startup.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-9-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:45:37 -05:00
Michael Guralnik
50c035976a RDMA/nldev: Add command to get FRMR pools
Add a new netlink command that dumps the state of the FRMR pools on the
devices to the user.
Expose each pool with its key and its usage statistics.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-8-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:45:34 -05:00
Michael Guralnik
ba51cf9fcf net/mlx5: Drop MR cache related code
Following the move of mlx5_ib to FRMR pools, drop all the now-unused MR
cache code.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-7-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:45:30 -05:00
Michael Guralnik
36680ef7bc RDMA/mlx5: Switch from MR cache to FRMR pools
Use the new generic FRMR pools mechanism to optimize the performance of
memory registrations.
The move to the generic FRMR pools allows users who currently configure
the MR cache through debugfs to use the netlink API for FRMR pools, which
is added later in this series. This gives more flexibility in configuring
the kernel and also makes configuration possible on machines where debugfs
is not available.

mlx5_ib will save the mkey index as the handle in FRMR pools, same as the
MR cache implementation.
Upon each memory registration mlx5_ib will try to pull a handle from FRMR
pools and upon each deregistration it will push the handle back to its
appropriate pool.

Use the vendor key field in umr pool key to save the access mode of the
mkey.

Use the option for kernel-only FRMR pool to manage the mkeys used for
registration with DMAH, as the translation between the UAPI of DMAH and
the mkey property of st_index is non-trivial and changes dynamically.
Since the value for no PH is 0xff and not zero, switch between them in
the frmr_key to have a zeroed kernel_vendor_key when not using DMAH.

Remove the limitation we had with MR cache for mkeys up to 2^20 dma
blocks and support mkeys up to HW limitations according to caps.

Remove all MR cache related code.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-6-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:45:19 -05:00
Michael Guralnik
020d189d16 RDMA/core: Add pinned handles to FRMR pools
Add a configuration of pinned handles on a specific FRMR pool.
The configured number of pinned handles will not be aged and will stay
available for users to claim.

Upon setting the number of pinned handles for an FRMR pool, we will make
sure we have at least the pinned number of handles associated with the
pool and create more, if necessary.
The count for pinned handles takes into account handles that are used by
user MRs and handles in the queue.

Introduce a new FRMR operation, build_key, that allows drivers to
manipulate FRMR keys supplied by the user: a driver can fail the request
for unsupported properties and mask properties that are modifiable.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-5-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:45:13 -05:00
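The pinned-handle accounting described in 020d189d16 can be illustrated with a small user-space sketch. This is not the kernel implementation: the struct, field, and function names below are hypothetical, handle creation is reduced to a counter update, and the rule for how many handles aging may still destroy is one plausible reading of the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-pool counters; not the kernel's data structure. */
struct pinned_pool_sketch {
	size_t in_use;  /* handles currently held by user MRs */
	size_t queued;  /* handles sitting in the pool's queues */
	size_t pinned;  /* configured number of pinned handles */
};

/*
 * Set the pinned count and return how many handles had to be created so
 * that in_use + queued reaches it.
 */
size_t pool_set_pinned(struct pinned_pool_sketch *pool, size_t pinned)
{
	size_t have = pool->in_use + pool->queued;
	size_t missing = pinned > have ? pinned - have : 0;

	pool->pinned = pinned;
	pool->queued += missing; /* stand-in for creating new handles */
	assert(pool->in_use + pool->queued >= pool->pinned);
	return missing;
}

/* Assumed rule: aging may only destroy queued handles above the pinned count. */
size_t pool_ageable(const struct pinned_pool_sketch *pool)
{
	size_t total = pool->in_use + pool->queued;
	size_t excess = total > pool->pinned ? total - pool->pinned : 0;

	return excess < pool->queued ? excess : pool->queued;
}
```

With two handles in use and one queued, pinning five handles creates two more, after which nothing in the pool is eligible for aging.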
Michael Guralnik
304725adec RDMA/core: Add FRMR pools statistics
For each pool, count the number of FRMR handles popped and held by user
MRs.
Also keep track of the max value of this counter.

The next patches will expose the statistics through netlink.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-4-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:45:10 -05:00
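The two statistics from 304725adec amount to a gauge plus its high-water mark. A minimal sketch, assuming hypothetical names and ignoring the locking a kernel implementation would need:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-pool statistics; names are illustrative only. */
struct pool_stats_sketch {
	size_t in_use;      /* handles popped and currently held by user MRs */
	size_t max_in_use;  /* highest value in_use has reached */
};

/* Called when a handle is popped for a memory registration. */
void stats_on_pop(struct pool_stats_sketch *s)
{
	s->in_use++;
	if (s->in_use > s->max_in_use)
		s->max_in_use = s->in_use;
}

/* Called when a handle is pushed back on deregistration. */
void stats_on_push(struct pool_stats_sketch *s)
{
	assert(s->in_use > 0); /* a push must match an earlier pop */
	s->in_use--;
}
```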
Michael Guralnik
84cb1dd06f RDMA/core: Add aging to FRMR pools
Add an aging mechanism to the handles of FRMR pools.
Keep the handles stored in FRMR pools for at least 1 minute for
applications to reuse, and destroy all handles that were not reused.

Add a new queue to each pool to accomplish that.
Upon aging trigger, destroy all FRMR handles from the new 'inactive'
queue and move all handles from the 'active' queue to the 'inactive' queue.
This ensures all destroyed handles were not reused for at least one aging
time period and were not held longer than 2 aging time periods.
Handles from the inactive queue will be popped only if the active queue is
empty.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-3-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:45:05 -05:00
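The two-queue aging scheme described in 84cb1dd06f can be sketched in user space as follows. This is only an illustration under stated assumptions: the fixed-size arrays, the MAX_HANDLES capacity, and all names are hypothetical, and destroying a handle is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 16 /* hypothetical capacity, for the sketch only */

/* A fixed-size LIFO of opaque handles (stand-in for a kernel queue). */
struct handle_queue {
	unsigned int handles[MAX_HANDLES];
	size_t count;
};

struct frmr_pool_sketch {
	struct handle_queue active;   /* handles pushed since the last aging run */
	struct handle_queue inactive; /* handles that survived one aging run */
	size_t destroyed;             /* handles destroyed by aging so far */
};

/* Deregistration: the handle goes back to the active queue. */
int pool_push(struct frmr_pool_sketch *pool, unsigned int handle)
{
	if (pool->active.count == MAX_HANDLES)
		return -1;
	pool->active.handles[pool->active.count++] = handle;
	return 0;
}

/* Registration: prefer the active queue, fall back to the inactive one. */
int pool_pop(struct frmr_pool_sketch *pool, unsigned int *handle)
{
	struct handle_queue *q = pool->active.count ? &pool->active
						    : &pool->inactive;

	if (!q->count)
		return -1; /* empty: a real driver would create a new handle */
	*handle = q->handles[--q->count];
	return 0;
}

/* Aging trigger: destroy the inactive queue, then demote the active queue. */
void pool_age(struct frmr_pool_sketch *pool)
{
	assert(pool->inactive.count <= MAX_HANDLES);
	pool->destroyed += pool->inactive.count; /* stand-in for HW destroy */
	pool->inactive = pool->active;
	pool->active.count = 0;
}
```

A handle pushed just after one aging run is demoted by the next run and destroyed by the one after that, which is where the bound of two aging periods comes from.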
Michael Guralnik
ce5df0b891 IB/core: Introduce FRMR pools
Add a generic Fast Registration Memory Region pools mechanism to allow
drivers to optimize memory registration performance.
Drivers that have the ability to reuse MRs or their underlying HW
objects can take advantage of the mechanism to keep a 'handle' for those
objects and use them upon user request.
We assume that to achieve this goal a driver and its HW should implement
a modify operation for the MRs that is able to at least clear and set the
MRs and in more advanced implementations also support changing a subset
of the MR properties.

The mechanism is built using an RB-tree of pools. Each pool represents a
set of MR properties that are shared by all of the MRs residing in the
pool and are unmodifiable by the vendor driver or HW.

The exposed API from ib_core to the driver has 4 operations:
Init and cleanup - handle data structs and locks for the pools.
Push and pop - store and retrieve a 'handle' for a memory registration
or deregistration request.

The FRMR pools mechanism implements the logic to search the RB-tree for
a pool with matching properties and create a new one when needed. It
requires the driver to implement creation and destruction of a 'handle'
when a pool is empty or a handle is requested or is being destroyed.

A later patch will introduce a netlink API to interact with the FRMR pools
mechanism to allow users to both configure and track its usage.
A vendor wishing to configure an FRMR pool without exposing it or without
exposing internal MR properties to users should use the
kernel_vendor_key field in the pool's key. This can be useful in a few
cases, e.g., when the FRMR handle has a vendor-specific unmodifiable
property that the user registering the memory might not be aware of.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-2-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:44:58 -05:00
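The pool lookup and the push/pop operations described in ce5df0b891 can be sketched in user space. The sketch makes several assumptions: a plain unbalanced binary search tree stands in for the kernel RB-tree, the key fields, POOL_CAPACITY, and every function name are hypothetical, and the driver's create operation is replaced by a counter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_CAPACITY 8 /* hypothetical per-pool limit */

/* Hypothetical key: the unmodifiable MR properties shared by a pool.
 * It has no padding, so memcmp() gives a consistent ordering. */
struct pool_key {
	unsigned int access_flags;
	unsigned int num_dma_blocks;
};

struct pool_node {
	struct pool_key key;
	unsigned int handles[POOL_CAPACITY]; /* cached handles, LIFO */
	size_t count;
	struct pool_node *left, *right;
};

/* Driver callback that creates a fresh handle when a pool is empty. */
typedef unsigned int (*create_handle_fn)(const struct pool_key *key);

static unsigned int next_handle = 100;

/* Stand-in for a driver's create operation. */
unsigned int demo_create_handle(const struct pool_key *key)
{
	(void)key;
	return next_handle++;
}

/* Find the pool matching @key, creating it on first use. */
struct pool_node *pool_lookup(struct pool_node **root, const struct pool_key *key)
{
	while (*root) {
		int cmp = memcmp(key, &(*root)->key, sizeof(*key));

		if (!cmp)
			return *root;
		root = cmp < 0 ? &(*root)->left : &(*root)->right;
	}
	*root = calloc(1, sizeof(**root));
	if (*root)
		(*root)->key = *key;
	return *root;
}

/* Pop: reuse a cached handle, or ask the driver for a new one. */
int frmr_pop(struct pool_node **root, const struct pool_key *key,
	     create_handle_fn create, unsigned int *handle)
{
	struct pool_node *pool = pool_lookup(root, key);

	if (!pool)
		return -1;
	*handle = pool->count ? pool->handles[--pool->count] : create(key);
	return 0;
}

/* Push: return a handle to the pool that matches its properties. */
int frmr_push(struct pool_node **root, const struct pool_key *key,
	      unsigned int handle)
{
	struct pool_node *pool = pool_lookup(root, key);

	if (!pool || pool->count == POOL_CAPACITY)
		return -1; /* a real implementation would destroy the handle */
	pool->handles[pool->count++] = handle;
	assert(pool->count <= POOL_CAPACITY);
	return 0;
}
```

Keeping one pool per distinct key means a popped handle already has the requested unmodifiable properties, so only the modifiable ones need to be updated on reuse.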
Chiara Meiohas
bc0ad1a17c RDMA/mlx5: Move device async_ctx initialization
Move the async_ctx initialization from mlx5_mkey_cache_init() to
mlx5_ib_stage_init_init() since the async_ctx is used by both the MR
cache and DEVX.

Also add the corresponding cleanup in mlx5_ib_stage_init_cleanup() to
properly release the async_ctx resources.

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-1-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-02 13:44:37 -05:00
Leon Romanovsky
94ff7c59cd RDMA: Complete k[z|m|c]alloc-to-k[z|m]alloc_obj conversion
Commits bf4afc53b7 ("Convert 'alloc_obj' family to use the new default
GFP_KERNEL argument") and 69050f8d6d ("treewide: Replace kmalloc with
kmalloc_obj for non-scalar types") updated various k[z|m|c]alloc calls to their
k[z|m]alloc_obj counterparts.

This commit finalizes that transition within the RDMA subsystem.

Link: https://patch.msgid.link/20260226-complete-alloc-conversion-v1-1-ebf1df1c2518@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-03-02 05:41:24 -05:00
Leon Romanovsky
58409f0d4d RDMA/mlx4: Remove unused create_flags field from CQ structure
The CQ creation flags do not need to be cached, as they are used
immediately at the point where they are stored. Remove the unused
field and reclaim 4 bytes.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-14-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
f45f195af5 RDMA/mlx4: Introduce a modern CQ creation interface
The uverbs CQ creation UAPI allows users to supply their own umem when
creating a CQ. Update mlx4 to support this model while preserving compatibility
with the legacy interface that allocates umem internally.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-13-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
0e4b9841f4 RDMA/mlx4: Inline mlx4_ib_get_cq_umem into callers
Inline the mlx4_ib_get_cq_umem helper function into its two call sites
(mlx4_ib_create_cq and mlx4_alloc_resize_umem) to prepare for the
transition to modern CQ creation interface.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-12-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
14738dce7e RDMA/mlx5: Provide a modern CQ creation interface
The uverbs CQ creation UAPI allows users to supply their own umem for a CQ.
Update mlx5 to support this workflow while preserving support for creating
umem through the legacy interface.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-11-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
eebea464f5 RDMA/mlx5: Save 4 bytes in CQ structure
There is no need to maintain separate, nearly empty create_flags and
private_flags fields. Unifying them reduces memory usage.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-10-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
66011c1bd7 RDMA/efa: Remove check for zero CQE count
Since ib_core now handles validation, the device driver no longer needs
to verify that the CQE count is non-zero.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-9-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
a291758288 RDMA/core: Reject zero CQE count
All drivers already ensure that the number of CQEs is at least 1.
Add this validation to the core so drivers no longer need to repeat it.
Future patches converting to the .create_user_cq() interface will remove
the per-driver checks.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-8-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
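The idea behind a291758288, validating the CQE count once in the core so that drivers can drop their own checks, can be sketched as below. The struct, the function names, and the choice of -EINVAL as the error code are assumptions made for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical subset of CQ creation attributes. */
struct cq_attr_sketch {
	unsigned int cqe; /* requested number of CQ entries */
};

typedef int (*driver_create_cq_fn)(const struct cq_attr_sketch *attr);

/* Core entry point: validate once, then call into the driver. */
int core_create_cq(const struct cq_attr_sketch *attr, driver_create_cq_fn create)
{
	if (!attr->cqe)
		return -EINVAL; /* rejected before any driver code runs */
	return create(attr);
}

/* Stand-in driver op; it can now assume attr->cqe >= 1. */
int demo_driver_create_cq(const struct cq_attr_sketch *attr)
{
	assert(attr->cqe >= 1);
	return 0;
}
```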
Leon Romanovsky
584ec74748 RDMA/core: Prepare create CQ path for API unification
Ensure that .create_cq_umem() and .create_cq() follow the same API
contract, allowing drivers to be gradually migrated to the umem-aware
CQ management flow.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-7-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
2ead7b09bc RDMA/efa: Rely on CPU address in create-QP
Align this code with other locations where efa_free_mapped() depends on the
presence of a valid CPU address, which is guaranteed when qp->rq_size != 0.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-6-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
25c7410488 RDMA/core: Manage CQ umem in core code
In the current implementation, CQ umem is handled both by ib_core and
the driver. ib_core sometimes creates and destroys it, while the driver
also destroys it.

Store the umem in struct ib_cq and ensure that only ib_core manages
its lifetime, relying solely on its internal reference counter.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-5-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
a731c86265 RDMA/core: Promote UMEM to a core component
To manage UMEM objects at the core level and reuse the existing
ib_destroy_cq*() flow, move the UMEM files to be built together with
ib_core. Attempting to call ib_umem_release() from verbs.c currently
results in the following error:

    depmod: ERROR: Cycle detected: ib_core -> ib_uverbs -> ib_core
    depmod: ERROR: Found 2 modules in dependency cycles!
    verbs.c:(.text+0x250c): undefined reference to `ib_umem_release'

Link: https://patch.msgid.link/20260213-refactor-umem-v1-4-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
e3104fe921 RDMA/umem: Remove unnecessary includes and defines from ib_umem header
The ib_umem header no longer requires the removed includes or forward
declarations, so drop them to reduce clutter.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-3-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
2ae3c4f6ea RDMA/umem: Allow including ib_umem header from any location
Including ib_umem.h currently triggers circular dependency errors.
These issues can be resolved by removing the include of ib_verbs.h,
which was only needed to resolve the struct ib_device pointer.

>> depmod: ERROR: Cycle detected: ib_core -> ib_uverbs -> ib_core
>> depmod: ERROR: Found 2 modules in dependency cycles!
  make[3]: *** [scripts/Makefile.modinst:132: depmod] Error 1
  make[3]: Target '__modinst' not remade because of errors.
  make[2]: *** [Makefile:1960: modules_install] Error 2
  make[1]: *** [Makefile:248: __sub-make] Error 2
  make[1]: Target 'modules_install' not remade because of errors.
  make: *** [Makefile:248: __sub-make] Error 2
  make: Target 'modules_install' not remade because of errors.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-2-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Leon Romanovsky
6094ea64c6 RDMA: Move DMA block iterator logic into dedicated files
The DMA iterator logic was mixed into verbs and umem-specific code,
forcing all users to include rdma/ib_umem.h. Move the block iterator
logic into iter.c and rdma/iter.h so that rdma/ib_umem.h and
rdma/ib_verbs.h can be separated in a follow-up patch.

Link: https://patch.msgid.link/20260213-refactor-umem-v1-1-f3be85847922@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-02-25 08:15:30 -05:00
Yonatan Nachum
d1fc91be26 RDMA/efa: Use extended inline buff size for inline validation
On QP creation we validate that the requested max inline size is supported
by the device. Use the new extended max inline size instead of the old one
to support the actual max inline size available.

Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20260217112304.36849-4-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-25 06:22:20 -05:00
Yonatan Nachum
e736a223ab RDMA/efa: Expose new extended max inline buff size
Add a new extended max inline query and report the new value to userspace.

Reviewed-by: Firas Jahjah <firasj@amazon.com>
Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20260217112304.36849-3-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-25 06:22:20 -05:00
Yonatan Nachum
6b8d5a0cdb RDMA/efa: Rename admin queue attributes struct name for extendability
As preparation for adding a second queue attributes query, change the
name of the existing queue attributes.

Reviewed-by: Michael Margolin <mrgolin@amazon.com>
Signed-off-by: Yonatan Nachum <ynachum@amazon.com>
Link: https://patch.msgid.link/20260217112304.36849-2-ynachum@amazon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-25 06:22:20 -05:00
Randy Dunlap
2865500db9 RDMA/restrack: fix kernel-doc indicator
Use "/**" to begin kernel-doc comments. This eliminates these
kernel-doc warnings:

Warning: include/rdma/restrack.h:123 struct member 'kref' not described in
 'rdma_restrack_entry'
Warning: include/rdma/restrack.h:123 struct member 'comp' not described in
 'rdma_restrack_entry'

(not adding missing return value kernel-doc descriptions)

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260224003149.3175815-1-rdunlap@infradead.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-24 05:39:21 -05:00
Randy Dunlap
16dc2d72de RDMA/iwcm: fix some kernel-doc issues in iw_cm.h
Use the "typedef" keyword as needed.
Correct 2 function parameter names.

Warning: include/rdma/iw_cm.h:42 function parameter 'iw_cm_handler' not
 described in 'int'
Warning: include/rdma/iw_cm.h:42 expecting prototype for iw_cm_handler().
 Prototype was for int() instead
Warning: include/rdma/iw_cm.h:53 function parameter 'iw_event_handler' not
 described in 'int'
Warning: include/rdma/iw_cm.h:53 expecting prototype for
 iw_event_handler(). Prototype was for int() instead
Warning: include/rdma/iw_cm.h:104 function parameter 'cm_handler' not
 described in 'iw_create_cm_id'
Warning: include/rdma/iw_cm.h:158 function parameter 'private_data' not
 described in 'iw_cm_reject'

(not adding missing return value kernel-doc descriptions)

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260224003134.3174856-1-rdunlap@infradead.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-24 05:39:13 -05:00
Randy Dunlap
ff46d13927 RDMA/umem: fix kernel-doc warnings
Add or correct kernel-doc comments to eliminate warnings:

Warning: include/rdma/ib_umem.h:104 function parameter 'biter' not
 described in 'rdma_umem_for_each_dma_block'
Warning: include/rdma/ib_umem.h:140 function parameter 'pgsz_bitmap' not
 described in 'ib_umem_find_best_pgoff'
Warning: include/rdma/ib_umem.h:141 No description found for return
 value of 'ib_umem_find_best_pgoff'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260224003120.3173892-1-rdunlap@infradead.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-24 05:39:01 -05:00
Randy Dunlap
2ecd012774 IB/cache: avoid kernel-doc warnings
Use the correct function parameter names to eliminate kernel-doc
warnings:

Warning: include/rdma/ib_cache.h:47 function parameter 'device_handle'
 not described in 'ib_get_cached_pkey'
Warning: include/rdma/ib_cache.h:89 function parameter 'port_active'
 not described in 'ib_get_cached_port_state'

(not adding missing function return value descriptions)

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260224003106.3172916-1-rdunlap@infradead.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-24 05:38:45 -05:00
Evan Green
f3f9825837 RDMA/rxe: Generate async error for r_key violations
Table 63 of the IBTA spec lists R_Key violations as a class C
error. 9.9.3.1.3 Responder Class C Fault Behavior indicates an
affiliated asynchronous error should be generated at the responder
if the error can be associated to a QP but not a particular RX WQE.

Relevant portion of the spec:
C9-222.1.1: For an HCA responder using Reliable Connection service, for
a Class C responder side error, the error shall be reported to the
requester by generating the appropriate NAK code as specified in Table 63
Responder Error Behavior Summary on page 448. If the error can be related
to a particular QP but cannot be related to a particular WQE on that
receive queue (e.g. the error occurred while executing an RDMA Write
Request without immediate data), the error shall be reported to the
responder’s client as an Affiliated Asynchronous error. See Section
10.10.2.3 Asynchronous Errors on page 576 for details. If the error can be
related to a particular WQE on a given receive queue, the QP shall be
placed into the error state and the error shall be reported to the
responder’s client as a Completion error.

Generate an affiliated asynchronous error upon Rkey violations
if the opcode does not carry an immediate. This causes async
events at the responder for all ops that generate R_Key violations
except WRITE_WITH_IMM, where the error can ride in with the RX WQE.

Signed-off-by: Evan Green <evgreen@meta.com>
Link: https://patch.msgid.link/20260220185533.252759-1-evgreen@meta.com
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-02-24 04:56:18 -05:00
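The reporting rule from f3f9825837 reduces to a single decision on the opcode. The sketch below uses hypothetical enums and function names, not the rxe driver's opcode table. In both cases the requester still receives the NAK described in the quoted spec text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode classes; not the rxe driver's real opcode table. */
enum op_sketch {
	OP_RDMA_WRITE,
	OP_RDMA_WRITE_WITH_IMM,
	OP_RDMA_READ,
	OP_ATOMIC,
};

/* How the responder's client learns about an R_Key violation. */
enum rkey_report {
	REPORT_ASYNC_EVENT,      /* affiliated asynchronous error on the QP */
	REPORT_COMPLETION_ERROR, /* error completion on the RX WQE */
};

/* Of the ops above, only WRITE_WITH_IMM consumes a receive WQE. */
bool op_has_immediate(enum op_sketch op)
{
	return op == OP_RDMA_WRITE_WITH_IMM;
}

enum rkey_report rkey_violation_report(enum op_sketch op)
{
	assert(op <= OP_ATOMIC); /* reject values outside the sketch's set */
	return op_has_immediate(op) ? REPORT_COMPLETION_ERROR
				    : REPORT_ASYNC_EVENT;
}
```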
Linus Torvalds
6de23f81a5 Linux 7.0-rc1 (tag: v7.0-rc1)
2026-02-22 13:18:59 -08:00
Linus Torvalds
fbf3380361 Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull fsverity fixes from Eric Biggers:

 - Fix a build error on parisc

 - Remove the non-large-folio-aware function fsverity_verify_page()

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
  fsverity: fix build error by adding fsverity_readahead() stub
  fsverity: remove fsverity_verify_page()
  f2fs: make f2fs_verify_cluster() partially large-folio-aware
  f2fs: remove unnecessary ClearPageUptodate in f2fs_verify_cluster()
2026-02-22 13:12:04 -08:00
Linus Torvalds
75e1f66a9e Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
 "Fix a big endian specific issue in the PPC64-optimized AES code"

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: powerpc/aes: Fix rndkey_from_vsx() on big endian CPUs
2026-02-22 13:09:33 -08:00
Mark Brown
aaf96df959 CREDITS: Add -next to Stephen Rothwell's entry
Stephen retired and stepped back from -next maintainership. Update his
entry in CREDITS to recognise his 18 years of hard work making it what
it is today and all the impact it's had on our development process.

Also update to his current GnuPG key while we're here.

Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 12:11:33 -08:00
Arnd Bergmann
746b9ef5d5 x509: select CONFIG_CRYPTO_LIB_SHA256
The x509 public key code gained a dependency on the sha256 hash
implementation, causing a rare link time failure in randconfig
builds:

  arm-linux-gnueabi-ld: crypto/asymmetric_keys/x509_public_key.o: in function `x509_get_sig_params':
  x509_public_key.c:(.text.x509_get_sig_params+0x12): undefined reference to `sha256'
  arm-linux-gnueabi-ld: (sha256): Unknown destination type (ARM/Thumb) in crypto/asymmetric_keys/x509_public_key.o
  x509_public_key.c:(.text.x509_get_sig_params+0x12): dangerous relocation: unsupported relocation

Select the necessary library code from Kconfig.

Fixes: 2c62068ac8 ("x509: Separately calculate sha256 for blacklist")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 12:09:23 -08:00
Haiyue Wang
fd1d6b9d13 xz: fix arm fdt compile error for kmalloc replacement
Align with commit bf4afc53b7 ("Convert 'alloc_obj' family to use the
new default GFP_KERNEL argument") and update the 'kmalloc_obj' declaration
for userspace to fix the compile error below:

  In file included from arch/arm/boot/compressed/../../../../lib/decompress_unxz.c:241,
                   from arch/arm/boot/compressed/decompress.c:56:
  arch/arm/boot/compressed/../../../../lib/xz/xz_dec_stream.c: In function 'xz_dec_init':
  arch/arm/boot/compressed/../../../../lib/xz/xz_dec_stream.c:787:28: error: implicit declaration of function 'kmalloc_obj'; did you mean 'kmalloc'? [-Wimplicit-function-declaration]
     787 |         struct xz_dec *s = kmalloc_obj(*s);
         |                            ^~~~~~~~~~~
         |                            kmalloc

Signed-off-by: Haiyue Wang <haiyuewa@163.com>
Fixes: 69050f8d6d ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types")
Fixes: bf4afc53b7 ("Convert 'alloc_obj' family to use the new default GFP_KERNEL argument")
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 12:05:31 -08:00
Linus Torvalds
5f2eac7767 Merge tag 'rtc-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:

 - loongson: Loongson-2K0300 support

 - s35390a: nvmem support

 - zynqmp: rework calibration

* tag 'rtc-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: ds1390: fix number of bytes read from RTC
  rtc: class: Remove duplicate check for alarm
  rtc: optee: simplify OP-TEE context match
  rtc: interface: Alarm race handling should not discard preceding error
  rtc: s35390a: implement nvmem support
  rtc: loongson: Add Loongson-2K0300 support
  dt-bindings: rtc: loongson: Document Loongson-2K0300 compatible
  dt-bindings: rtc: loongson: Correct Loongson-1C interrupts property
  dt-bindings: rtc: renesas,rz-rtca3: Add RZ/V2N support
  dt-bindings: rtc: cpcap: convert to schema
  rtc: zynqmp: use dynamic max and min offset ranges
  rtc: zynqmp: rework set_offset
  rtc: zynqmp: rework read_offset
  rtc: zynqmp: check calibration max value
  rtc: zynqmp: correct frequency value
  rtc: amlogic-a4: Remove IRQF_ONESHOT
  rtc: pcf8563: use correct of_node for output clock
  rtc: max31335: use correct CONFIG symbol in IS_REACHABLE()
  rtc: nvvrs: Add ARCH_TEGRA to the NV VRS RTC driver
2026-02-22 09:43:11 -08:00
Linus Torvalds
1dd419145d Merge tag 'rust-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull rust fixes from Miguel Ojeda:
 "Toolchain and infrastructure:

   - Pass '-Zunstable-options' flag required by the future Rust 1.95.0

   - Fix 'objtool' warning for Rust 1.84.0

  'kernel' crate:

   - 'irq' module: add missing bound detected by the future Rust 1.95.0

   - 'list' module: add missing 'unsafe' blocks and placeholder safety
     comments to macros (an issue for future callers within the crate)

  'pin-init' crate:

   - Clean Clippy warning that changed behavior in the future Rust
     1.95.0"

* tag 'rust-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: list: Add unsafe blocks for container_of and safety comments
  rust: pin-init: replace clippy `expect` with `allow`
  rust: irq: add `'static` bounds to irq callbacks
  objtool/rust: add one more `noreturn` Rust function
  rust: kbuild: pass `-Zunstable-options` for Rust 1.95.0
2026-02-22 08:43:31 -08:00
Linus Torvalds
d2ba6e9c0a Merge tag 'trace-rv-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull runtime verifier fix from Steven Rostedt:

 - Fix multiple definition of __pcpu_unique_da_mon_this

   After refactoring monitors, we used static per-cpu variables with the
   same names across different per-cpu monitors. This is explicitly
   disallowed for modules on some architectures (alpha) or if
   CONFIG_DEBUG_FORCE_WEAK_PER_CPU is enabled (e.g. Fedora's debug
   kernel). Make sure all those variables have different names to avoid
   compilation issues.

* tag 'trace-rv-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rv: Fix multiple definition of __pcpu_unique_da_mon_this
2026-02-22 08:40:13 -08:00
Kees Cook
189f164e57 Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch && !(file in "tools") && !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 08:26:33 -08:00
Linus Torvalds
32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that for a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds
323bbfcf1e Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.

As with the alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Linus Torvalds
e19e1b480a add default_gfp() helper macro and use it in the new *alloc_obj() helpers
Most simple allocations use GFP_KERNEL, and with the new allocation
helpers being introduced, let's just take advantage of that to simplify
that default case.

It's a numbers game:

    git grep 'alloc_obj(' |
	sed 's/.*\(GFP_[_A-Z]*\).*/\1/' |
	sort | uniq -c | sort -n | tail

shows that about 90% of all those new allocator instances just use that
standard GFP_KERNEL.

Those helpers are already macros, and we can easily just make it be the
default case when the gfp argument is missing.
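
As a rough illustration of the "default when the argument is missing" idea,
here is a minimal user-space sketch. It is not the kernel's definition: every
name ending in _sketch is hypothetical, the flag values are placeholders, and
calloc() stands in for the real allocator. It assumes a compiler that supports
__VA_OPT__ (C23, or recent GCC and Clang as an extension).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's gfp flags (not the real values). */
#define GFP_KERNEL_SKETCH 0x1u
#define GFP_ATOMIC_SKETCH 0x2u

/* Select the first argument and ignore the rest. */
#define first_arg_sketch(a, ...) a

/*
 * Expand to the caller's flags when given, else to GFP_KERNEL_SKETCH.
 * __VA_OPT__(,) emits the comma only when an argument was passed; the
 * trailing 0 keeps the variadic part of first_arg_sketch() non-empty.
 */
#define default_gfp_sketch(...) \
	first_arg_sketch(__VA_ARGS__ __VA_OPT__(,) GFP_KERNEL_SKETCH, 0)

/* Records the flags of the most recent call so the default can be observed. */
static unsigned int last_flags_sketch;

/* Placeholder allocator: zeroed memory from calloc(), flags only recorded. */
static void *alloc_sketch(size_t size, unsigned int flags)
{
	assert(size > 0);
	last_flags_sketch = flags;
	return calloc(1, size);
}

/* Allocate one object of the given type; the flags argument is optional. */
#define alloc_obj_sketch(type, ...) \
	((type *)alloc_sketch(sizeof(type), default_gfp_sketch(__VA_ARGS__)))
```

With this sketch, alloc_obj_sketch(struct item) passes GFP_KERNEL_SKETCH, while
alloc_obj_sketch(struct item, GFP_ATOMIC_SKETCH) passes the explicit flag, so
only the uncommon cases need to spell out the argument.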

And yes, we could do that for all the legacy interfaces too, but let's
keep it to just the new ones at least for now, since those all got
converted recently anyway, so this is not any "extra" noise outside of
that limited conversion.

And, in fact, I want to do this before doing the -rc1 release, exactly
so that we don't get extra merge conflicts.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:50 -08:00
Linus Torvalds
fa5c82f4d2 slab.h: disable completely broken overflow handling in flex allocations
Commit 69050f8d6d ("treewide: Replace kmalloc with kmalloc_obj for
non-scalar types") started using the new allocation helpers, and in the
process showed that they were completely non-working.

The overflow logic in overflows_flex_counter_type() is completely the
wrong way around, and that broke __alloc_flex() completely.  By chance,
the resulting code was then such a mess that clang generated
sufficiently garbage code that objtool warned about it all.  Which made
it somewhat quicker to narrow things down.

While fixing overflows_flex_counter_type() would presumably fix this
all, I'm excising the whole broken overflow logic from __alloc_flex(),
because we don't want that kind of code in basic allocation functions
anyway.

That (no longer) broken overflows_flex_counter_type() thing needs to be
inserted into the actual __set_flex_counter() logic in the unlikely case
that we ever want this at all.  And made conditional.

Fixes: 81cee9166a ("compiler_types: Introduce __flex_counter() and family")
Fixes: 69050f8d6d ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types")
Cc: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/all/CAHk-=whEd020BYzGTzYrENjD9Z5_82xx6h8HsQvH5xDSnv0=Hw@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 15:12:09 -08:00
Linus Torvalds
8934827db5 Merge tag 'kmalloc_obj-treewide-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kmalloc_obj conversion from Kees Cook:
 "This does the tree-wide conversion to kmalloc_obj() and friends using
  coccinelle, with a subsequent small manual cleanup of whitespace
  alignment that coccinelle does not handle.

  This uncovered a clang bug in __builtin_counted_by_ref(), so the
  conversion is preceded by disabling that for current versions of
  clang.  The imminent clang 22.1 release has the fix.

  I've done allmodconfig build tests for x86_64, arm64, i386, and arm. I
  did defconfig builds for alpha, m68k, mips, parisc, powerpc, riscv,
  s390, sparc, sh, arc, csky, xtensa, hexagon, and openrisc"

* tag 'kmalloc_obj-treewide-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kmalloc_obj: Clean up after treewide replacements
  treewide: Replace kmalloc with kmalloc_obj for non-scalar types
  compiler_types: Disable __builtin_counted_by_ref for Clang
2026-02-21 11:02:58 -08:00
Linus Torvalds
c7decec2f2 Merge tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:

 - Introduce 'perf sched stats' tool with record/report/diff workflows
   using schedstat counters

 - Add a faster libdw based addr2line implementation and allow selecting
   it or its alternatives via 'perf config addr2line.style='

 - Data-type profiling fixes and improvements including the ability to
   select fields using 'perf report''s -F/--fields, e.g.:

     'perf report --fields overhead,type'

 - Add 'perf test' regression tests for Data-type profiling with C and
   Rust workloads

 - Fix srcline printing with inlines in callchains, make sure this has
   coverage in 'perf test'

 - Fix printing of leaf IP in LBR callchains

 - Fix display of metrics without sufficient permission in 'perf stat'

 - Print all machines in 'perf kvm report -vvv', not just the host

 - Switch from SHA-1 to BLAKE2s for build ID generation, remove SHA-1
   code

 - Fix 'perf report' histogram entry collapsing with '-F' option

 - Use system's cacheline size instead of a hardcoded value in 'perf
   report'

 - Allow filtering conversion by time range in 'perf data'

 - Cover conversion to CTF using 'perf data' in 'perf test'

 - Address newer glibc const-correctness (-Werror=discarded-qualifiers)
   issues

 - Fixes and improvements for ARM's CoreSight support, simplify ARM SPE
   event config in 'perf mem', update docs for 'perf c2c' including the
   ARM events it can be used with

 - Build support for generating metrics from arch specific python
   script, add extra AMD, Intel, ARM64 metrics using it

 - Add AMD Zen 6 events and metrics

 - Add JSON file with OpenHW RISC-V CVA6 hardware counters

 - Add 'perf kvm' stats live testing

 - Add more 'perf stat' tests to 'perf test'

 - Fix segfault in 'perf lock contention -b/--use-bpf'

 - Fix various 'perf test' cases for s390

 - Build system cleanups, bump minimum shellcheck version to 0.7.2

 - Support building the capstone based annotation routines as a plugin

 - Allow passing extra Clang flags via EXTRA_BPF_FLAGS

* tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (255 commits)
  perf test script: Add python script testing support
  perf test script: Add perl script testing support
  perf script: Allow the generated script to be a path
  perf test: perf data --to-ctf testing
  perf test: Test pipe mode with data conversion --to-json
  perf json: Pipe mode --to-ctf support
  perf json: Pipe mode --to-json support
  perf check: Add libbabeltrace to the listed features
  perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS
  perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing
  tools build: Fix feature test for rust compiler
  perf libunwind: Fix calls to thread__e_machine()
  perf stat: Add no-affinity flag
  perf evlist: Reduce affinity use and move into iterator, fix no affinity
  perf evlist: Missing TPEBS close in evlist__close()
  perf evlist: Special map propagation for tool events that read on 1 CPU
  perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel
  Revert "perf tool_pmu: More accurately set the cpus for tool events"
  tools build: Emit dependencies file for test-rust.bin
  tools build: Make test-rust.bin be removed by the 'clean' target
  ...
2026-02-21 10:51:08 -08:00
Linus Torvalds
3544d5ce36 Merge tag 'cocci-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux
Pull coccinelle updates from Julia Lawall:
 "This simplifies and clarifies the handling of output generated by
  Coccinelle that is sent to standard error.

  By default, this goes to /dev/null. Remind the user of that and
  encourage them to provide another file name (Benjamin Philip)"

* tag 'cocci-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  Documentation: Coccinelle: document debug log handling
  scripts: coccicheck: warn on unset debug file
  scripts: coccicheck: simplify debug file handling
2026-02-21 10:25:42 -08:00