This patch continues the effort to refactor workqueue APIs, which began
with the changes that introduced new workqueues and a new alloc_workqueue flag:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
The point of the refactoring is to eventually make workqueues unbound
by default, so that their workload placement can be optimized by the
scheduler.
Before that can happen, workqueue users must be converted to the
better-named new workqueues, with no intended behavior change:
system_wq -> system_percpu_wq
system_unbound_wq -> system_dfl_wq
This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.
This specific driver already allocates an unbound workqueue named "rxe_wq",
so replace system_unbound_wq with that queue rather than with system_dfl_wq.
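A minimal sketch of the kind of change at a queue_work() call site (the
work item shown is illustrative, not necessarily the actual call site):

  /* before: work queued on the global unbound workqueue */
  queue_work(system_unbound_wq, &qp->cleanup_work);

  /* after: reuse the driver's own unbound workqueue instead of system_dfl_wq */
  queue_work(rxe_wq, &qp->cleanup_work);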
Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20260318152748.837388-1-marco.crivellari@suse.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
For GEN3 and higher, FW requires scratch buffers for bookkeeping
during cleanup, specifically during QP and MR destroy ops.
Signed-off-by: Jay Bhat <jay.bhat@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Use the new API to support importing pinned dmabufs from exporters
that require revocation, such as VFIO. The revoke semantic is
achieved by issuing a HW invalidation command but not freeing
the key. This prevents further accesses to the region (they will
result in an invalid key AE), but also keeps the key reserved
until the region is actually deregistered (i.e., ibv_dereg_mr)
so that a new MR registration cannot acquire the same key.
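A rough sketch of such a revoke path (every function and field name
below is illustrative; only the behavior follows this patch):

  static void drv_mr_revoke(struct ib_umem_dmabuf *umem_dmabuf, void *priv)
  {
          struct drv_mr *mr = priv;

          /* Invalidate the HW mapping so further accesses to the region
           * fail with an invalid key asynchronous error. */
          drv_invalidate_mkey(mr->dev, mr->key);

          /* Deliberately keep mr->key allocated: it stays reserved until
           * dereg_mr so a new registration cannot be handed the same key. */
          mr->revoked = true;
  }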
Tested with lockdep+kasan and a memfd backed dmabuf.
The rereg_mr path is explicitly blocked in libibverbs for dmabuf MRs
(more specifically, any MR not of type IBV_MR_TYPE_MR), so the rereg_mr
path for dmabufs was tested with a modified libibverbs.
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Link: https://patch.msgid.link/20260305170826.3803155-6-jmoroni@google.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Added helpers to acquire and release the umem dmabuf revoke
lock. The intent is to avoid the need for drivers to peek
into the ib_umem_dmabuf internals to get the dma_resv_lock
and bring us one step closer to abstracting ib_umem_dmabuf
away from drivers in general.
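A minimal sketch of what such helpers could look like, assuming they
simply wrap the attachment's reservation lock (helper names and bodies
here are an assumption, not the final API):

  static inline void ib_umem_dmabuf_revoke_lock(struct ib_umem_dmabuf *umem_dmabuf)
  {
          dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
  }

  static inline void ib_umem_dmabuf_revoke_unlock(struct ib_umem_dmabuf *umem_dmabuf)
  {
          dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
  }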
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Link: https://patch.msgid.link/20260305170826.3803155-5-jmoroni@google.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Added an interface for importing a pinned but revocable dmabuf.
This interface can be used by drivers that are capable of revocation
so that they can import dmabufs from exporters that may require it,
such as VFIO.
This interface implements a two-step process, where drivers will first
call ib_umem_dmabuf_get_pinned_revocable_and_lock() which will pin and
map the dmabuf (and provide a functional move_notify/invalidate_mappings
callback), but will return with the lock still held so that the
driver can then populate the callback via
ib_umem_dmabuf_set_revoke_locked() without races from concurrent
revocations. This scheme also allows for easier integration with drivers
that may not have actually allocated their internal MR objects at the time
of the get_pinned_revocable* call.
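A sketch of how a driver might use the two-step flow (only the two
function names above come from this patch; argument lists, the MR
helper and the unlock call are illustrative):

  umem_dmabuf = ib_umem_dmabuf_get_pinned_revocable_and_lock(ibdev, offset, length,
                                                             fd, access_flags);
  if (IS_ERR(umem_dmabuf))
          return ERR_CAST(umem_dmabuf);

  mr = drv_alloc_mr(pd, access_flags);    /* driver MR object, created while locked */

  /* Register the revoke callback without racing a concurrent revocation. */
  ib_umem_dmabuf_set_revoke_locked(umem_dmabuf, drv_mr_revoke, mr);
  ib_umem_dmabuf_unlock(umem_dmabuf);     /* illustrative unlock helper */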
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Link: https://patch.msgid.link/20260305170826.3803155-4-jmoroni@google.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
The refactoring is eventually going to make alloc_workqueue() unbound
by default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU. For more details see the Link tag below.
In order to keep alloc_workqueue() behavior identical, explicitly request
WQ_PERCPU.
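For example (workqueue name and the other flags below are illustrative):

  /* before: implicitly per-CPU */
  wq = alloc_workqueue("foo_wq", WQ_MEM_RECLAIM, 0);

  /* after: same behavior, now requested explicitly */
  wq = alloc_workqueue("foo_wq", WQ_MEM_RECLAIM | WQ_PERCPU, 0);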
Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20260305154117.326472-1-marco.crivellari@suse.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
This patch adds support for application-allocated memory for CQs.
The application allocates and manages the CQs directly. To support
this, the driver exports a new comp_mask to indicate direct control
of the CQ. When this comp_mask bit is set in the ureq, the driver
maps this application-allocated CQ memory into hardware. Since the
application manages this memory, the CQ depth ('cqe') passed by it
must be used as-is and the driver shouldn't update it.
For CQs, ib_core supports pinning dmabuf-based application memory,
specified through provider attributes. This umem is managed by
ib_core and is available in ib_cq. Register the 'create_cq_user'
devop to process this umem. The driver also supports the legacy
interface that allocates the umem internally.
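A rough sketch of the intended create-CQ flow (the comp_mask bit and
the field names below are assumptions for illustration, not the actual
uapi names):

  if (ureq.comp_mask & BNXT_RE_CQ_REQ_MASK_APP_CQ_MEM) {
          /* Application manages the CQ memory: use the umem that ib_core
           * pinned (available in the ib_cq) and keep 'cqe' as requested. */
          umem = ibcq->umem;
  } else {
          /* Legacy path: the driver pins the application VA itself. */
          umem = ib_umem_get(&rdev->ibdev, ureq.cq_va, bytes,
                             IB_ACCESS_LOCAL_WRITE);
  }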
Link: https://patch.msgid.link/r/20260302110036.36387-7-sriharsha.basavapatna@broadcom.com
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Some applications may need multiple doorbells to support parallel
processing of threads that each operate on a group of resources.
The following uapi methods have been implemented in this patch (a
sketch of their declarations follows the list).
- BNXT_RE_METHOD_DBR_ALLOC:
This allows the application to create extra doorbell regions, use the
associated doorbell page index in CREATE_QP, and use the associated
DB address while ringing the doorbell.
- BNXT_RE_METHOD_DBR_FREE:
Free the allocated doorbell region.
- BNXT_RE_METHOD_GET_DEFAULT_DBR:
Return the default doorbell page index and doorbell page address
associated with the ucontext.
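A hedged sketch of what the method declarations could look like (only
the method names come from this patch; the object and attribute IDs
and the attribute layouts are illustrative):

  DECLARE_UVERBS_NAMED_METHOD(
          BNXT_RE_METHOD_DBR_ALLOC,
          UVERBS_ATTR_IDR(BNXT_RE_DBR_ALLOC_HANDLE, BNXT_RE_OBJECT_DBR,
                          UVERBS_ACCESS_NEW, UA_MANDATORY),
          UVERBS_ATTR_PTR_OUT(BNXT_RE_DBR_ALLOC_PAGE_IDX,
                              UVERBS_ATTR_TYPE(u32), UA_MANDATORY),
          UVERBS_ATTR_PTR_OUT(BNXT_RE_DBR_ALLOC_OFFSET,
                              UVERBS_ATTR_TYPE(u64), UA_MANDATORY));

  DECLARE_UVERBS_NAMED_METHOD_DESTROY(
          BNXT_RE_METHOD_DBR_FREE,
          UVERBS_ATTR_IDR(BNXT_RE_DBR_FREE_HANDLE, BNXT_RE_OBJECT_DBR,
                          UVERBS_ACCESS_DESTROY, UA_MANDATORY));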
Link: https://patch.msgid.link/r/20260302110036.36387-4-sriharsha.basavapatna@broadcom.com
Co-developed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Inside bnxt_qplib_create_qp(), the driver currently does a lot of
things, such as allocating HWQ memory for SQ/RQ/ORRQ/IRRQ and
initializing a few of the qplib_qp fields.
Refactored the code such that all HWQ memory allocation has been
moved to the bnxt_re_init_qp_attr() function, while
bnxt_qplib_create_qp() now just initializes the request structure and
issues the HWRM command to firmware.
Introduced a couple of new functions, bnxt_re_setup_qp_hwqs() and
bnxt_re_setup_qp_swqs(), and moved the HWQ and SWQ memory allocation
logic there.
Link: https://patch.msgid.link/r/20260302110036.36387-3-sriharsha.basavapatna@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This flag can be set by drivers once they have finished auditing and
implementing the full udata support on every udata operation.
My intention going forward is that driver authors proposing new udata uAPI
for their drivers must first do the work and set this flag.
If this flag is not set, userspace should not try to use udata-based
uAPI newer than this commit, though on a case-by-case basis it may be
OK depending on what checks historical kernels performed on the
specific call.
Since bnxt_re is audited now, it is the first driver to set the flag.
Link: https://patch.msgid.link/r/13-v3-bd56dd443069+49-bnxt_re_uapi_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add a new function to consolidate the required compatibility pattern
for driver data: checking against a minimum size, and checking that
any unknown trailing bytes are zero.
This new function uses the faster copy_struct_from_user() instead of
directly checking for zero.
Incorporate the common ibdev_dbg() logging directly into the error paths
of the helper.
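A minimal sketch of such a helper, assuming it takes the ib_udata
directly (the name and exact signature here are illustrative):

  static int drv_copy_from_udata(void *dst, size_t dst_size,
                                 struct ib_udata *udata, size_t min_size,
                                 struct ib_device *ibdev)
  {
          int ret;

          if (udata->inlen < min_size) {
                  ibdev_dbg(ibdev, "udata too small (%zu < %zu)\n",
                            udata->inlen, min_size);
                  return -EINVAL;
          }

          /* copy_struct_from_user() copies min(dst_size, inlen) bytes,
           * zero-fills the rest of dst, and fails with -E2BIG if any
           * unknown trailing bytes are non-zero. */
          ret = copy_struct_from_user(dst, dst_size, udata->inbuf, udata->inlen);
          if (ret)
                  ibdev_dbg(ibdev, "invalid udata (%d)\n", ret);
          return ret;
  }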
Link: https://patch.msgid.link/r/3-v3-bd56dd443069+49-bnxt_re_uapi_jgg@nvidia.com
Tested-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Extend the VAR allocation UAPI to accept an optional flags attribute,
allowing userspace to request TLP VAR allocation via the
MLX5_IB_UAPI_VAR_ALLOC_FLAG_TLP flag.
When the TLP flag is specified, the driver selects the TLP VAR region
for allocation instead of the regular VirtIO VAR region.
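A minimal sketch of the region selection, assuming the VAR table
carries both regions (field and helper names are illustrative):

  struct mlx5_var_region *region;

  if (alloc_flags & MLX5_IB_UAPI_VAR_ALLOC_FLAG_TLP)
          region = &var_table->tlp_region;        /* TLP response gateways */
  else
          region = &var_table->virtio_region;     /* default VirtIO emulation VARs */

  page_idx = mlx5_var_region_alloc(var_table, region);   /* illustrative helper */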
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Add support for TLP (Transaction Layer Packet) VAR regions used by
software-defined device emulation. TLP VAR provides dedicated response
gateways for sending TLP responses back to the host in TLP emulation
scenarios.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Extract mlx5_var_region struct from mlx5_var_table to enable
supporting multiple VAR regions in VAR table, which will be used in
the upcoming patches (Virtio emulation VAR and TLP emulation VAR).
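A hedged sketch of the split (the actual field layout may differ):

  struct mlx5_var_region {
          unsigned long *bitmap;          /* allocation state of this region */
          u64 hw_start_addr;
          u64 num_var_hw_entries;
  };

  struct mlx5_var_table {
          struct mutex bitmap_lock;       /* serializes allocations */
          u32 stride_size;
          struct mlx5_var_region virtio;  /* existing VirtIO emulation VARs */
          struct mlx5_var_region tlp;     /* TLP VARs, added by later patches */
  };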
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
This series adds support for Transaction Layer Packet (TLP) emulation
response gateway regions, enabling userspace device emulation software
to write TLP responses directly to lower layers without kernel driver
involvement.
Currently, the mlx5 driver exposes VirtIO emulation access regions via
the MLX5_IB_METHOD_VAR_OBJ_ALLOC ioctl. This series extends that
ioctl to also support allocating TLP response gateway channels for
PCI device emulation use cases.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Introduce the hardware structures and definitions needed for the driver
support of TLP emulation in mlx5_ifc.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Allow users to set the FRMR pools aging timer through netlink.
This functionality allows users to control how long handles reside in
the kernel before being destroyed, and thus to tune the tradeoff
between memory and HW object consumption on one hand and memory
registration optimization on the other.
Since FRMR pools are highly beneficial in application restart
scenarios, this command lets users match the aging timer to their
application restart time, making sure the FRMR handles deregistered
on application teardown are kept in the pools long enough to be
reused at application startup.
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-9-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Use the new generic FRMR pools mechanism to optimize the performance of
memory registrations.
The move to the new generic FRMR pools will allow users who configure
the MR cache through debugfs to use the netlink API for FRMR pools,
which is added later in this series. This gives more flexibility in
configuring the kernel and also makes configuration possible on
machines where debugfs is not available.
mlx5_ib will save the mkey index as the handle in the FRMR pools, the
same as the MR cache implementation did.
Upon each memory registration mlx5_ib will try to pull a handle from
the FRMR pools, and upon each deregistration it will push the handle
back to its appropriate pool.
Use the vendor key field in the UMR pool key to save the access mode
of the mkey.
Use the option for a kernel-only FRMR pool to manage the mkeys used
for registration with DMAH, as the translation between the DMAH UAPI
and the mkey property of st_index is non-trivial and changes
dynamically.
Since the value for no PH is 0xff and not zero, switch between them
in the frmr_key to have a zeroed kernel_vendor_key when not using
DMAH.
Remove the limitation the MR cache had of mkeys covering up to 2^20
DMA blocks, and support mkeys up to the HW limitations according to
the caps.
Remove all MR cache related code.
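A rough sketch of the resulting reg/dereg fast path (the generic pool
helper names below are assumptions, not the actual FRMR pools API):

  /* reg_mr: pull a handle (the mkey index) from the matching pool */
  if (rdma_frmr_pool_pop(pool, &frmr_key, &mkey_index))
          mkey_index = mlx5_create_new_mkey(dev, &frmr_key);      /* pool miss */

  /* dereg_mr: push the handle back to its pool for later reuse */
  rdma_frmr_pool_push(pool, &frmr_key, mkey_index);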
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260226-frmr_pools-v4-6-95360b54f15e@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>