linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 09:02:21 -04:00

Author	SHA1	Message	Date
Randy Dunlap	4e5cba5bb6	RDMA/cm: Correct typedef and bad line warnings In include/rdma/ib_cm.h: Correct a typedef's kernel-doc notation by adding the 'typedef' keyword to it to avoid a warning. Add a leading " *" to a kernel-doc line to avoid a warning. Warning: ib_cm.h:289 function parameter 'ib_cm_handler' not described in 'int' Warning: ib_cm.h:289 expecting prototype for ib_cm_handler(). Prototype was for int() instead Warning: ib_cm.h:484 bad line: connection message in case duplicates are received. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251112062908.2711007-1-rdunlap@infradead.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-12 04:39:12 -05:00
Leon Romanovsky	736c561950	Expose definition for 1600Gbps link mode Single patch to expose new link mode for 1600Gbps, utilizing 8 lanes at 200Gbps per lane. Signed-off-by: Leon Romanovsky <leon@kernel.org> * mlx5-next: net/mlx5: Expose definition for 1600Gbps link mode	2025-11-12 03:52:18 -05:00
Tariq Toukan	5422318e27	net/mlx5: Expose definition for 1600Gbps link mode This patch exposes new link mode for 1600Gbps, utilizing 8 lanes at 200Gbps per lane. Co-developed-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1762863888-1092798-1-git-send-email-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-12 03:35:14 -05:00
Ma Ke	a338d6e849	RDMA/rtrs: server: Fix error handling in get_or_create_srv After device_initialize() is called, use put_device() to release the device according to kernel device management rules. While direct kfree() works in this case, using put_device() is more correct. Found by code review. Fixes: `9cb8374804` ("RDMA/rtrs: server: main functionality") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Link: https://patch.msgid.link/20251110005158.13394-1-make24@iscas.ac.cn Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-10 03:09:41 -05:00
Marco Crivellari	5c467151f6	IB/isert: add WQ_PERCPU to alloc_workqueue users Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit `128ea9f6cc` ("workqueue: Add system_percpu_wq and system_dfl_wq") commit `930c2ea566` ("workqueue: Add new WQ_PERCPU flag") This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251107133626.190952-1-marco.crivellari@suse.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 06:30:58 -05:00
Marco Crivellari	65d21dee53	IB/iser: add WQ_PERCPU to alloc_workqueue users Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit `128ea9f6cc` ("workqueue: Add system_percpu_wq and system_dfl_wq") commit `930c2ea566` ("workqueue: Add new WQ_PERCPU flag") This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251107133306.187939-1-marco.crivellari@suse.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 06:30:48 -05:00
Jacob Moroni	5dd68a5914	RDMA/irdma: Remove unused CQ registry The CQ registry was never actually used (ceq->reg_cq was always NULL), so remove the dead code. Signed-off-by: Jacob Moroni <jmoroni@google.com> Link: https://patch.msgid.link/20251105162841.31786-1-jmoroni@google.com Acked-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2025-11-09 06:13:57 -05:00
Patrisious Haddad	6e79e21005	RDMA/mlx5: Add other eswitch support to userspace tables Allows the creation of RDMA TRANSPORT tables over VFs/SFs that belong to another eswitch manager. Which is only possible for PFs that were connected via a create_lag PRM command. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-7-98bb707b5d57@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:21:46 -05:00
Patrisious Haddad	f277662b73	RDMA/mlx5: Refactor _get_prio() function Refactor the _get_prio() function to remove redundant arguments by reusing the existing flow table attributes struct instead of passing attributes separately. This improves code clarity and maintainability. In addition allows downstream patch to add new parameter without needing to change __get_prio() arguments. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-6-98bb707b5d57@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:21:42 -05:00
Patrisious Haddad	5939decc64	RDMA/mlx5: Add other_eswitch support for devx destruction When building a devx object destruction command for steering objects add consideration for other_eswitch argument to allow proper destruction for objects that were created with it. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-5-98bb707b5d57@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:21:38 -05:00
Patrisious Haddad	3506242da0	RDMA/mlx5: Change default device for LAG slaves in RDMA TRANSPORT namespaces In case of a LAG configuration change the root namespace core device for all of the LAG slaves to be the core device of the master device for RDMA_TRANSPORT namespaces, in order to ensure all tables are created through the master device. Once the LAG is disabled revert back to the native core device. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-4-98bb707b5d57@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:21:33 -05:00
Leon Romanovsky	d06ccdc952	Add other eswitch support When the device is in switchdev mode, the RDMA device manages all the vports which belong to its representors, which can lead to a situation where the PF that is used to manage the RDMA device isn't the native PF of some of the vports it manages. Add infrastructure to allow the master PF to manage all the hardware resources for the vports under its management. Whereas currently the only such resource is RDMA TRANSPORT steering domains. That is done by adding new FW argument other_eswitch which is passed by the driver to the FW to allow the master PF to properly manage vports belonging to other native PF. Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:17:37 -05:00
Patrisious Haddad	583b4fe1c1	net/mlx5: fs, set non default device per namespace Add mlx5_fs_set_root_dev() function which swaps the root namespace core device with another one for a given table_type. It is intended for usage only by RDMA_TRANSPORT tables in case of LAG configuration, to allow the creation of tables during LAG always through the LAG master device, which is valid since during LAG the master is allowed to manage the RDMA_TRANSPORT tables of its slaves. In addition move the table_type enum to global include to allow its use in a downstream patch in the RDMA driver. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-3-98bb707b5d57@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:16:58 -05:00
Patrisious Haddad	3b848dec7e	net/mlx5: fs, Add other_eswitch support for steering tables Add other_eswitch support which allows flow tables creation above vports that reside on different esw managers. The new flag MLX5_FLOW_TABLE_OTHER_ESWITCH indicates if the esw_owner_vhca_id attribute is supported. Note that this is only supported if the Advanced-RDMA cap- rdma_transport_manager_other_eswitch is set. And it is the caller responsibility to check that. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-2-98bb707b5d57@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:16:53 -05:00
Patrisious Haddad	6948417b3f	net/mlx5: Add OTHER_ESWITCH HW capabilities Add OTHER_ESWITCH capabilities which includes other_eswitch and eswitch_owner_vhca_id to all steering objects. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251029-support-other-eswitch-v1-1-98bb707b5d57@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:16:47 -05:00
Yishai Hadas	2d838c11e1	net/mlx5: Add direct ST mode support for RDMA Add support for direct ST mode where ST Table Location equals PCI_TPH_LOC_NONE. In that case, no steering table exists, the steering tag itself will be used directly by the SW, FW, HW from the mkey. This enables RDMA users to use the current exposed APIs to work in direct mode. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251027-st-direct-mode-v1-2-e0ad953866b6@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:13:34 -05:00
Yishai Hadas	7b8a8ec20c	PCI/TPH: Expose pcie_tph_get_st_table_loc() Expose pcie_tph_get_st_table_loc() to be used by drivers as will be done in the next patch from the series. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20251027-st-direct-mode-v1-1-e0ad953866b6@nvidia.com Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 05:13:02 -05:00
Kalesh AP	cf27490790	RDMA/bnxt_re: Add a debugfs entry for CQE coalescing tuning This patch adds debugfs interfaces that allows the user to enable/disable the RoCE CQ coalescing and fine tune certain CQ coalescing parameters which would be helpful during debug. Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20251103043425.234846-1-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-09 04:02:27 -05:00
Randy Dunlap	512c832657	IB/rdmavt: rdmavt_qp.h: clean up kernel-doc comments Correct the kernel-doc comments format to avoid around 35 kernel-doc warnings: - use struct keyword to introduce struct kernel-doc comments - use correct variable name for some struct members - use correct function name in comments for some functions - fix spelling in a few comments - use a ':' instead of '-' to separate struct members from their descriptions - add a function name heading for rvt_div_mtu() This leaves one struct member that is not described: rdmavt_qp.h:206: warning: Function parameter or struct member 'wq' not described in 'rvt_krwq' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251105045127.106822-1-rdunlap@infradead.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-06 02:23:23 -05:00
Marco Crivellari	7196156b0c	IB/rdmavt: WQ_PERCPU added to alloc_workqueue users Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. CC: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251101163121.78400-6-marco.crivellari@suse.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-06 02:23:23 -05:00
Marco Crivellari	5267feda50	RDMA/mlx4: WQ_PERCPU added to alloc_workqueue users Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. CC: Yishai Hadas <yishaih@nvidia.com> Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251101163121.78400-5-marco.crivellari@suse.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-06 02:23:23 -05:00
Marco Crivellari	5f93287fa9	hfi1: WQ_PERCPU added to alloc_workqueue users Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. CC: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251101163121.78400-4-marco.crivellari@suse.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-06 02:23:23 -05:00
Marco Crivellari	e60c5583b6	RDMA/core: WQ_PERCPU added to alloc_workqueue users Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251101163121.78400-3-marco.crivellari@suse.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-06 02:23:23 -05:00
Marco Crivellari	f673fb3449	RDMA/core: RDMA/mlx5: replace use of system_unbound_wq with system_dfl_wq Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. system_unbound_wq should be the default workqueue so as not to enforce locality constraints for random work whenever it's not required. Adding system_dfl_wq to encourage its use when unbound work should be used. The old system_unbound_wq will be kept for a few release cycles. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20251101163121.78400-2-marco.crivellari@suse.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-06 02:23:23 -05:00
Jay Bhat	da58d4223b	RDMA/irdma: Take a lock before moving SRQ tail in poll_cq Need to take an SRQ lock in poll_cq before moving SRQ tail. Signed-off-by: Jay Bhat <jay.bhat@intel.com> Reviewed-by: Jacob Moroni <jmoroni@google.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Link: https://patch.msgid.link/20251031021726.1003-7-tatyana.e.nikolova@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-06 02:23:23 -05:00
Jay Bhat	0a19274555	RDMA/irdma: CQ size and shadow update changes for GEN3 CQ shadow area should not be updated at the end of a page (once every 64th CQ entry), except when CQ has no more CQEs. SW must also increase the requested CQ size by 1 and make sure the CQ is not exactly one page in size. This is to address a quirk in the hardware. Signed-off-by: Jay Bhat <jay.bhat@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Link: https://patch.msgid.link/20251031021726.1003-4-tatyana.e.nikolova@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-02 06:52:58 -05:00
Jay Bhat	cd84d8001e	RDMA/irdma: Silently consume unsignaled completions In case we get an unsignaled error completion, we silently consume the CQE by pretending the QP does not exist. Without this, bookkeeping for signaled completions does not work correctly. Signed-off-by: Jay Bhat <jay.bhat@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Link: https://patch.msgid.link/20251031021726.1003-5-tatyana.e.nikolova@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-02 06:52:51 -05:00
Jay Bhat	153243086e	RDMA/irdma: Initialize cqp_cmds_info to prevent resource leaks Failure to initialize info.create field to false in certain cases was resulting in incorrect status code going to rdma-core when dereg_mr failed during reset. To fix this, memset entire cqp_request->info in irdma_alloc_and_get_cqp_request() function, so that this is not spread all over the code. Signed-off-by: Bhat, Jay <jay.bhat@intel.com> Reviewed-by: Jacob Moroni <jmoroni@google.com> Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Link: https://patch.msgid.link/20251031021726.1003-2-tatyana.e.nikolova@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-02 06:52:44 -05:00
Jacob Moroni	69e8e429bc	RDMA/irdma: Enforce local fence for LOCAL_INV WRs Enforce local fence for LOCAL_INV WRs to avoid spurious FASTREG_VALID_MKEY async events during heavy invalidation/registration activity. Signed-off-by: Jacob Moroni <jmoroni@google.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Link: https://patch.msgid.link/20251031021726.1003-3-tatyana.e.nikolova@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-11-02 06:52:33 -05:00
Zhu Yanjun	503a5e4690	RDMA/rxe: Fix null deref on srq->rq.queue after resize failure A NULL pointer dereference can occur in rxe_srq_chk_attr() when ibv_modify_srq() is invoked twice in succession under certain error conditions. The first call may fail in rxe_queue_resize(), which leads rxe_srq_from_attr() to set srq->rq.queue = NULL. The second call then triggers a crash (null deref) when accessing srq->rq.queue->buf->index_mask. Call Trace: <TASK> rxe_modify_srq+0x170/0x480 [rdma_rxe] ? __pfx_rxe_modify_srq+0x10/0x10 [rdma_rxe] ? uverbs_try_lock_object+0x4f/0xa0 [ib_uverbs] ? rdma_lookup_get_uobject+0x1f0/0x380 [ib_uverbs] ib_uverbs_modify_srq+0x204/0x290 [ib_uverbs] ? __pfx_ib_uverbs_modify_srq+0x10/0x10 [ib_uverbs] ? tryinc_node_nr_active+0xe6/0x150 ? uverbs_fill_udata+0xed/0x4f0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2c0/0x470 [ib_uverbs] ? __pfx_ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x10/0x10 [ib_uverbs] ? uverbs_fill_udata+0xed/0x4f0 [ib_uverbs] ib_uverbs_run_method+0x55a/0x6e0 [ib_uverbs] ? __pfx_ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x10/0x10 [ib_uverbs] ib_uverbs_cmd_verbs+0x54d/0x800 [ib_uverbs] ? __pfx_ib_uverbs_cmd_verbs+0x10/0x10 [ib_uverbs] ? __pfx___raw_spin_lock_irqsave+0x10/0x10 ? __pfx_do_vfs_ioctl+0x10/0x10 ? ioctl_has_perm.constprop.0.isra.0+0x2c7/0x4c0 ? __pfx_ioctl_has_perm.constprop.0.isra.0+0x10/0x10 ib_uverbs_ioctl+0x13e/0x220 [ib_uverbs] ? __pfx_ib_uverbs_ioctl+0x10/0x10 [ib_uverbs] __x64_sys_ioctl+0x138/0x1c0 do_syscall_64+0x82/0x250 ? fdget_pos+0x58/0x4c0 ? ksys_write+0xf3/0x1c0 ? __pfx_ksys_write+0x10/0x10 ? do_syscall_64+0xc8/0x250 ? __pfx_vm_mmap_pgoff+0x10/0x10 ? fget+0x173/0x230 ? fput+0x2a/0x80 ? ksys_mmap_pgoff+0x224/0x4c0 ? do_syscall_64+0xc8/0x250 ? do_user_addr_fault+0x37b/0xfe0 ? clear_bhb_loop+0x50/0xa0 ? clear_bhb_loop+0x50/0xa0 ? clear_bhb_loop+0x50/0xa0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: `8700e3e7c4` ("Soft RoCE driver") Tested-by: Liu Yi <asatsuyu.liu@gmail.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20251027215203.1321-1-yanjun.zhu@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-10-28 04:04:07 -04:00
Håkon Bugge	58aca1f3de	RDMA/cm: Base cm_id destruction timeout on CMA values When a GSI MAD packet is sent on the QP, it will potentially be retried CMA_MAX_CM_RETRIES times with a timeout value of: 4.096usec * 2 ^ CMA_CM_RESPONSE_TIMEOUT The above equates to ~64 seconds using the default CMA values. The cm_id_priv's refcount will be incremented for this period. Therefore, the timeout value waiting for a cm_id destruction must be based on the effective timeout of MAD packets. To provide additional leeway, we add 25% to this timeout and use that instead of the constant 10 seconds timeout, which may result in false negatives. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Link: https://patch.msgid.link/20251021132738.4179604-1-haakon.bugge@oracle.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-10-27 07:36:39 -04:00
Adithya Jayachandran	eea31f21dc	{rdma,net}/mlx5: Query vports mac address from device Before this patch during either switchdev or legacy mode enablement we cleared the mac address of vports between changes. This change allows us to preserve the vports mac address between eswitch mode changes. Vports hold information for VFs/SFs such as the permanent mac address. VF/SF mac can be set either by iproute vf interface or devlink function interface. For no obvious reason we reset it to 0 on switchdev/legacy mode changes, this patch is fixing that, to align with other vport information that are never reset, e.g. GUID, mtu, promisc mode, etc. Signed-off-by: Adithya Jayachandran <ajayachandra@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Acked-by: Leon Romanovsky <leon@kernel.org> # RDMA	2025-10-24 20:16:01 -07:00
Randy Dunlap	be180c847a	RDMA/uverbs: fix some kernel-doc warnings Fix 49 kernel-doc warnings in ib_verbs.h: - Add struct short description for rdma_stat_desc, rdma_hw_stats. - Fix kernel-doc format for struct members (use ':' instead of '-') for several structs. - Don't use "/**" kernel-doc notation for struct members in ib_device_ops (most members are not documented and most of the kernel-doc was not formatted correctly). - Spell function parameters correctly in ib_dma_map_sgtable_attrs(), ib_device_try_get(), rdma_roce_rescan_device(). - Add kernel-doc for the function parameter in rdma_flow_label_to_udp_sport(). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251020034320.3011094-1-rdunlap@infradead.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-10-20 11:45:46 -04:00
Colin Ian King	1511efaca0	RDMA/rxe: Remove redundant assignment to variable page_offset The variable page_offset is being assigned a value at the start of a loop and being redundantly zero'd at the end of the loop, there is no code that reads the zero'd value. The assignment is redundant and can be removed. Signed-off-by: Colin Ian King <coking@nvidia.com> Link: https://patch.msgid.link/20251014120343.2528608-1-coking@nvidia.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-10-19 08:06:21 -04:00
Stefan Metzmacher	879424832d	RDMA/core: let rdma_connect_locked() call lockdep_assert_held(&id_priv->handler_mutex) rdma_accept() also has this, so this is now more consistent and may prevent bugs in future. Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Link: https://patch.msgid.link/20251008165913.444276-1-metze@samba.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-10-19 07:03:25 -04:00
Alok Tiwari	cfd51fcf11	RDMA/cxgb4: fix typo in write_pbl() debug message Correct the debug log format string from "pdb_addr" -> "pbl_addr" to match the actual variable name. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://patch.msgid.link/20250929091102.50384-1-alok.a.tiwari@oracle.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2025-10-19 06:51:46 -04:00
Linus Torvalds	3a86608788	Linux 6.18-rc1 v6.18-rc1	2025-10-12 13:42:36 -07:00
Linus Torvalds	3dd7b81235	Merge tag 'i2c-for-6.18-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fix from Wolfram Sang: "One revert because of a regression in the I2C core which has sadly not showed up during its time in -next" * tag 'i2c-for-6.18-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: Revert "i2c: boardinfo: Annotate code used in init phase only"	2025-10-12 13:27:56 -07:00
Linus Torvalds	8765f46791	Merge tag 'irq_urgent_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Skip interrupt ID 0 in sifive-plic during suspend/resume because ID 0 is reserved and accessing reserved register space could result in undefined behavior - Fix a function's retval check in aspeed-scu-ic * tag 'irq_urgent_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/sifive-plic: Avoid interrupt ID 0 handling during suspend/resume irqchip/aspeed-scu-ic: Fix an IS_ERR() vs NULL check	2025-10-12 08:45:52 -07:00
Linus Torvalds	67029a49db	Merge tag 'trace-v6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "The previous fix to trace_marker required updating trace_marker_raw as well. The difference between trace_marker_raw from trace_marker is that the raw version is for applications to write binary structures directly into the ring buffer instead of writing ASCII strings. This is for applications that will read the raw data from the ring buffer and get the data structures directly. It's a bit quicker than using the ASCII version. Unfortunately, it appears that our test suite has several tests that test writes to the trace_marker file, but lacks any tests to the trace_marker_raw file (this needs to be remedied). Two issues came about the update to the trace_marker_raw file that syzbot found: - Fix tracing_mark_raw_write() to use per CPU buffer The fix to use the per CPU buffer to copy from user space was needed for both the trace_maker and trace_maker_raw file. The fix for reading from user space into per CPU buffers properly fixed the trace_marker write function, but the trace_marker_raw file wasn't fixed properly. The user space data was correctly written into the per CPU buffer, but the code that wrote into the ring buffer still used the user space pointer and not the per CPU buffer that had the user space data already written. - Stop the fortify string warning from writing into trace_marker_raw After converting the copy_from_user_nofault() into a memcpy(), another issue appeared. As writes to the trace_marker_raw expects binary data, the first entry is a 4 byte identifier. 
The entry structure is defined as: struct { struct trace_entry ent; int id; char buf[]; }; The size of this structure is reserved on the ring buffer with: size = sizeof(entry) + cnt; Then it is copied from the buffer into the ring buffer with: memcpy(&entry->id, buf, cnt); This use to be a copy_from_user_nofault(), but now converting it to a memcpy() triggers the fortify-string code, and causes a warning. The allocated space is actually more than what is copied, as the cnt used also includes the entry->id portion. Allocating sizeof(entry) plus cnt is actually allocating 4 bytes more than what is needed. Change the size function to: size = struct_size(entry, buf, cnt - sizeof(entry->id)); And update the memcpy() to unsafe_memcpy()" * tag 'trace-v6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Stop fortify-string from warning in tracing_mark_raw_write() tracing: Fix tracing_mark_raw_write() to use buf and not ubuf	2025-10-11 16:06:04 -07:00
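The size arithmetic described in the trace-v6.18-3 message above can be checked with a small userspace sketch. This is not the kernel code: `struct trace_entry_sketch`, `struct raw_entry_sketch`, `old_size()` and `new_size()` are hypothetical names, the layout of the stand-in `trace_entry` is an assumption, and the kernel's `struct_size()` helper (which also guards against overflow) is approximated here by plain `sizeof` arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct trace_entry (layout assumed). */
struct trace_entry_sketch {
	unsigned short type;
	unsigned char flags;
	unsigned char preempt_count;
	int pid;
};

/* Mirrors the entry structure quoted in the commit message. */
struct raw_entry_sketch {
	struct trace_entry_sketch ent;
	int id;
	char buf[];	/* flexible array member holding the payload */
};

/* Old reservation: whole struct plus cnt, although cnt already covers id. */
static size_t old_size(size_t cnt)
{
	return sizeof(struct raw_entry_sketch) + cnt;
}

/*
 * New reservation, approximating
 * struct_size(entry, buf, cnt - sizeof(entry->id)) without overflow checks.
 */
static size_t new_size(size_t cnt)
{
	/* cnt must at least contain the 4-byte identifier. */
	assert(cnt >= sizeof(int));
	return sizeof(struct raw_entry_sketch) + (cnt - sizeof(int)) * sizeof(char);
}
```

Under these assumptions, `old_size(cnt) - new_size(cnt)` equals `sizeof(int)` for any valid `cnt`, which matches the "4 bytes more than what is needed" noted in the message, and a copy of `cnt` bytes starting at the `id` member still fits inside `new_size(cnt)`.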
Linus Torvalds	c04022dccb	Merge tag 'kbuild-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild fixes from Nathan Chancellor: - Fix UAPI types check in headers_check.pl - Only enable -Werror for hostprogs with CONFIG_WERROR / W=e - Ignore fsync() error when output of gen_init_cpio is a pipe - Several little build fixes for recent modules.builtin.modinfo series * tag 'kbuild-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: kbuild: Use '--strip-unneeded-symbol' for removing module device table symbols s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections kbuild: Add '.rel.*' strip pattern for vmlinux kbuild: Restore pattern to avoid stripping .rela.dyn from vmlinux gen_init_cpio: Ignore fsync() returning EINVAL on pipes scripts/Makefile.extrawarn: Respect CONFIG_WERROR / W=e for hostprogs kbuild: uapi: Strip comments before size type check	2025-10-11 15:47:12 -07:00
Wolfram Sang	a8482d2c90	Revert "i2c: boardinfo: Annotate code used in init phase only" This reverts commit `1a2b423be6` because we got a regression report and need time to find out the details. Reported-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Closes: https://lore.kernel.org/r/29ec0082-4dd4-4120-acd2-44b35b4b9487@oss.qualcomm.com Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>	2025-10-11 23:57:33 +02:00
Linus Torvalds	98906f9d85	Merge tag 'rtc-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "This cycle, we have a new RTC driver, for the SpacemiT P1. The optee driver gets alarm support. We also get a fix for a race condition that was fairly rare unless while stress testing the alarms. Subsystem: - Fix race when setting alarm - Ensure alarm irq is enabled when UIE is enabled - remove unneeded 'fast_io' parameter in regmap_config New driver: - SpacemiT P1 RTC Drivers: - efi: Remove wakeup functionality - optee: add alarms support - s3c: Drop support for S3C2410 - zynqmp: Restore alarm functionality after kexec transition" * tag 'rtc-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (29 commits) rtc: interface: Ensure alarm irq is enabled when UIE is enabled rtc: tps6586x: Fix initial enable_irq/disable_irq balance rtc: cpcap: Fix initial enable_irq/disable_irq balance rtc: isl12022: Fix initial enable_irq/disable_irq balance rtc: interface: Fix long-standing race when setting alarm rtc: pcf2127: fix watchdog interrupt mask on pcf2131 rtc: zynqmp: Restore alarm functionality after kexec transition rtc: amlogic-a4: Optimize global variables rtc: sd2405al: Add I2C address. rtc: Kconfig: move symbols to proper section rtc: optee: make optee_rtc_pm_ops static rtc: optee: Fix error code in optee_rtc_read_alarm() rtc: optee: fix error code in probe() dt-bindings: rtc: Convert apm,xgene-rtc to DT schema rtc: spacemit: support the SpacemiT P1 RTC rtc: optee: add alarm related rtc ops to optee rtc driver rtc: optee: remove unnecessary memory operations rtc: optee: fix memory leak on driver removal rtc: x1205: Fix Xicor X1205 vendor prefix dt-bindings: rtc: Fix Xicor X1205 vendor prefix ...	2025-10-11 11:56:47 -07:00
Linus Torvalds	2a6edd867b	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Fixes only in drivers (ufs, mvsas, qla2xxx, target) that came in just before or during the merge window. The most important one is the qla2xxx which reverts a conversion to fix flexible array member warnings, that went up in this merge window but which turned out on further testing to be causing data corruption" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Include UTP error in INT_FATAL_ERRORS scsi: ufs: sysfs: Make HID attributes visible scsi: mvsas: Fix use-after-free bugs in mvs_work_queue scsi: ufs: core: Fix PM QoS mutex initialization scsi: ufs: core: Fix runtime suspend error deadlock Revert "scsi: qla2xxx: Fix memcpy() field-spanning write issue" scsi: target: target_core_configfs: Add length check to avoid buffer overflow	2025-10-11 11:49:00 -07:00
Linus Torvalds	9591fdb061	Merge tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more x86 updates from Borislav Petkov: - Remove a bunch of asm implementing condition flags testing in KVM's emulator in favor of int3_emulate_jcc() which is written in C - Replace KVM fastops with C-based stubs which avoids problems with the fastop infra related to latter not adhering to the C ABI due to their special calling convention and, more importantly, bypassing compiler control-flow integrity checking because they're written in asm - Remove wrongly used static branches and other ugliness accumulated over time in hyperv's hypercall implementation with a proper static function call to the correct hypervisor call variant - Add some fixes and modifications to allow running FRED-enabled kernels in KVM even on non-FRED hardware - Add kCFI improvements like validating indirect calls and prepare for enabling kCFI with GCC. Add cmdline params documentation and other code cleanups - Use the single-byte 0xd6 insn as the official #UD single-byte undefined opcode instruction as agreed upon by both x86 vendors - Other smaller cleanups and touchups all over the place * tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86,retpoline: Optimize patch_retpoline() x86,ibt: Use UDB instead of 0xEA x86/cfi: Remove __noinitretpoline and __noretpoline x86/cfi: Add "debug" option to "cfi=" bootparam x86/cfi: Standardize on common "CFI:" prefix for CFI reports x86/cfi: Document the "cfi=" bootparam options x86/traps: Clarify KCFI instruction layout compiler_types.h: Move __nocfi out of compiler-specific header objtool: Validate kCFI calls x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware x86/fred: Install system vector handlers even if FRED isn't fully enabled x86/hyperv: Use direct call to hypercall-page 
x86/hyperv: Clean up hv_do_hypercall() KVM: x86: Remove fastops KVM: x86: Convert em_salc() to C KVM: x86: Introduce EM_ASM_3WCL KVM: x86: Introduce EM_ASM_1SRC2 KVM: x86: Introduce EM_ASM_2CL KVM: x86: Introduce EM_ASM_2W ...	2025-10-11 11:19:16 -07:00
Linus Torvalds	2f0a750453	Merge tag 'x86_cleanups_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - Simplify inline asm flag output operands now that the minimum compiler version supports the =@ccCOND syntax - Remove a bunch of AS_* Kconfig symbols which detect assembler support for various instruction mnemonics now that the minimum assembler version supports them all - The usual cleanups all over the place * tag 'x86_cleanups_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Remove code depending on __GCC_ASM_FLAG_OUTPUTS__ x86/sgx: Use ENCLS mnemonic in <kernel/cpu/sgx/encls.h> x86/mtrr: Remove license boilerplate text with bad FSF address x86/asm: Use RDPKRU and WRPKRU mnemonics in <asm/special_insns.h> x86/idle: Use MONITORX and MWAITX mnemonics in <asm/mwait.h> x86/entry/fred: Push __KERNEL_CS directly x86/kconfig: Remove CONFIG_AS_AVX512 crypto: x86 - Remove CONFIG_AS_VPCLMULQDQ crypto: X86 - Remove CONFIG_AS_VAES crypto: x86 - Remove CONFIG_AS_GFNI x86/kconfig: Drop unused and needless config X86_64_SMP	2025-10-11 10:51:14 -07:00
Linus Torvalds	6bb71f0fe5	Merge tag 'slab-for-6.18-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fix from Vlastimil Babka: "A NULL pointer deref hotfix" * tag 'slab-for-6.18-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: slab: fix barn NULL pointer dereference on memoryless nodes	2025-10-11 10:40:24 -07:00
Linus Torvalds	fbde105f13	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Finish constification of 1st parameter of bpf_d_path() (Rong Tao) - Harden userspace-supplied xdp_desc validation (Alexander Lobakin) - Fix metadata_dst leak in __bpf_redirect_neigh_v{4,6}() (Daniel Borkmann) - Fix undefined behavior in {get,put}_unaligned_be32() (Eric Biggers) - Use correct context to unpin bpf hash map with special types (KaFai Wan) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add test for unpinning htab with internal timer struct bpf: Avoid RCU context warning when unpinning htab with internal structs xsk: Harden userspace-supplied xdp_desc validation bpf: Fix metadata_dst leak __bpf_redirect_neigh_v{4,6} libbpf: Fix undefined behavior in {get,put}_unaligned_be32() bpf: Finish constification of 1st parameter of bpf_d_path()	2025-10-11 10:31:38 -07:00
Linus Torvalds	ae13bd2310	Merge tag 'mm-nonmm-stable-2025-10-10-15-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more updates from Andrew Morton: "Just one series here - Mike Rappoport has taught KEXEC handover to preserve vmalloc allocations across handover" * tag 'mm-nonmm-stable-2025-10-10-15-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib/test_kho: use kho_preserve_vmalloc instead of storing addresses in fdt kho: add support for preserving vmalloc allocations kho: replace kho_preserve_phys() with kho_preserve_pages() kho: check if kho is finalized in __kho_preserve_order() MAINTAINERS, .mailmap: update Umang's email address	2025-10-11 10:27:52 -07:00
Linus Torvalds	971370a88c	Merge tag 'mm-hotfixes-stable-2025-10-10-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "7 hotfixes. All 7 are cc:stable and all 7 are for MM. All singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2025-10-10-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: hugetlb: avoid soft lockup when mprotect to large memory area fsnotify: pass correct offset to fsnotify_mmap_perm() mm/ksm: fix flag-dropping behavior in ksm_madvise mm/damon/vaddr: do not repeat pte_offset_map_lock() until success mm/rmap: fix soft-dirty and uffd-wp bit loss when remapping zero-filled mTHP subpage to shared zeropage mm/thp: fix MTE tag mismatch when replacing zero-filled subpages memcg: skip cgroup_file_notify if spinning is not allowed	2025-10-11 10:14:55 -07:00

1396684 Commits