From 9bee81cc5e8811c8bbe67fbf5214a7998457324b Mon Sep 17 00:00:00 2001 From: Edward Srouji Date: Mon, 27 Apr 2026 14:02:33 +0300 Subject: [PATCH] RDMA/mlx5: Fix UAF in DCT destroy due to race with create A potential race condition exists between mlx5_core_destroy_dct() and mlx5_core_create_dct() that can lead to a use-after-free. After _mlx5_core_destroy_dct() releases the DCT to firmware, the DCTN can be immediately reallocated for a new DCT being created concurrently. If the create path stores the new DCT in the xarray before the destroy path erases it, the destroy will incorrectly delete the new DCT's entry. Later accesses then hit freed memory. Fix by replacing the unconditional xa_erase_irq() with xa_cmpxchg_irq() that only erases the entry if it hasn't already been replaced (still contains XA_ZERO_ENTRY), preserving any newly created DCT. Fixes: afff24899846 ("RDMA/mlx5: Handle DCT QP logic separately from low level QP interface") Link: https://patch.msgid.link/r/20260427-security-bug-fixes-v3-2-4621fa52de0e@nvidia.com Signed-off-by: Edward Srouji Reviewed-by: Michael Guralnik Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/mlx5/qpc.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c index 146d03ae40bd..a7a4f9420271 100644 --- a/drivers/infiniband/hw/mlx5/qpc.c +++ b/drivers/infiniband/hw/mlx5/qpc.c @@ -314,7 +314,14 @@ int mlx5_core_destroy_dct(struct mlx5_ib_dev *dev, xa_cmpxchg_irq(&table->dct_xa, dct->mqp.qpn, XA_ZERO_ENTRY, dct, 0); return err; } - xa_erase_irq(&table->dct_xa, dct->mqp.qpn); + + /* + * A race can occur where a concurrent create gets the same dctn + * (after hardware released it) and overwrites XA_ZERO_ENTRY with + * its new DCT before we reach here. In that case, we must not erase + * the entry as it now belongs to the new DCT. + */ + xa_cmpxchg_irq(&table->dct_xa, dct->mqp.qpn, XA_ZERO_ENTRY, NULL, 0); return 0; }