selftests: ublk: cap nthreads to kernel's actual nr_hw_queues

dev->nthreads is derived from the user-requested queue count before the
ADD command, but the kernel may reduce nr_hw_queues (capped to
nr_cpu_ids). When the VM has fewer CPUs than requested queues, the
daemon creates more handler threads than there are kernel queues.

In non-batch mode, the extra threads access uninitialized queues
(q_depth=0), submit zero io_uring SQEs, and block forever in
io_cqring_wait. In batch mode, the extra threads cause similar hangs
during device removal.

In both cases, the stuck threads prevent the daemon from closing the
char device, holding the last ublk_device reference and causing
ublk_ctrl_del_dev() to hang in wait_event_interruptible().

Fix by capping dev->nthreads to the kernel-returned nr_hw_queues after
the ADD command completes. per_io_tasks mode is excluded because threads
interleave across all queues, so nthreads > nr_hw_queues is valid.

Fixes: abe54c1603 ("selftests: ublk: kublk: decouple ublk_queues from ublk server threads")
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260513101941.1373998-1-tom.leiming@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This commit is contained in:
Ming Lei
2026-05-13 18:19:40 +08:00
committed by Jens Axboe
parent 836efd35c4
commit 87d0740b7c

View File

@@ -1735,6 +1735,17 @@ static int __cmd_dev_add(const struct dev_ctx *ctx)
goto fail;
}
/*
* The kernel may reduce nr_hw_queues (e.g. capped to nr_cpu_ids).
* Cap nthreads to the actual queue count to avoid creating extra
* handler threads that will hang during device removal.
*
* per_io_tasks mode is excluded: threads interleave across all
* queues so nthreads > nr_hw_queues is valid and intentional.
*/
if (!ctx->per_io_tasks && dev->nthreads > info->nr_hw_queues)
dev->nthreads = info->nr_hw_queues;
ret = ublk_start_daemon(ctx, dev);
ublk_dbg(UBLK_DBG_DEV, "%s: daemon exit %d\n", __func__, ret);
if (ret < 0)