wbt_queue_depth_changed just updates a field and calls another function.
Open code it in wbt_init, so that the local queue variable can be used
instead of the one stored in the rq_qos. This will allow delaying that
rq_qos->queue assignment in a subsequent patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently each blkcg_gq holds a request_queue reference, which is what
is used in the policies. But a lot of these interfaces will move over to
use a gendisk, so store a disk in struct blkcg_gq and hold a reference to
it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Unwind only the previous initialization steps that happened in blkg_alloc
using goto based unwinding. This avoids the need for the !queue special
case in blkg_free and thus ensures that any blkg seens outside of
blkg_alloc is always fully constructed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is no need to initialize the cgroup code before the disk is marked
live. Moving the cgroup initialization earlier will help to have a
fully initialized struct device in the gendisk for the cgroup code to
use in the future. Similarly tear the cgroup information down in
del_gendisk to be symmetric and because none of the cgroup tracking is
needed once non-passthrough I/O stops.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_throtl_stat_add is called from blk_stat_add explicitly, unlike the
other stats that go through q->stats->callbacks. To prepare for cgroup
data moving to the gendisk, ensure blk_throtl_stat_add is only called
for the plain READ and WRITE commands that it actually handles internally,
as blk_stat_add can also be called for passthrough commands on queues that
do not have a gendisk associated with them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull MD updates from Song:
"Non-urgent fixes:
md: don't update recovery_cp when curr_resync is ACTIVE
md: Free writes_pending in md_stop
Performance optimization:
md: Change active_io to percpu"
* 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: use MD_RESYNC_* whenever possible
md: Free writes_pending in md_stop
md: Change active_io to percpu
md: Factor out is_md_suspended helper
md: don't update recovery_cp when curr_resync is ACTIVE
dm raid calls md_stop to stop the raid device. It needs to
free the writes_pending here.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Now the type of active_io is atomic. It's used to count how many ios are
in the submitting process and it's added and decreased very time. But it
only needs to check if it's zero when suspending the raid. So we can
switch atomic to percpu to improve the performance.
After switching active_io to percpu type, we use the state of active_io
to judge if the raid device is suspended. And we don't need to wake up
->sb_wait in md_handle_request anymore. It's done in the callback function
which is registered when initing active_io. The argument mddev->suspended
is only used to count how many users are trying to set raid to suspend
state.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
This helper function will be used in next patch. It's easy for
understanding.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
md may skip the resync of the first 3 sectors if the resync procedure is
interrupted before the first calling of ->sync_request() as shown below:
md_do_sync thread control thread
// setup resync
mddev->recovery_cp = 0
j = 0
mddev->curr_resync = MD_RESYNC_ACTIVE
// e.g., set array as idle
set_bit(MD_RECOVERY_INTR, &&mddev_recovery)
// resync loop
// check INTR before calling sync_request
!test_bit(MD_RECOVERY_INTR, &mddev->recovery
// resync interrupted
// update recovery_cp from 0 to 3
// the resync of three 3 sectors will be skipped
mddev->recovery_cp = 3
Fixes: eac58d08d4 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
Cc: stable@vger.kernel.org # 6.0+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
Make the following minor changes which were reported by colleagues
while reviewing this code:
- Remove the parentheses from around the LOOP_DEFAULT_HW_Q_DEPTH
definition since these are superfluous.
- Accept other number formats than decimal, e.g. hexadecimal.
- Do not set hw_queue_depth to an out-of-range value, even if that value
won't be used.
- Use the LOOP_DEFAULT_HW_Q_DEPTH macro in the kernel module parameter
description to prevent that the description gets out of sync.
This patch has been tested as follows:
# modprobe -r loop
# modprobe loop hw_queue_depth=-1
modprobe: ERROR: could not insert 'loop': Invalid argument
# modprobe loop hw_queue_depth=0
modprobe: ERROR: could not insert 'loop': Invalid argument
# modprobe loop hw_queue_depth=1; cat /sys/module/loop/parameters/hw_queue_depth
1
# modprobe -r loop; modprobe loop; cat /sys/module/loop/parameters/hw_queue_depth hw_queue_depth=0x10
16
# modprobe -r loop; modprobe loop; cat /sys/module/loop/parameters/hw_queue_depth hw_queue_depth=128
128
# modprobe -r loop; modprobe loop hw_queue_depth=129; cat /sys/module/loop/parameters/hw_queue_depth
129
# modprobe -r loop; modprobe loop hw_queue_depth=$((1<<32))
modprobe: ERROR: could not insert 'loop': Numerical result out of range
See also commit ef44c50837 ("loop: allow user to set the queue
depth").
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20230130211347.832110-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Owner of one unprivileged ublk device could be one evil user, which
can grant this disk's privilege to other users deliberately, and
this way could be like making one trap and waiting for other users
to be caught.
So only owner to open unprivileged disk even though the owner
grants disk privilege to other user. This way is reasonable too
given anyone can create ublk disk, and no need other's grant.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Fixes: 4093cb5a06 ("ublk_drv: add mechanism for supporting unprivileged ublk device")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230131040446.214583-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>