linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-01 09:34:10 -04:00

Author	SHA1	Message	Date
Christoph Hellwig	b38c8be255	loop: refactor queue limits updates Replace loop_reconfigure_limits with a slightly less encompassing loop_update_limits that expects the caller to acquire and commit the queue limits to prepare for sorting out the freeze vs limits lock ordering. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Link: https://lore.kernel.org/r/20250110054726.1499538-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:24 -07:00
Christoph Hellwig	1233751f7d	usb-storage: fix queue freeze vs limits lock order Match the locking order used by the core block code by only freezing the queue after taking the limits lock using the queue_limits_commit_update_frozen helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250110054726.1499538-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:24 -07:00
Christoph Hellwig	f3dec61d75	nbd: fix queue freeze vs limits lock order Match the locking order used by the core block code by only freezing the queue after taking the limits lock using the queue_limits_commit_update_frozen helper. This also allows removes the need for the separate __nbd_set_size helper, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250110054726.1499538-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:24 -07:00
Christoph Hellwig	473106dd3a	nvme: fix queue freeze vs limits lock order Match the locking order used by the core block code by only freezing the queue after taking the limits lock. Unlike most queue updates this does not use the queue_limits_commit_update_frozen helper as the nvme driver want the queue frozen for more than just the limits update. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250110054726.1499538-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:24 -07:00
Christoph Hellwig	c99f66e408	block: fix queue freeze vs limits lock order in sysfs store methods queue_attr_store() always freezes a device queue before calling the attribute store operation. For attributes that control queue limits, the store operation will also lock the queue limits with a call to queue_limits_start_update(). However, some drivers (e.g. SCSI sd) may need to issue commands to a device to obtain limit values from the hardware with the queue limits locked. This creates a potential ABBA deadlock situation if a user attempts to modify a limit (thus freezing the device queue) while the device driver starts a revalidation of the device queue limits. Avoid such deadlock by not freezing the queue before calling the ->store_limit() method in struct queue_sysfs_entry and instead use the queue_limits_commit_update_frozen helper to freeze the queue after taking the limits lock. This also removes taking the sysfs lock for the store_limit method as it doesn't protect anything here, but creates even more nesting. Hopefully it will go away from the actual sysfs methods entirely soon. (commit log adapted from a similar patch from Damien Le Moal) Fixes: `ff956a3be9` ("block: use queue_limits_commit_update in queue_discard_max_store") Fixes: `0327ca9d53` ("block: use queue_limits_commit_update in queue_max_sectors_store") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250110054726.1499538-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:24 -07:00
Christoph Hellwig	a16230649c	block: add a store_limit operations for sysfs entries De-duplicate the code for updating queue limits by adding a store_limit method that allows having common code handle the actual queue limits update. Note that this is a pure refactoring patch and does not address the existing freeze vs limits lock order problem in the refactored code, which will be addressed next. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250110054726.1499538-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:23 -07:00
Christoph Hellwig	d432c817c2	block: don't update BLK_FEAT_POLL in __blk_mq_update_nr_hw_queues When __blk_mq_update_nr_hw_queues changes the number of tag sets, it might have to disable poll queues. Currently it does so by adjusting the BLK_FEAT_POLL, which is a bit against the intent of features that describe hardware / driver capabilities, but more importantly causes nasty lock order problems with the broadly held freeze when updating the number of hardware queues and the limits lock. Fix this by leaving BLK_FEAT_POLL alone, and instead check for the number of poll queues in the bio submission and poll handlers. While this adds extra work to the fast path, the variables are in cache lines used by these operations anyway, so it should be cheap enough. Fixes: `8023e144f9` ("block: move the poll flag to queue_limits") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Link: https://lore.kernel.org/r/20250110054726.1499538-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:23 -07:00
Christoph Hellwig	958148a6ac	block: check BLK_FEAT_POLL under q_usage_count Otherwise feature reconfiguration can race with I/O submission. Also drop the bio_clear_polled in the error path, as the flag does not matter for instant error completions, it is a left over from when we allowed polled I/O to proceed unpolled in this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250110054726.1499538-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:23 -07:00
Christoph Hellwig	aa427d7b73	block: add a queue_limits_commit_update_frozen helper Add a helper that freezes the queue, updates the queue limits and unfreezes the queue and convert all open coded versions of that to the new helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20250110054726.1499538-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:23 -07:00
Christoph Hellwig	9c96821b44	block: fix docs for freezing of queue limits updates queue_limits_commit_update is the function that needs to operate on a frozen queue, not queue_limits_start_update. Update the kerneldoc comments to reflect that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250110054726.1499538-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 07:29:23 -07:00
Yu Kuai	844b8cdc68	nbd: don't allow reconnect after disconnect Following process can cause nbd_config UAF: 1) grab nbd_config temporarily; 2) nbd_genl_disconnect() flush all recv_work() and release the initial reference: nbd_genl_disconnect nbd_disconnect_and_put nbd_disconnect flush_workqueue(nbd->recv_workq) if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF, ...)) nbd_config_put -> due to step 1), reference is still not zero 3) nbd_genl_reconfigure() queue recv_work() again; nbd_genl_reconfigure config = nbd_get_config_unlocked(nbd) if (!config) -> succeed if (!test_bit(NBD_RT_BOUND, ...)) -> succeed nbd_reconnect_socket queue_work(nbd->recv_workq, &args->work) 4) step 1) release the reference; 5) Finially, recv_work() will trigger UAF: recv_work nbd_config_put(nbd) -> nbd_config is freed atomic_dec(&config->recv_threads) -> UAF Fix the problem by clearing NBD_RT_BOUND in nbd_genl_disconnect(), so that nbd_genl_reconfigure() will fail. Fixes: `b7aa3d3938` ("nbd: add a reconfigure netlink command") Reported-by: syzbot+6b0df248918b92c33e6a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/675bfb65.050a0220.1a2d0d.0006.GAE@google.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250103092859.3574648-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-06 07:38:20 -07:00
Christoph Hellwig	ce32496ec1	block: simplify tag allocation policy selection Use a plain BLK_MQ_F_* flag to select the round robin tag selection instead of overlaying an enum with just two possible values into the flags space. Doing so allows adding a BLK_MQ_F_MAX sentinel for simplified overflow checking in the messy debugfs helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250106083531.799976-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-06 07:37:41 -07:00
Christoph Hellwig	e7602bb4f3	block: remove BLK_MQ_F_NO_SCHED The only queues that really can't support a scheduler are those that do not have a gendisk associated with them, and thus can't be used for non-passthrough commands. In addition to those null_blk can optionally set the flag, which is a bad odd. Replace the null_blk usage with BLK_MQ_F_NO_SCHED_BY_DEFAULT to keep the expected semantics and then remove BLK_MQ_F_NO_SCHED as the non-disk queues never call into elevator_init_mq or blk_register_queue which adds the sysfs attributes. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250106083531.799976-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-06 07:37:41 -07:00
Christoph Hellwig	68ed451222	block: remove blk_mq_init_bitmaps The little work done in blk_mq_init_bitmaps is easier done in the only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250106083531.799976-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-06 07:37:41 -07:00
Christoph Hellwig	6783811569	block: better split mq vs non-mq code in add_disk_fwnode Add a big conditional for blk-mq vs not mq at the beginning of add_disk_fwnode so that elevator_init_mq is only called for blk-mq disks, and add checks that the right methods or set or not set based on the queue type. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250106083531.799976-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-06 07:37:41 -07:00
Christoph Hellwig	b7175e24d6	block: add a dma mapping iterator blk_rq_map_sg is maze of nested loops. Untangle it by creating an iterator that returns [paddr,len] tuples for DMA mapping, and then implement the DMA logic on top of this. This not only removes code at the source level, but also generates nicer binary code: $ size block/blk-merge.o.* text data bss dec hex filename 10001 432 0 10433 28c1 block/blk-merge.o.new 10317 468 0 10785 2a21 block/blk-merge.o.old Last but not least it will be used as a building block for a new DMA mapping helper that doesn't rely on struct scatterlist. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250106081609.798289-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-06 07:37:11 -07:00
Christoph Hellwig	2caca8fc7a	block: use page_to_phys in bvec_phys Use page_to_phys instead of open coding it now that it is available in an architecture independent way. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250106081437.798213-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-06 07:36:39 -07:00
Christoph Hellwig	02ee5d69e3	block: remove blk_rq_bio_prep There is not real point in a helper just to assign three values to four fields, especially when the surrounding code is working on the neighbor fields directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20250103073417.459715-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-04 15:27:35 -07:00
Christoph Hellwig	6aeb4f8364	block: remove bio_add_pc_page Lift bio_split_rw_at into blk_rq_append_bio so that it validates the hardware limits. With this all passthrough callers can simply add bio_add_page to build the bio and delay checking for exceeding of limits to this point instead of doing it for each page. While this looks like adding a new expensive loop over all bio_vecs, blk_rq_append_bio is already doing that just to counter the number of segments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20250103073417.459715-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-04 15:27:35 -07:00
Geert Uytterhoeven	c2398e6d5f	ps3disk: Do not use dev->bounce_size before it is set dev->bounce_size is only initialized after it is used to set the queue limits. Fix this by using BOUNCE_SIZE instead. Fixes: `a7f18b74db` ("ps3disk: pass queue_limits to blk_mq_alloc_disk") Reported-by: Philipp Hortmann <philipp.g.hortmann@gmail.com> Closes: https://lore.kernel.org/39256db9-3d73-4e86-a49b-300dfd670212@gmail.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/06988f959ea6885b8bd7fb3b9059dd54bc6bbad7.1735894216.git.geert+renesas@glider.be Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-03 11:44:25 -07:00
Yang Erkun	457ef47c08	block: retry call probe after request_module in blk_request_module Set kernel config: CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_LOOP_MIN_COUNT=0 Do latter: mknod loop0 b 7 0 exec 4<> loop0 Before commit `e418de3abc` ("block: switch gendisk lookup to a simple xarray"), lookup_gendisk will first use base_probe to load module loop, and then the retry will call loop_probe to prepare the loop disk. Finally open for this disk will success. However, after this commit, we lose the retry logic, and open will fail with ENXIO. Block device autoloading is deprecated and will be removed soon, but maybe we should keep open success until we really remove it. So, give a retry to fix it. Fixes: `e418de3abc` ("block: switch gendisk lookup to a simple xarray") Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yang Erkun <yangerkun@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241209110435.3670985-1-yangerkun@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-03 11:43:50 -07:00
Thomas Weißschuh	00aab2f236	kyber: constify sysfs attributes The elevator core now allows instances of 'struct elv_fs_entry' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250102-sysfs-const-attr-elevator-v1-4-9837d2058c60@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-02 13:20:29 -07:00
Thomas Weißschuh	c40f9f6ac5	block, bfq: constify sysfs attributes The elevator core now allows instances of 'struct elv_fs_entry' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250102-sysfs-const-attr-elevator-v1-3-9837d2058c60@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-02 13:20:29 -07:00
Thomas Weißschuh	8686e1deda	block: mq-deadline: Constify sysfs attributes The elevator core now allows instances of 'struct elv_fs_entry' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250102-sysfs-const-attr-elevator-v1-2-9837d2058c60@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-02 13:20:29 -07:00
Thomas Weißschuh	044792cda0	elevator: Enable const sysfs attributes The elevator core does not need to modify the sysfs attributes added by the elevators. Reflect this in the types, so the attributes can be moved into read-only memory. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250102-sysfs-const-attr-elevator-v1-1-9837d2058c60@weissschuh.net Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-02 13:20:29 -07:00
Bart Van Assche	cb01ecb799	blk-zoned: Split queue_zone_wplugs_show() Reduce the indentation level of the code in queue_zone_wplugs_show() by moving the body of the loop in that function into a new function. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241217210310.645966-5-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Bart Van Assche	fa8555630b	blk-zoned: Improve the queue reference count strategy documentation For the blk_queue_exit() calls, document where the corresponding code can be found that increases q->q_usage_counter. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241217210310.645966-4-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Bart Van Assche	cbac56e523	blk-zoned: Document locking assumptions Document which functions expect that their callers must hold a lock. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241217210310.645966-3-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Bart Van Assche	48ea518d00	blk-zoned: Minimize #include directives Only include those header files that are necessary. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241217210310.645966-2-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Andreas Hindborg	31d813a3b8	rust: block: fix use of BLK_MQ_F_SHOULD_MERGE BLK_MQ_F_SHOULD_MERGE has was removed [1] and is now in effect by default. So remove the flag from tag sets of Rust block device drivers. Link: https://lore.kernel.org/r/20241219060214.1928848-1-hch@lst.de [1] Fixes: 9377b95cda73 ("block: remove BLK_MQ_F_SHOULD_MERGE") Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://lore.kernel.org/r/20241220-merge-flag-fix-v1-1-41b7778dac06@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Christoph Hellwig	cc76ace465	block: remove BLK_MQ_F_SHOULD_MERGE BLK_MQ_F_SHOULD_MERGE is set for all tag_sets except those that purely process passthrough commands (bsg-lib, ufs tmf, various nvme admin queues) and thus don't even check the flag. Remove it to simplify the driver interface. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241219060214.1928848-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	9bc1e897a8	blk-mq: remove unused queue mapping helpers There are no users left of the pci and virtio queue mapping helpers. Thus remove them. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-8-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	a5665c3d15	virtio: blk/scsi: replace blk_mq_virtio_map_queues with blk_mq_map_hw_queues Replace all users of blk_mq_virtio_map_queues with the more generic blk_mq_map_hw_queues. This in preparation to retire blk_mq_virtio_map_queues. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-7-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	4425f6492a	nvme: replace blk_mq_pci_map_queues with blk_mq_map_hw_queues Replace all users of blk_mq_pci_map_queues with the more generic blk_mq_map_hw_queues. This in preparation to retire blk_mq_pci_map_queues. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-6-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	bd326a5ad6	scsi: replace blk_mq_pci_map_queues with blk_mq_map_hw_queues Replace all users of blk_mq_pci_map_queues with the more generic blk_mq_map_hw_queues. This in preparation to retire blk_mq_pci_map_queues. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-5-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	1452e9b470	blk-mq: introduce blk_mq_map_hw_queues blk_mq_pci_map_queues and blk_mq_virtio_map_queues will create a CPU to hardware queue mapping based on affinity information. These two function share common code and only differ on how the affinity information is retrieved. Also, those functions are located in the block subsystem where it doesn't really fit in. They are virtio and pci subsystem specific. Thus introduce provide a generic mapping function which uses the irq_get_affinity callback from bus_type. Originally idea from Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-4-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	c7f63c5d13	virtio: hookup irq_get_affinity callback struct bus_type has a new callback for retrieving the IRQ affinity for a device. Hook this callback up for virtio based devices. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-3-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	22d813bf00	PCI: hookup irq_get_affinity callback struct bus_type has a new callback for retrieving the IRQ affinity for a device. Hook this callback up for PCI based devices. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-2-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Daniel Wagner	fea4952df0	driver core: bus: add irq_get_affinity callback to bus_type Introducing a callback in struct bus_type so that a subsystem can hook up the getters directly. This approach avoids exposing random getters in any subsystems APIs. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-1-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Matthew Wilcox (Oracle)	0e20669a91	null_blk: Remove accesses to page->index Use page->private to store the index instead of page->index. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20241216160849.31739-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Benoît du Garreau	53328a3671	block: rnull: Initialize the module in place Using `InPlaceModule` avoids an allocation and an indirection. Signed-off-by: Benoît du Garreau <benoit@dugarreau.fr> Acked-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://lore.kernel.org/r/20241204-rnull_in_place-v1-1-efe3eafac9fb@dugarreau.fr Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
Colin Ian King	ccb9868ab7	blktrace: remove redundant return at end of function A recent change added return 0 before an existing return statement at the end of function blk_trace_setup. The final return is now redundant, so remove it. Fixes: 64d124798244 ("blktrace: move copy_[to\|from]_user() out of ->debugfs_lock") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20241204150450.399005-1-colin.i.king@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
John Garry	19206d3f5e	block: Delete bio_set_prio() Since commit `43b62ce3ff` ("block: move bio io prio to a new field"), macro bio_set_prio() does nothing but set bio->bi_ioprio. All other places just set bio->bi_ioprio directly, so replace bio_set_prio() remaining callsites with setting bio->bi_ioprio directly and delete that macro. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20241202111957.2311683-3-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:23 -07:00
John Garry	5c292ac6e6	block: Delete bio_prio() Since commit `43b62ce3ff` ("block: move bio io prio to a new field"), macro bio_prio() does nothing but return the value in bio->bi_ioprio. Most other places just read bio->bi_ioprio directly, so replace bi_ioprio() callsites with reading bio->bi_ioprio directly and delete that macro. Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20241202111957.2311683-2-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:22 -07:00
Ming Lei	b769a2f409	blktrace: move copy_[to\|from]_user() out of ->debugfs_lock Move copy_[to\|from]_user() out of ->debugfs_lock and cut the dependency between mm->mmap_lock and q->debugfs_lock, then we avoids lots of lockdep false positive warning. Obviously ->debug_lock isn't needed for copy_[to\|from]_user(). The only behavior change is to call blk_trace_remove() in case of setup failure handling by re-grabbing ->debugfs_lock, and this way is just fine since we do cover concurrent setup() & remove(). Reported-by: syzbot+91585b36b538053343e4@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-block/67450fd4.050a0220.1286eb.0007.GAE@google.com/ Closes: https://lore.kernel.org/linux-block/6742e584.050a0220.1cc393.0038.GAE@google.com/ Closes: https://lore.kernel.org/linux-block/6742a600.050a0220.1cc393.002e.GAE@google.com/ Closes: https://lore.kernel.org/linux-block/67420102.050a0220.1cc393.0019.GAE@google.com/ Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241128125029.4152292-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:22 -07:00
Ming Lei	fd9b0244f5	blktrace: don't centralize grabbing q->debugfs_mutex in blk_trace_ioctl Call each handler directly and the handler do grab q->debugfs_mutex, prepare for killing dependency between ->debug_mutex and ->mmap_lock. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241128125029.4152292-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:22 -07:00
Damien Le Moal	b56426bcf8	null_blk: Add rotational feature support To facilitate testing of kernel functions related to the rotational feature (BLK_FEAT_ROTATIONAL) of a block device (e.g. NVMe rotational bit support), add the rotational boolean configfs attribute and module parameter to the null_blk driver. If set, a null block device will report being a rotational device through it queue limits features with the BLK_FEAT_ROTATIONAL flag. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20241126000956.95983-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:22 -07:00
Ming Lei	f6661b1d05	block: track queue dying state automatically for modeling queue freeze lockdep Now we only verify the outmost freeze & unfreeze in current context in case that !q->mq_freeze_depth, so it is reliable to save queue lying state when we want to lock the freeze queue since the state is one per-task variable now. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241127135133.3952153-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:22 -07:00
Ming Lei	b9d4eee7e0	block: don't verify queue freeze manually in elevator_init_mq() Now blk_freeze_queue_start() can track disk state automatically, and it isn't necessary to verify queue freeze manually in elevator_init_mq() any more. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241127135133.3952153-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:22 -07:00
Ming Lei	6f491a8d4b	block: track disk DEAD state automatically for modeling queue freeze lockdep Now we only verify the outmost freeze & unfreeze in current context in case that !q->mq_freeze_depth, so it is reliable to save disk DEAD state when we want to lock the freeze queue since the state is one per-task variable now. Doing this way can kill lots of false positive when freeze queue is called before adding disk[1]. [1] https://lore.kernel.org/linux-block/6741f6b2.050a0220.1cc393.0017.GAE@google.com/ Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241127135133.3952153-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-23 08:17:22 -07:00

1 2 3 4 5 ...

1324583 Commits