linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-30 22:50:54 -04:00

Author	SHA1	Message	Date
Yu Kuai	eb5bca7365	block, bfq: cleanup __bfq_weights_tree_remove() It's the same with bfq_weights_tree_remove() now. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-7-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-11-01 07:09:44 -06:00
Yu Kuai	afdba14612	block, bfq: cleanup bfq_weights_tree add/remove apis The 'bfq_data' and 'rb_root_cached' can both be accessed through 'bfq_queue', thus only pass 'bfq_queue' as parameter. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-6-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-11-01 07:09:44 -06:00
Yu Kuai	eed3ecc991	block, bfq: do not idle if only one group is activated Now that root group is counted into 'num_groups_with_pending_reqs', 'num_groups_with_pending_reqs > 0' is always true in bfq_asymmetric_scenario(). Thus change the condition to '> 1'. On the other hand, this change can enable concurrent sync io if only one group is activated. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-5-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-11-01 07:09:44 -06:00
Yu Kuai	71f8ca77cb	block, bfq: refactor the counting of 'num_groups_with_pending_reqs' Currently, bfq can't handle sync io concurrently as long as they are not issued from root group. This is because 'bfqd->num_groups_with_pending_reqs > 0' is always true in bfq_asymmetric_scenario(). The way that bfqg is counted into 'num_groups_with_pending_reqs': Before this patch: 1) root group will never be counted. 2) Count if bfqg or it's child bfqgs have pending requests. 3) Don't count if bfqg and it's child bfqgs complete all the requests. After this patch: 1) root group is counted. 2) Count if bfqg have pending requests. 3) Don't count if bfqg complete all the requests. With this change, the occasion that only one group is activated can be detected, and next patch will support concurrent sync io in the occasion. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-4-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-11-01 07:09:44 -06:00
Yu Kuai	60a6e10c53	block, bfq: record how many queues have pending requests Prepare to refactor the counting of 'num_groups_with_pending_reqs'. Add a counter in bfq_group, update it while tracking if bfqq have pending requests and when bfq_bfqq_move() is called. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-11-01 07:09:44 -06:00
Yu Kuai	3d89bd12d3	block, bfq: support to track if bfqq has pending requests If entity belongs to bfqq, then entity->in_groups_with_pending_reqs is not used currently. This patch use it to track if bfqq has pending requests through callers of weights_tree insertion and removal. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-11-01 07:09:44 -06:00
Jinlong Chen	56c1ee9224	blk-mq: remove redundant call to blk_freeze_queue_start in blk_mq_destroy_queue The calling relationship in blk_mq_destroy_queue() is as follows: blk_mq_destroy_queue() ... -> blk_queue_start_drain() -> blk_freeze_queue_start() <- called ... -> blk_freeze_queue() -> blk_freeze_queue_start() <- called again -> blk_mq_freeze_queue_wait() ... So there is a redundant call to blk_freeze_queue_start(). Replace blk_freeze_queue() with blk_mq_freeze_queue_wait() to avoid the redundant call. Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221030083212.1251255-1-nickyc975@zju.edu.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-31 07:30:42 -06:00
Jinlong Chen	219cf43c55	blk-mq: move queue_is_mq out of blk_mq_cancel_work_sync The only caller that needs queue_is_mq check is del_gendisk, so move the check into it. Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221030094730.1275463-1-nickyc975@zju.edu.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-31 07:27:54 -06:00
Dawei Li	adff215830	block: simplify blksize_bits() implementation Convert current looping-based implementation into bit operation, which can bring improvement for: 1) bitops is more efficient for its arch-level optimization. 2) Given that blksize_bits() is inline, _if_ @size is compile-time constant, it's possible that order_base_2() _may_ make output compile-time evaluated, depending on code context and compiler behavior. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/TYCP286MB23238842958D7C083D6B67CECA349@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-31 07:26:27 -06:00
David Jeffery	82c229476b	blk-mq: avoid double ->queue_rq() because of early timeout David Jeffery found one double ->queue_rq() issue, so far it can be triggered in VM use case because of long vmexit latency or preempt latency of vCPU pthread or long page fault in vCPU pthread, then block IO req could be timed out before queuing the request to hardware but after calling blk_mq_start_request() during ->queue_rq(), then timeout handler may handle it by requeue, then double ->queue_rq() is caused, and kernel panic. So far, it is driver's responsibility to cover the race between timeout and completion, so it seems supposed to be solved in driver in theory, given driver has enough knowledge. But it is really one common problem, lots of driver could have similar issue, and could be hard to fix all affected drivers, even it isn't easy for driver to handle the race. So David suggests this patch by draining in-progress ->queue_rq() for solving this issue. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Keith Busch <kbusch@kernel.org> Cc: virtualization@lists.linux-foundation.org Cc: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221026051957.358818-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-31 07:24:22 -06:00
Bart Van Assche	9546531884	block: Micro-optimize get_max_segment_size() This patch removes a conditional jump from get_max_segment_size(). The x86-64 assembler code for this function without this patch is as follows: 206 return min_not_zero(mask - offset + 1, 0x0000000000000118 <+72>: not %rax 0x000000000000011b <+75>: and 0x8(%r10),%rax 0x000000000000011f <+79>: add $0x1,%rax 0x0000000000000123 <+83>: je 0x138 <bvec_split_segs+104> 0x0000000000000125 <+85>: cmp %rdx,%rax 0x0000000000000128 <+88>: mov %rdx,%r12 0x000000000000012b <+91>: cmovbe %rax,%r12 0x000000000000012f <+95>: test %rdx,%rdx 0x0000000000000132 <+98>: mov %eax,%edx 0x0000000000000134 <+100>: cmovne %r12d,%edx With this patch applied: 206 return min(mask - offset, (unsigned long)lim->max_segment_size - 1) + 1; 0x000000000000003f <+63>: mov 0x28(%rdi),%ebp 0x0000000000000042 <+66>: not %rax 0x0000000000000045 <+69>: and 0x8(%rdi),%rax 0x0000000000000049 <+73>: sub $0x1,%rbp 0x000000000000004d <+77>: cmp %rbp,%rax 0x0000000000000050 <+80>: cmova %rbp,%rax 0x0000000000000054 <+84>: add $0x1,%eax Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <kbusch@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221025191755.1711437-4-bvanassche@acm.org Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 13:41:20 -06:00
Bart Van Assche	aa261f2058	block: Constify most queue limits pointers Document which functions do not modify the queue limits. Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221025191755.1711437-3-bvanassche@acm.org Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 13:41:17 -06:00
Bart Van Assche	b179c98f76	block: Remove request.write_hint Commit `c75e707fe1` ("block: remove the per-bio/request write hint") removed all code that uses the struct request write_hint member. Hence also remove 'write_hint' itself. Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221025191755.1711437-2-bvanassche@acm.org Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 13:41:12 -06:00
Christoph Hellwig	a55b70f127	block: remove bio_start_io_acct_time bio_start_io_acct_time is not actually used anywhere, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221025155916.270303-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 12:08:31 -06:00
Christoph Hellwig	941f7298c7	nvme-apple: remove an extra queue reference Now that blk_mq_destroy_queue does not release the queue reference, there is no need for a second admin queue reference to be held by the apple_nvme structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Sven Peter <sven@svenpeter.dev> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221018135720.670094-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 08:25:35 -06:00
Christoph Hellwig	7dcebef90d	nvme-pci: remove an extra queue reference Now that blk_mq_destroy_queue does not release the queue reference, there is no need for a second admin queue reference to be held by the nvme_dev. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221018135720.670094-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 08:25:35 -06:00
Christoph Hellwig	dc917c3614	scsi: remove an extra queue reference Now that blk_mq_destroy_queue does not release the queue reference, there is no need for a second queue reference to be held by the scsi_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221018135720.670094-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 08:25:35 -06:00
Christoph Hellwig	2b3f056f72	blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queue The fact that blk_mq_destroy_queue also drops a queue reference leads to various places having to grab an extra reference. Move the call to blk_put_queue into the callers to allow removing the extra references. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221018135720.670094-2-hch@lst.de [axboe: fix fabrics_q vs admin_q conflict in nvme core.c] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-25 08:25:10 -06:00
Jinlong Chen	8ed40ee35d	block: fix up elevator_type refcounting The current reference management logic of io scheduler modules contains refcnt problems. For example, blk_mq_init_sched may fail before or after the calling of e->ops.init_sched. If it fails before the calling, it does nothing to the reference to the io scheduler module. But if it fails after the calling, it releases the reference by calling kobject_put(&eq->kobj). As the callers of blk_mq_init_sched can't know exactly where the failure happens, they can't handle the reference to the io scheduler module properly: releasing the reference on failure results in double-release if blk_mq_init_sched has released it, and not releasing the reference results in ghost reference if blk_mq_init_sched did not release it either. The same problem also exists in io schedulers' init_sched implementations. We can address the problem by adding releasing statements to the error handling procedures of blk_mq_init_sched and init_sched implementations. But that is counterintuitive and requires modifications to existing io schedulers. Instead, We make elevator_alloc get the io scheduler module references that will be released by elevator_release. And then, we match each elevator_get with an elevator_put. Therefore, each reference to an io scheduler module explicitly has its own getter and releaser, and we no longer need to worry about the refcnt problems. The bugs and the patch can be validated with tools here: https://github.com/nickyc975/linux_elv_refcnt_bug.git [hch: split out a few bits into separate patches, use a non-try module_get in elevator_alloc] Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Jinlong Chen	b54c2ad9b7	block: check for an unchanged elevator earlier in __elevator_change No need to find the actual elevator_type struct for this comparism, the name is all that is needed. Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Christoph Hellwig	58367c8a5f	block: sanitize the elevator name before passing it to __elevator_change The stripped name should also be used for the none check. To do so strip it in the caller and pass in the sanitized name. Drop the pointless __ prefix in the function name while we're at it. Based on a patch from Jinlong Chen <nickyc975@zju.edu.cn>. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Christoph Hellwig	dd6f7f17bf	block: add proper helpers for elevator_type module refcount management Make sure we have helpers for all relevant module refcount operations on the elevator_type in elevator.h, and use them. Move the call to the get helper in blk_mq_elv_switch_none a bit so that it is obvious with a less verbose comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	671fae5e51	blk-wbt: don't enable throttling if default elevator is bfq Commit `b5dc5d4d1f` ("block,bfq: Disable writeback throttling") tries to disable wbt for bfq, it's done by calling wbt_disable_default() in bfq_init_queue(). However, wbt is still enabled if default elevator is bfq: device_add_disk elevator_init_mq bfq_init_queue wbt_disable_default -> done nothing blk_register_queue wbt_enable_default -> wbt is enabled Fix the problem by adding a new flag ELEVATOR_FLAG_DISBALE_WBT, bfq will set the flag in bfq_init_queue, and following wbt_enable_default() won't enable wbt while the flag is set. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-7-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	181d066374	elevator: add new field flags in struct elevator_queue There are only one flag to indicate that elevator is registered currently, prepare to add a flag to disable wbt if default elevator is bfq. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-6-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	3642ef4d95	blk-wbt: don't show valid wbt_lat_usec in sysfs while wbt is disabled Currently, if wbt is initialized and then disabled by wbt_disable_default(), sysfs will still show valid wbt_lat_usec, which will confuse users that wbt is still enabled. This patch shows wbt_lat_usec as zero if it's disabled. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reported-and-tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-5-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	a9a236d238	blk-wbt: make enable_state more accurate Currently, if user disable wbt through sysfs, 'enable_state' will be 'WBT_STATE_ON_MANUAL', which will be confusing. Add a new state 'WBT_STATE_OFF_MANUAL' to cover that case. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-4-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	b11d31ae01	blk-wbt: remove unnecessary check in wbt_enable_default() If CONFIG_BLK_WBT_MQ is disabled, wbt_init() won't do anything. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	6d9f4cf125	elevator: remove redundant code in elv_unregister_queue() "elevator_queue *e" is already declared and initialized in the beginning of elv_unregister_queue(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20221019121518.3865235-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	074501bce3	blk-iocost: read 'ioc->params' inside 'ioc->lock' in ioc_timer_fn() 'ioc->params' is updated in ioc_refresh_params(), which is proteced by 'ioc->lock', however, ioc_timer_fn() read params outside the lock. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-5-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:17 -06:00
Yu Kuai	2b2da2f6dc	blk-iocost: prevent configuration update concurrent with io throttling This won't cause any severe problem currently, however, this doesn't seems appropriate: 1) 'ioc->params' is read from multiple places without holding 'ioc->lock', unexpected value might be read if writing it concurrently. 2) If configuration is changed while io is throttling, the functionality might be affected. For example, if module params is updated and cost becomes smaller, waiting for timer that is caculated under old configuration is not appropriate. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-4-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:16 -06:00
Yu Kuai	2c06479884	blk-iocost: don't release 'ioc->lock' while updating params ioc_qos_write() and ioc_cost_model_write() are the same: 1) hold lock to read 'ioc->params' to local variable; 2) update params to local variable without lock; 3) hold lock to write local variable to 'ioc->params'; In theroy, if user updates params concurrenty, the params might be lost: t1: update params a t2: update params b spin_lock_irq(&ioc->lock); memcpy(qos, ioc->params.qos, sizeof(qos)) spin_unlock_irq(&ioc->lock); qos[a] = xxx; spin_lock_irq(&ioc->lock); memcpy(qos, ioc->params.qos, sizeof(qos)) spin_unlock_irq(&ioc->lock); qos[b] = xxx; spin_lock_irq(&ioc->lock); memcpy(ioc->params.qos, qos, sizeof(qos)); ioc_refresh_params(ioc, true); spin_unlock_irq(&ioc->lock); spin_lock_irq(&ioc->lock); // updates of a will be lost memcpy(ioc->params.qos, qos, sizeof(qos)); ioc_refresh_params(ioc, true); spin_unlock_irq(&ioc->lock); Althrough this is not common case, the problem can by fixed easily by holding the lock through the read, update, write process. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:16 -06:00
Yu Kuai	8796acbc9a	blk-iocost: disable writeback throttling Commit `b5dc5d4d1f` ("block,bfq: Disable writeback throttling") disable wbt for bfq, because different write-throttling heuristics should not work together. For the same reason, wbt and iocost should not work together as well, unless admin really want to do that, dispite that performance is affected. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-23 18:59:16 -06:00
Linus Torvalds	247f34f7b8	Linux 6.1-rc2 v6.1-rc2	2022-10-23 15:27:33 -07:00
Linus Torvalds	05b4ebd2c7	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "RISC-V: - Fix compilation without RISCV_ISA_ZICBOM - Fix kvm_riscv_vcpu_timer_pending() for Sstc ARM: - Fix a bug preventing restoring an ITS containing mappings for very large and very sparse device topology - Work around a relocation handling error when compiling the nVHE object with profile optimisation - Fix for stage-2 invalidation holding the VM MMU lock for too long by limiting the walk to the largest block mapping size - Enable stack protection and branch profiling for VHE - Two selftest fixes x86: - add compat implementation for KVM_X86_SET_MSR_FILTER ioctl selftests: - synchronize includes between include/uapi and tools/include/uapi" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: tools: include: sync include/api/linux/kvm.h KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter() kvm: Add support for arch compat vm ioctls RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc RISC-V: Fix compilation without RISCV_ISA_ZICBOM KVM: arm64: vgic: Fix exit condition in scan_its_table() KVM: arm64: nvhe: Fix build with profile optimization KVM: selftests: Fix number of pages for memory slot in memslot_modification_stress_test KVM: arm64: selftests: Fix multiple versions of GIC creation KVM: arm64: Enable stack protection and branch profiling for VHE KVM: arm64: Limit stage2_apply_range() batch size to largest block KVM: arm64: Work out supported block level at compile time	2022-10-23 15:00:43 -07:00
Jason A. Donenfeld	ca4582c286	Revert "mfd: syscon: Remove repetition of the regmap_get_val_endian()" This reverts commit `72a9585972`. It broke reboots on big-endian MIPS and MIPS64 malta QEMU instances, which use the syscon driver. Little-endian is not effected, which means likely it's important to handle regmap_get_val_endian() in this function after all. Fixes: `72a9585972` ("mfd: syscon: Remove repetition of the regmap_get_val_endian()") Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Lee Jones <lee@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-10-23 12:04:56 -07:00
Linus Torvalds	52826d3b2d	kernel/utsname_sysctl.c: Fix hostname polling Commit `bfca3dd3d0` ("kernel/utsname_sysctl.c: print kernel arch") added a new entry to the uts_kern_table[] array, but didn't update the UTS_PROC_xyz enumerators of older entries, breaking anything that used them. Which is admittedly not many cases: it's really just the two uses of uts_proc_notify() in kernel/sys.c. But apparently journald-systemd actually uses this to detect hostname changes. Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Fixes: `bfca3dd3d0` ("kernel/utsname_sysctl.c: print kernel arch") Link: https://lore.kernel.org/lkml/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Link: https://linux-regtracking.leemhuis.info/regzbot/regression/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Cc: Petr Vorel <pvorel@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-10-23 12:01:01 -07:00
Linus Torvalds	a703852408	Merge tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Fix raw data handling when perf events are used in bpf - Rework how SIGTRAPs get delivered to events to address a bunch of problems with it. Add a selftest for that too * tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: bpf: Fix sample_flags for bpf_perf_event_output selftests/perf_events: Add a SIGTRAP stress test with disables perf: Fix missing SIGTRAPs	2022-10-23 10:14:45 -07:00
Linus Torvalds	c70055d8d9	Merge tag 'sched_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Adjust code to not trip up CFI - Fix sched group cookie matching * tag 'sched_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Introduce struct balance_callback to avoid CFI mismatches sched/core: Fix comparison in sched_group_cookie_match()	2022-10-23 10:10:55 -07:00
Linus Torvalds	6204a81aa3	Merge tag 'objtool_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Borislav Petkov: - Fix ORC stack unwinding when GCOV is enabled * tag 'objtool_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/unwind/orc: Fix unreliable stack dump with gcov	2022-10-23 10:07:01 -07:00
Linus Torvalds	295dad10bf	Merge tag 'x86_urgent_for_v6.0_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "As usually the case, right after a major release, the tip urgent branches accumulate a couple more fixes than normal. And here is the x86, a bit bigger, urgent pile. - Use the correct CPU capability clearing function on the error path in Intel perf LBR - A CFI fix to ftrace along with a simplification - Adjust handling of zero capacity bit mask for resctrl cache allocation on AMD - A fix to the AMD microcode loader to attempt patch application on every logical thread - A couple of topology fixes to handle CPUID leaf 0x1f enumeration info properly - Drop a -mabi=ms compiler option check as both compilers support it now anyway - A couple of fixes to how the initial, statically allocated FPU buffer state is setup and its interaction with dynamic states at runtime" * tag 'x86_urgent_for_v6.0_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly perf/x86/intel/lbr: Use setup_clear_cpu_cap() instead of clear_cpu_cap() ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph() x86/ftrace: Remove ftrace_epilogue() x86/resctrl: Fix min_cbm_bits for AMD x86/microcode/AMD: Apply the patch early on every logical thread x86/topology: Fix duplicated core ID within a package x86/topology: Fix multiple packages shown on a single-package system hwmon/coretemp: Handle large core ID value x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB x86/fpu: Exclude dynamic states from init_fpstate x86/fpu: Fix the init_fpstate size check with the actual size x86/fpu: Configure init_fpstate attributes orderly	2022-10-23 10:01:34 -07:00
Linus Torvalds	942e01ab90	Merge tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux Pull io_uring follow-up from Jens Axboe: "Currently the zero-copy has automatic fallback to normal transmit, and it was decided that it'd be cleaner to return an error instead if the socket type doesn't support it. Zero-copy does work with UDP and TCP, it's more of a future proofing kind of thing (eg for samba)" * tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux: io_uring/net: fail zc sendmsg when unsupported by socket io_uring/net: fail zc send when unsupported by socket net: flag sockets supporting msghdr originated zerocopy	2022-10-23 09:55:50 -07:00
Linus Torvalds	d47136c280	Merge tag 'hwmon-for-v6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - corsair-psu: Fix typo in USB id description, and add USB ID for new PSU - pwm-fan: Fix fan power handling when disabling fan control * tag 'hwmon-for-v6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (corsair-psu) Add USB id of the new HX1500i psu hwmon: (pwm-fan) Explicitly switch off fan power when setting pwm1_enable to 0 hwmon: (corsair-psu) fix typo in USB id description	2022-10-22 16:04:34 -07:00
Linus Torvalds	cda5d92054	Merge tag 'i2c-for-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "RPM fix for qcom-cci, platform module alias for xiic, build warning fix for mlxbf, typo fixes in comments" * tag 'i2c-for-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mlxbf: depend on ACPI; clean away ifdeffage i2c: fix spelling typos in comments i2c: qcom-cci: Fix ordering of pm_runtime_xx and i2c_add_adapter i2c: xiic: Add platform module alias	2022-10-22 15:59:46 -07:00
Linus Torvalds	fd79882ff2	Merge tag 'pci-v6.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Revert a simplification that broke pci-tegra due to a masking error - Update MAINTAINERS for Kishon's email address change and TI DRA7XX/J721E maintainer change * tag 'pci-v6.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: MAINTAINERS: Update Kishon's email address in PCI endpoint subsystem MAINTAINERS: Add Vignesh Raghavendra as maintainer of TI DRA7XX/J721E PCI driver Revert "PCI: tegra: Use PCI_CONF1_EXT_ADDRESS() macro"	2022-10-22 15:52:36 -07:00
Linus Torvalds	3272eb1ace	Merge tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull missed media updates from Mauro Carvalho Chehab: "It seems I screwed-up my previous pull request: it ends up that only half of the media patches that were in linux-next got merged in -rc1. The script which creates the signed tags silently failed due to 5.19->6.0 so it ended generating a tag with incomplete stuff. So here are the missing parts: - a DVB core security fix - lots of fixes and cleanups for atomisp staging driver - old drivers that are VB1 are being moved to staging to be deprecated - several driver updates - mostly for embedded systems, but there are also some things addressing issues with some PC webcams, in the UVC video driver" * tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (163 commits) media: sun6i-csi: Move csi buffer definition to main header file media: sun6i-csi: Introduce and use video helper functions media: sun6i-csi: Add media ops with link notify callback media: sun6i-csi: Remove controls handler from the driver media: sun6i-csi: Register the media device after creation media: sun6i-csi: Pass and store csi device directly in video code media: sun6i-csi: Tidy up video code media: sun6i-csi: Tidy up v4l2 code media: sun6i-csi: Tidy up Kconfig media: sun6i-csi: Use runtime pm for clocks and reset media: sun6i-csi: Define and use variant to get module clock rate media: sun6i-csi: Always set exclusive module clock rate media: sun6i-csi: Tidy up platform code media: sun6i-csi: Refactor main driver data structures media: sun6i-csi: Define and use driver name and (reworked) description media: cedrus: Add a Kconfig dependency on RESET_CONTROLLER media: sun8i-rotate: Add a Kconfig dependency on RESET_CONTROLLER media: sun8i-di: Add a Kconfig dependency on RESET_CONTROLLER media: sun4i-csi: Add a Kconfig dependency on RESET_CONTROLLER media: sun6i-csi: Add a Kconfig dependency on RESET_CONTROLLER ...	2022-10-22 15:30:15 -07:00
Pavel Begunkov	cc767e7c69	io_uring/net: fail zc sendmsg when unsupported by socket The previous patch fails zerocopy send requests for protocols that don't support it, do the same for zerocopy sendmsg. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0854e7bb4c3d810a48ec8b5853e2f61af36a0467.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-22 08:43:03 -06:00
Pavel Begunkov	edf8143879	io_uring/net: fail zc send when unsupported by socket If a protocol doesn't support zerocopy it will silently fall back to copying. This type of behaviour has always been a source of troubles so it's better to fail such requests instead. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2db3c7f16bb6efab4b04569cd16e6242b40c5cb3.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-22 08:43:03 -06:00
Pavel Begunkov	e993ffe3da	net: flag sockets supporting msghdr originated zerocopy We need an efficient way in io_uring to check whether a socket supports zerocopy with msghdr provided ubuf_info. Add a new flag into the struct socket flags fields. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/3dafafab822b1c66308bb58a0ac738b1e3f53f74.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-10-22 08:42:58 -06:00
Wilken Gottwalt	5619c66091	hwmon: (corsair-psu) Add USB id of the new HX1500i psu Also update the documentation accordingly. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net> Link: https://lore.kernel.org/r/Y0FghqQCHG/cX5Jz@monster.localdomain Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2022-10-22 06:59:12 -07:00
Paolo Bonzini	9aec606c16	tools: include: sync include/api/linux/kvm.h Provide a definition of KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Fixes: `17601bfed9` ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option") Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-22 07:54:19 -04:00

1 2 3 4 5 ...

1136479 Commits