linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-06-05 09:08:10 -04:00

Author	SHA1	Message	Date
Yu Kuai	696936838b	md/md-bitmap: merge md_bitmap_status() into bitmap_operations So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-20-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:16 -07:00
Yu Kuai	fe59b34676	md/md-bitmap: merge md_bitmap_update_sb() into bitmap_operations So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-19-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:16 -07:00
Yu Kuai	a0240e3ec7	md/md-bitmap: make md_bitmap_print_sb() internal md_bitmap_print_sb() is only used inside md-bitmap.c, hence make it static, also rename it to bitmap_print_sb. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-18-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:16 -07:00
Yu Kuai	ca925302e8	md/md-bitmap: merge md_bitmap_flush() into bitmap_operations So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-17-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:16 -07:00
Yu Kuai	a2bd703192	md/md-bitmap: merge md_bitmap_destroy() into bitmap_operations So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-16-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:16 -07:00
Yu Kuai	e1e4908059	md/md-bitmap: merge md_bitmap_load() into bitmap_operations So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-15-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:16 -07:00
Yu Kuai	04c80e6495	md/md-bitmap: merge md_bitmap_create() into bitmap_operations So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-14-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	7545d385ec	md/md-bitmap: simplify md_bitmap_create() + md_bitmap_load() Other than internal api get_bitmap_from_slot(), all other places will set returned bitmap to mddev->bitmap. So move the setting of mddev->bitmap into md_bitmap_create() to simplify code. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-13-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	7add9db6ba	md/md-bitmap: introduce struct bitmap_operations The structure is empty for now, and will be used in later patches to merge in bitmap operations, so that bitmap implementation won't be exposed. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-12-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	27832ad3f7	md/md-bitmap: add a new helper md_bitmap_set_pages() Currently md-cluster will set bitmap->counts.pages directly, add a helper to do this to avoid dereferencing bitmap directly. Noted that after this patch bitmap is not dereferenced directly anymore and following patches will move the structure inside md-bitmap.c. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-11-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	9e4481ce0e	md/md-cluster: use helper md_bitmap_get_stats() to get pages in resize_bitmaps() Use the existed helper instead of open coding it, avoid dereferencing bitmap directly to prepare inventing a new bitmap. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-10-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	a0e7744a46	md/md-bitmap: add 'behind_writes' and 'behind_wait' into struct md_bitmap_stats There are no functional changes, avoid dereferencing bitmap directly to prepare inventing a new bitmap. Also fix following checkpatch warning by using wq_has_sleeper(). WARNING: waitqueue_active without comment Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-9-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	10bc2ac105	md/md-bitmap: add 'file_pages' into struct md_bitmap_stats There are no functional changes, avoid dereferencing bitmap directly to prepare inventing a new bitmap. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-8-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	ec6bb299c7	md/md-bitmap: add 'sync_size' into struct md_bitmap_stats To avoid dereferencing bitmap directly in md-cluster to prepare inventing a new bitmap. BTW, also fix following checkpatch warnings: WARNING: Deprecated use of 'kmap_atomic', prefer 'kmap_local_page' instead WARNING: Deprecated use of 'kunmap_atomic', prefer 'kunmap_local' instead Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-7-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:15 -07:00
Yu Kuai	82697ccf7e	md/md-cluster: fix spares warnings for __le64 drivers/md/md-cluster.c:1220:22: warning: incorrect type in assignment (different base types) drivers/md/md-cluster.c:1220:22: expected unsigned long my_sync_size drivers/md/md-cluster.c:1220:22: got restricted __le64 [usertype] sync_size drivers/md/md-cluster.c:1252:35: warning: incorrect type in assignment (different base types) drivers/md/md-cluster.c:1252:35: expected unsigned long sync_size drivers/md/md-cluster.c:1252:35: got restricted __le64 [usertype] sync_size drivers/md/md-cluster.c:1253:41: warning: restricted __le64 degrades to integer Fix the warnings by using le64_to_cpu() to convet __le64 to integer. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-6-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:14 -07:00
Yu Kuai	d004442f46	md/md-bitmap: add 'events_cleared' into struct md_bitmap_stats Also add a new helper to get events_cleared to avoid dereferencing bitmap directly to prepare inventing a new bitmap. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-5-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:14 -07:00
Yu Kuai	9681538122	md: use new helper md_bitmap_get_stats() in update_array_info() There are no functional changes, avoid dereferencing bitmap directly to prepare inventing a new bitmap. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-4-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:14 -07:00
Yu Kuai	38f287d7e4	md/md-bitmap: replace md_bitmap_status() with a new helper md_bitmap_get_stats() There are no functional changes, and the new helper will be used in multiple places in following patches to avoid dereferencing bitmap directly. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-3-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:14 -07:00
Yu Kuai	2db4fa1b7e	md/raid1: use md_bitmap_wait_behind_writes() in raid1_read_request() Use the existed helper instead of open coding it to make the code cleaner. There are no functional changes, and also avoid dereferencing bitmap directly to prepare inventing a new bitmap. Noted that this patch also export md_bitmap_wait_behind_writes(), which is necessary for now, and the exported api will be removed in following patches to convert bitmap apis into ops. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240826074452.1490072-2-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:14:14 -07:00
Yu Kuai	2d389a759d	md/raid1: Clean up local variable 'b' from raid1_read_request() The local variable will only be used onced, in the error path that read_balance() failed to find a valid rdev to read. Since now the rdev is ensured can't be removed from conf while IO is still pending, remove the local variable and dereference rdev directly. Since we're here, also remove an extra empty line, and unnecessary type conversion from sector_t(u64) to unsigned long long. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240801133008.459998-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 10:08:22 -07:00
Yu Kuai	86ad4cda79	md: Don't flush sync_work in md_write_start() Because flush sync_work may trigger mddev_suspend() if there are spares, and this should never be done in IO path because mddev_suspend() is used to wait for IO. This problem is found by code review. Fixes: `bc08041b32` ("md: suspend array in md_start_sync() if array need reconfiguration") Cc: stable@vger.kernel.org Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240801124746.242558-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-27 09:55:16 -07:00
Yuesong Li	02c0207ecd	dm bufio: Remove NULL check of list_entry() list_entry() will never return a NULL pointer, thus remove the check. Signed-off-by: Yuesong Li <liyuesong@vivo.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-22 18:32:55 +02:00
Ingo Franzki	4441686b24	dm-crypt: Allow to specify the integrity key size as option For the MAC based integrity operation, the integrity key size (i.e. key_mac_size) is currently set to the digest size of the used digest. For wrapped key HMAC algorithms, the key size is independent of the cryptographic key size. So there is no known size of the mac key in such cases. The desired key size can optionally be specified as argument when the dm-crypt device is configured via 'integrity_key_size:%u'. If no integrity_key_size argument is specified, the mac key size is still set to the digest size, as before. Increase version number to 1.28.0 so that support for the new argument can be detected by user space (i.e. cryptsetup). Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-21 15:36:27 +02:00
Zhang Zekun	f3631ae11d	dm: Remove unused declaration and empty definition "dm_zone_map_bio" dm_zone_map_bio() has beed removed since commit `f211268ed1` ("dm: Use the block layer zone append emulation"), remain the declaration unused in header files. So, let's remove this unused declaration and empty definition. Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-21 13:12:12 +02:00
Susan LeGendre-McGhee	448c4e4eb1	dm vdo: force read-only mode for a corrupt recovery journal Ensure the recovery journal does not attempt recovery when blocks with mismatched metadata versions are detected. This check is performed after determining that the blocks are otherwise valid so that it does not interfere with normal recovery. Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-21 13:12:12 +02:00
Susan LeGendre-McGhee	f3ff668352	dm vdo: abort loading dirty VDO with the old recovery journal format Abort the load process with status code VDO_UNSUPPORTED_VERSION without forcing read-only mode when a journal block with the old format version is detected. Forcing the VDO volume into read-only mode and thus requiring a read-only rebuild should only be done when absolutely necessary. Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-21 13:11:34 +02:00
Bruce Johnston	47874c98dc	dm vdo: add dmsetup message for returning configuration info Add a new dmsetup message called config, which will return useful configuration information for the vdo volume and the uds index associated with it. The output is a YAML string, and contains a version number to allow future additions to the content. Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-21 13:05:56 +02:00
Ken Raeburn	3a59b2ec24	dm vdo: remove bad check of bi_next field Remove this check to prevent spurious warning messages due to the behavior of other storage layers. This check was intended to make sure dm-vdo does not chain metadata bios together. However, vdo has no control over underlying storage layers, so the assertion is not always true. Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-21 13:05:56 +02:00
Ken Raeburn	0808ebf2f8	dm vdo: don't refer to dedupe_context after releasing it Clear the dedupe_context pointer in a data_vio whenever ownership of the context is lost, so that vdo can't examine it accidentally. Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-08-21 13:05:56 +02:00
Deven Bowers	a6af7bc3d7	dm-verity: expose root hash digest and signature data to LSMs dm-verity provides a strong guarantee of a block device's integrity. As a generic way to check the integrity of a block device, it provides those integrity guarantees to its higher layers, including the filesystem level. However, critical security metadata like the dm-verity roothash and its signing information are not easily accessible to the LSMs. To address this limitation, this patch introduces a mechanism to store and manage these essential security details within a newly added LSM blob in the block_device structure. This addition allows LSMs to make access control decisions on the integrity data stored within the block_device, enabling more flexible security policies. For instance, LSMs can now revoke access to dm-verity devices based on their roothashes, ensuring that only authorized and verified content is accessible. Additionally, LSMs can enforce policies to only allow files from dm-verity devices that have a valid digital signature to execute, effectively blocking any unsigned files from execution, thus enhancing security against unauthorized modifications. The patch includes new hook calls, `security_bdev_setintegrity()`, in dm-verity to expose the dm-verity roothash and the roothash signature to LSMs via preresume() callback. By using the preresume() callback, it ensures that the security metadata is consistently in sync with the metadata of the dm-verity target in the current active mapping table. The hook calls are depended on CONFIG_SECURITY. Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> [PM: moved sig_size field as discussed] Signed-off-by: Paul Moore <paul@paul-moore.com>	2024-08-20 14:02:38 -04:00
Linus Torvalds	85652baa89	Merge tag 'block-6.11-20240824' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - Fix corruption issues with s390/dasd (Eric, Stefan) - Fix a misuse of non irq locking grab of a lock (Li) - MD pull request with a single data corruption fix for raid1 (Yu) * tag 'block-6.11-20240824' of git://git.kernel.dk/linux: block: Fix lockdep warning in blk_mq_mark_tag_wait md/raid1: Fix data corruption for degraded array with slow disk s390/dasd: fix error recovery leading to data corruption on ESE devices s390/dasd: Remove DMA alignment	2024-08-16 14:03:31 -07:00
Yu Kuai	c916ca3530	md/raid1: Fix data corruption for degraded array with slow disk read_balance() will avoid reading from slow disks as much as possible, however, if valid data only lands in slow disks, and a new normal disk is still in recovery, unrecovered data can be read: raid1_read_request read_balance raid1_should_read_first -> return false choose_best_rdev -> normal disk is not recovered, return -1 choose_bb_rdev -> missing the checking of recovery, return the normal disk -> read unrecovered data Root cause is that the checking of recovery is missing in choose_bb_rdev(). Hence add such checking to fix the problem. Also fix similar problem in choose_slow_rdev(). Cc: stable@vger.kernel.org Fixes: `9f3ced7922` ("md/raid1: factor out choose_bb_rdev() from read_balance()") Fixes: `dfa8ecd167` ("md/raid1: factor out choose_slow_rdev() from read_balance()") Reported-and-tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Closes: https://lore.kernel.org/all/9952f532-2554-44bf-b906-4880b2e88e3a@o2.pl/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240803091137.3197008-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>	2024-08-15 13:38:17 -07:00
Chen Ni	ca958879ad	md: convert comma to semicolon Replace a comma between expression statements by a semicolon. Fixes: `5e5702898e` ("md/raid10: Handle read errors during recovery better.") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Link: https://lore.kernel.org/r/20240716025852.400259-1-nichen@iscas.ac.cn Signed-off-by: Song Liu <song@kernel.org>	2024-08-15 00:06:50 -07:00
Mikulas Patocka	faada2174c	dm persistent data: fix memory allocation failure kmalloc is unreliable when allocating more than 8 pages of memory. It may fail when there is plenty of free memory but the memory is fragmented. Zdenek Kabelac observed such failure in his tests. This commit changes kmalloc to kvmalloc - kvmalloc will fall back to vmalloc if the large allocation fails. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Cc: stable@vger.kernel.org	2024-08-13 21:14:21 +02:00
Khazhismel Kumykov	7a636b4f03	dm resume: don't return EINVAL when signalled If the dm_resume method is called on a device that is not suspended, the method will suspend the device briefly, before resuming it (so that the table will be swapped). However, there was a bug that the return value of dm_suspended_md was not checked. dm_suspended_md may return an error when it is interrupted by a signal. In this case, do_resume would call dm_swap_table, which would return -EINVAL. This commit fixes the logic, so that error returned by dm_suspend is checked and the resume operation is undone. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Cc: stable@vger.kernel.org	2024-08-13 13:51:34 +02:00
Mikulas Patocka	1e1fd567d3	dm suspend: return -ERESTARTSYS instead of -EINTR This commit changes device mapper, so that it returns -ERESTARTSYS instead of -EINTR when it is interrupted by a signal (so that the ioctl can be restarted). The manpage signal(7) says that the ioctl function should be restarted if the signal was handled with SA_RESTART. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org	2024-08-13 13:50:45 +02:00
Linus Torvalds	4477b39c32	minmax: add a few more MIN_T/MAX_T users Commit `3a7e02c040` ("minmax: avoid overly complicated constant expressions in VM code") added the simpler MIN_T/MAX_T macros in order to avoid some excessive expansion from the rather complicated regular min/max macros. The complexity of those macros stems from two issues: (a) trying to use them in situations that require a C constant expression (in static initializers and for array sizes) (b) the type sanity checking and MIN_T/MAX_T avoids both of these issues. Now, in the whole (long) discussion about all this, it was pointed out that the whole type sanity checking is entirely unnecessary for min_t/max_t which get a fixed type that the comparison is done in. But that still leaves min_t/max_t unnecessarily complicated due to worries about the C constant expression case. However, it turns out that there really aren't very many cases that use min_t/max_t for this, and we can just force-convert those. This does exactly that. Which in turn will then allow for much simpler implementations of min_t()/max_t(). All the usual "macros in all upper case will evaluate the arguments multiple times" rules apply. We should do all the same things for the regular min/max() vs MIN/MAX() cases, but that has the added complexity of various drivers defining their own local versions of MIN/MAX, so that needs another level of fixes first. Link: https://lore.kernel.org/all/b47fad1d0cf8449886ad148f8c013dae@AcuMS.aculab.com/ Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-07-28 13:41:14 -07:00
Linus Torvalds	7d080fa867	Merge tag 'for-6.11/block-20240722' of git://git.kernel.dk/linux Pull more block updates from Jens Axboe: - MD fixes via Song: - md-cluster fixes (Heming Zhao) - raid1 fix (Mateusz Jończyk) - s390/dasd module description (Jeff) - Series cleaning up and hardening the blk-mq debugfs flag handling (John, Christoph) - blk-cgroup cleanup (Xiu) - Error polled IO attempts if backend doesn't support it (hexue) - Fix for an sbitmap hang (Yang) * tag 'for-6.11/block-20240722' of git://git.kernel.dk/linux: (23 commits) blk-cgroup: move congestion_count to struct blkcg sbitmap: fix io hung due to race on sbitmap_word::cleared block: avoid polling configuration errors block: Catch possible entries missing from rqf_name[] block: Simplify definition of RQF_NAME() block: Use enum to define RQF_x bit indexes block: Catch possible entries missing from cmd_flag_name[] block: Catch possible entries missing from alloc_policy_name[] block: Catch possible entries missing from hctx_flag_name[] block: Catch possible entries missing from hctx_state_name[] block: Catch possible entries missing from blk_queue_flag_name[] block: Make QUEUE_FLAG_x as an enum block: Relocate BLK_MQ_MAX_DEPTH block: Relocate BLK_MQ_CPU_WORK_BATCH block: remove QUEUE_FLAG_STOPPED block: Add missing entry to hctx_flag_name[] block: Add zone write plugging entry to rqf_name[] block: Add missing entries from cmd_flag_name[] s390/dasd: fix error checks in dasd_copy_pair_store() s390/dasd: add missing MODULE_DESCRIPTION() macros ...	2024-07-22 11:32:05 -07:00
Linus Torvalds	0256994887	Merge tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux Pull block integrity mapping updates from Jens Axboe: "A set of cleanups and fixes for the block integrity support. Sent separately from the main block changes from last week, as they depended on later fixes in the 6.10-rc cycle" * tag 'for-6.11/block-post-20240722' of git://git.kernel.dk/linux: block: don't free the integrity payload in bio_integrity_unmap_free_user block: don't free submitter owned integrity payload on I/O completion block: call bio_integrity_unmap_free_user from blk_rq_unmap_user block: don't call bio_uninit from bio_endio block: also return bio_integrity_payload * from stubs block: split integrity support out of bio.h	2024-07-22 11:04:09 -07:00
Linus Torvalds	527eff227d	Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - In the series "treewide: Refactor heap related implementation", Kuan-Wei Chiu has significantly reworked the min_heap library code and has taught bcachefs to use the new more generic implementation. - Yury Norov's series "Cleanup cpumask.h inclusion in core headers" reworks the cpumask and nodemask headers to make things generally more rational. - Kuan-Wei Chiu has sent along some maintenance work against our sorting library code in the series "lib/sort: Optimizations and cleanups". - More library maintainance work from Christophe Jaillet in the series "Remove usage of the deprecated ida_simple_xx() API". - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the series "nilfs2: eliminate the call to inode_attach_wb()". - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB command error". - Plus the usual shower of singleton patches all over the place. Please see the relevant changelogs for details. * tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits) ia64: scrub ia64 from poison.h watchdog/perf: properly initialize the turbo mode timestamp and rearm counter tsacct: replace strncpy() with strscpy() lib/bch.c: use swap() to improve code test_bpf: convert comma to semicolon init/modpost: conditionally check section mismatch to __meminit* init: remove unused __MEMINIT* macros nilfs2: Constify struct kobj_type nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro math: rational: add missing MODULE_DESCRIPTION() macro lib/zlib: add missing MODULE_DESCRIPTION() macro fs: ufs: add MODULE_DESCRIPTION() lib/rbtree.c: fix the example typo ocfs2: add bounds checking to ocfs2_check_dir_entry() fs: add kernel-doc comments to ocfs2_prepare_orphan_dir() coredump: simplify zap_process() selftests/fpu: add missing MODULE_DESCRIPTION() macro compiler.h: simplify data_race() macro build-id: require program headers to be right after ELF header resource: add missing MODULE_DESCRIPTION() ...	2024-07-21 17:56:22 -07:00
Linus Torvalds	661fb4e68c	Merge tag 'for-6.11/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mikulas Patocka: - Optimize processing of flush bios in the dm-linear and dm-stripe targets - Dm-io cleansups and refactoring - Remove unused 'struct thunk' in dm-cache - Handle minor device numbers > 255 in dm-init - Dm-verity refactoring & enabling platform keyring - Fix warning in dm-raid - Improve dm-crypt performance - split bios to smaller pieces, so that They could be processed concurrently - Stop using blk_limits_io_{min,opt} - Dm-vdo cleanup and refactoring - Remove max_write_zeroes_granularity and max_secure_erase_granularity - Dm-multipath cleanup & refactoring - Add dm-crypt and dm-integrity support for non-power-of-2 sector size - Fix reshape in dm-raid - Make dm_block_validator const * tag 'for-6.11/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (33 commits) dm vdo: fix a minor formatting issue in vdo.rst dm vdo int-map: fix kerneldoc formatting dm vdo repair: add missing kerneldoc fields dm: Constify struct dm_block_validator dm-integrity: introduce the Inline mode dm: introduce the target flag mempool_needs_integrity dm raid: fix stripes adding reshape size issues dm raid: move _get_reshape_sectors() as prerequisite to fixing reshape size issues dm-crypt: support for per-sector NVMe metadata dm mpath: don't call dm_get_device in multipath_message dm: factor out helper function from dm_get_device dm-verity: fix dm_is_verity_target() when dm-verity is builtin dm: Remove max_secure_erase_granularity dm: Remove max_write_zeroes_granularity dm vdo indexer: use swap() instead of open coding it dm vdo: remove unused struct 'uds_attribute' dm: stop using blk_limits_io_{min,opt} dm-crypt: limit the size of encryption requests dm verity: add support for signature verification with platform keyring dm-raid: Fix WARN_ON_ONCE check for sync_thread in raid_resume ...	2024-07-19 10:48:44 -07:00
Matthew Sakai	513789b7fb	dm vdo int-map: fix kerneldoc formatting Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202407141607.M3E2XQ0Z-lkp@intel.com/ Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-07-19 12:08:21 +02:00
Matthew Sakai	fa398e603f	dm vdo repair: add missing kerneldoc fields Also remove trivial comment for increment_recovery_point. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9518 Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-07-19 12:08:21 +02:00
Christophe JAILLET	0b60be1628	dm: Constify struct dm_block_validator 'struct dm_block_validator' are not modified in these drivers. Constifying this structure moves some data to a read-only section, so increase overall security. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 32047 920 16 32983 80d7 drivers/md/dm-cache-metadata.o After: ===== text data bss dec hex filename 32075 896 16 32987 80db drivers/md/dm-cache-metadata.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2024-07-19 12:08:15 +02:00
Mikulas Patocka	fb0987682c	dm-integrity: introduce the Inline mode This commit introduces a new 'I' mode for dm-integrity. The 'I' mode may be selected if the underlying device has non-power-of-2 sector size. In this mode, dm-integrity will store integrity data directly in device's sectors and it will not use journal. This mode improves performance and reduces flash wear because there would be no journal writes. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-07-19 12:06:44 +02:00
Linus Torvalds	f097ef0e76	Merge tag 'dlm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: - New flag DLM_LSFL_SOFTIRQ_SAFE can be set by code using dlm to indicate callbacks can be run from softirq - Change md-cluster to set DLM_LSFL_SOFTIRQ_SAFE - Clean up for previous changes, e.g. unused code and parameters - Remove custom pre-allocation of rsb structs which is unnecessary with kmem caches - Change idr to xarray for lkb structs in use - Change idr to xarray for rsb structs being recovered - Change outdated naming related to internal rsb states - Fix some incorrect add/remove of rsb on scan list - Use rcu to free rsb structs * tag 'dlm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: add rcu_barrier before destroy kmem cache dlm: remove DLM_LSFL_SOFTIRQ from exflags fs: dlm: remove unused struct 'dlm_processed_nodes' md-cluster: use DLM_LSFL_SOFTIRQ for dlm_new_lockspace() dlm: implement LSFL_SOFTIRQ_SAFE dlm: introduce DLM_LSFL_SOFTIRQ_SAFE dlm: use LSFL_FS to check for kernel lockspace dlm: use rcu to avoid an extra rsb struct lookup dlm: fix add_scan and del_scan usage dlm: change list and timer names dlm: move recover idr to xarray datastructure dlm: move lkb idr to xarray datastructure dlm: drop own rsb pre allocation mechanism dlm: remove ls_local_handle from struct dlm_ls dlm: remove unused parameter in dlm_midcomms_addr dlm: don't kref_init rsbs created for toss list dlm: remove scand leftovers	2024-07-17 12:16:22 -07:00
Linus Torvalds	3e78198862	Merge tag 'for-6.11/block-20240710' of git://git.kernel.dk/linux Pull block updates from Jens Axboe: - NVMe updates via Keith: - Device initialization memory leak fixes (Keith) - More constants defined (Weiwen) - Target debugfs support (Hannes) - PCIe subsystem reset enhancements (Keith) - Queue-depth multipath policy (Redhat and PureStorage) - Implement get_unique_id (Christoph) - Authentication error fixes (Gaosheng) - MD updates via Song - sync_action fix and refactoring (Yu Kuai) - Various small fixes (Christoph Hellwig, Li Nan, and Ofir Gal, Yu Kuai, Benjamin Marzinski, Christophe JAILLET, Yang Li) - Fix loop detach/open race (Gulam) - Fix lower control limit for blk-throttle (Yu) - Add module descriptions to various drivers (Jeff) - Add support for atomic writes for block devices, and statx reporting for same. Includes SCSI and NVMe (John, Prasad, Alan) - Add IO priority information to block trace points (Dongliang) - Various zone improvements and tweaks (Damien) - mq-deadline tag reservation improvements (Bart) - Ignore direct reclaim swap writes in writeback throttling (Baokun) - Block integrity improvements and fixes (Anuj) - Add basic support for rust based block drivers. Has a dummy null_blk variant for now (Andreas) - Series converting driver settings to queue limits, and cleanups and fixes related to that (Christoph) - Cleanup for poking too deeply into the bvec internals, in preparation for DMA mapping API changes (Christoph) - Various minor tweaks and fixes (Jiapeng, John, Kanchan, Mikulas, Ming, Zhu, Damien, Christophe, Chaitanya) * tag 'for-6.11/block-20240710' of git://git.kernel.dk/linux: (206 commits) floppy: add missing MODULE_DESCRIPTION() macro loop: add missing MODULE_DESCRIPTION() macro ublk_drv: add missing MODULE_DESCRIPTION() macro xen/blkback: add missing MODULE_DESCRIPTION() macro block/rnbd: Constify struct kobj_type block: take offset into account in blk_bvec_map_sg again block: fix get_max_segment_size() warning loop: Don't bother validating blocksize virtio_blk: Don't bother validating blocksize null_blk: Don't bother validating blocksize block: Validate logical block size in blk_validate_limits() virtio_blk: Fix default logical block size fallback nvmet-auth: fix nvmet_auth hash error handling nvme: implement ->get_unique_id block: pass a phys_addr_t to get_max_segment_size block: add a bvec_phys helper blk-lib: check for kill signal in ioctl BLKZEROOUT block: limit the Write Zeroes to manually writing zeroes fallback block: refacto blkdev_issue_zeroout block: move read-only and supported checks into (__)blkdev_issue_zeroout ...	2024-07-15 14:20:22 -07:00
Mikulas Patocka	617069741d	dm: introduce the target flag mempool_needs_integrity This commit introduces the dm target flag mempool_needs_integrity. When the flag is set, device mapper will call bioset_integrity_create on its bio sets. The target can then call bio_integrity_alloc on the bios allocated from the table's mempool. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-07-12 12:39:06 -04:00
Mateusz Jończyk	36a5c03f23	md/raid1: set max_sectors during early return from choose_slow_rdev() Linux 6.9+ is unable to start a degraded RAID1 array with one drive, when that drive has a write-mostly flag set. During such an attempt, the following assertion in bio_split() is hit: BUG_ON(sectors <= 0); Call Trace: ? bio_split+0x96/0xb0 ? exc_invalid_op+0x53/0x70 ? bio_split+0x96/0xb0 ? asm_exc_invalid_op+0x1b/0x20 ? bio_split+0x96/0xb0 ? raid1_read_request+0x890/0xd20 ? __call_rcu_common.constprop.0+0x97/0x260 raid1_make_request+0x81/0xce0 ? __get_random_u32_below+0x17/0x70 ? new_slab+0x2b3/0x580 md_handle_request+0x77/0x210 md_submit_bio+0x62/0xa0 __submit_bio+0x17b/0x230 submit_bio_noacct_nocheck+0x18e/0x3c0 submit_bio_noacct+0x244/0x670 After investigation, it turned out that choose_slow_rdev() does not set the value of max_sectors in some cases and because of it, raid1_read_request calls bio_split with sectors == 0. Fix it by filling in this variable. This bug was introduced in commit `dfa8ecd167` ("md/raid1: factor out choose_slow_rdev() from read_balance()") but apparently hidden until commit `0091c5a269` ("md/raid1: factor out helpers to choose the best rdev from read_balance()") shortly thereafter. Cc: stable@vger.kernel.org # 6.9.x+ Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Fixes: `dfa8ecd167` ("md/raid1: factor out choose_slow_rdev() from read_balance()") Cc: Song Liu <song@kernel.org> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Paul Luse <paul.e.luse@linux.intel.com> Cc: Xiao Ni <xni@redhat.com> Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Link: https://lore.kernel.org/linux-raid/20240706143038.7253-1-mat.jonczyk@o2.pl/ -- Tested on both Linux 6.10 and 6.9.8. Inside a VM, mdadm testsuite for RAID1 on 6.10 did not find any problems: ./test --dev=loop --no-error --raidtype=raid1 (on 6.9.8 there was one failure, caused by external bitmap support not compiled in). 
Notes: - I was reliably getting deadlocks when adding / removing devices on such an array - while the array was loaded with fsstress with 20 concurrent processes. When the array was idle or loaded with fsstress with 8 processes, no such deadlocks happened in my tests. This occurred also on unpatched Linux 6.8.0 though, but not on 6.1.97-rc1, so this is likely an independent regression (to be investigated). - I was also getting deadlocks when adding / removing the bitmap on the array in similar conditions - this happened on Linux 6.1.97-rc1 also though. fsstress with 8 concurrent processes did cause it only once during many tests. - in my testing, there was once a problem with hot adding an internal bitmap to the array: mdadm: Cannot add bitmap while array is resyncing or reshaping etc. mdadm: failed to set internal bitmap. even though no such reshaping was happening according to /proc/mdstat. This seems unrelated, though. Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240711202316.10775-1-mat.jonczyk@o2.pl	2024-07-12 01:30:38 +00:00
Heming Zhao	35a0a409fa	md-cluster: fix no recovery job when adding/re-adding a disk The commit `db5e653d7c` ("md: delay choosing sync action to md_start_sync()") delays the start of the sync action. In a clustered environment, this will cause another node to first activate the spare disk and skip recovery. As a result, no nodes will perform recovery when a disk is added or re-added. Before `db5e653d7c`: ``` node1 node2 ---------------------------------------------------------------- md_check_recovery + md_update_sb \| sendmsg: METADATA_UPDATED + md_choose_sync_action process_metadata_update \| remove_and_add_spares //node1 has not finished adding + call mddev->sync_work //the spare disk:do nothing md_start_sync starts md_do_sync md_do_sync + grabbed resync_lockres:DLM_LOCK_EX + do syncing job md_check_recovery sendmsg: METADATA_UPDATED process_metadata_update //activate spare disk ... ... md_do_sync waiting to grab resync_lockres:EX ``` After `db5e653d7c`: (note: if 'cmd:idle' sets MD_RECOVERY_INTR after md_check_recovery starts md_start_sync, setting the INTR action will exacerbate the delay in node1 calling the md_do_sync function.) ``` node1 node2 ---------------------------------------------------------------- md_check_recovery + md_update_sb \| sendmsg: METADATA_UPDATED + calls mddev->sync_work process_metadata_update //node1 has not finished adding //the spare disk:do nothing md_start_sync + md_choose_sync_action \| remove_and_add_spares + calls md_do_sync md_check_recovery md_update_sb sendmsg: METADATA_UPDATED process_metadata_update //activate spare disk ... ... ... ... md_do_sync + grabbed resync_lockres:EX + raid1_sync_request skip sync under conf->fullsync:0 md_do_sync 1. waiting to grab resync_lockres:EX 2. 
when node1 could grab EX lock, node1 will skip resync under recovery_offset:MaxSector ``` How to trigger: ```(commands @node1) # to easily watch the recovery status echo 2000 > /proc/sys/dev/raid/speed_limit_max ssh root@node2 "echo 2000 > /proc/sys/dev/raid/speed_limit_max" mdadm -CR /dev/md0 -l1 -b clustered -n 2 /dev/sda /dev/sdb --assume-clean ssh root@node2 mdadm -A /dev/md0 /dev/sda /dev/sdb mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda mdadm --manage /dev/md0 --add /dev/sdc === "cat /proc/mdstat" on both nodes, there is no recovery action. === ``` How to fix: because the md layer logic is hard to restore in order to speed up the sync job on the local node, we add a new cluster message that makes the other node hold off on activating the disk. Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Su Yue <glass.su@suse.com> Acked-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240709104120.22243-2-heming.zhao@suse.com	2024-07-12 01:30:18 +00:00

8552 Commits