linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-12-27 11:06:41 -05:00

Author	SHA1	Message	Date
Linus Torvalds	6a1636e066	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "The only core fix is in doc; all the others are in drivers, with the biggest impacts in libsas being the rollback on error handling and in ufs coming from a couple of error handling fixes, one causing a crash if it's activated before scanning and the other fixing W-LUN resumption" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: qcom: Fix confusing cleanup.h syntax scsi: libsas: Add rollback handling when an error occurs scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name() scsi: ufs: core: Fix a deadlock in the frequency scaling code scsi: ufs: core: Fix an error handler crash scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed" scsi: ufs: core: Fix RPMB link error by reversing Kconfig dependencies scsi: qla4xxx: Use time conversion macros scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset scsi: imm: Fix use-after-free bug caused by unfinished delayed work scsi: target: sbp: Remove KMSG_COMPONENT macro scsi: core: Correct documentation for scsi_device_quiesce() scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1 scsi: target: Reset t_task_cdb pointer in error case scsi: ufs: core: Fix EH failure after W-LUN resume error	2025-12-14 15:35:35 +12:00
Linus Torvalds	35ebee7e72	Merge tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block fixes from Jens Axboe: - Always initialize DMA state, fixing a potentially nasty issue on the block side - btrfs zoned write fix with cached zone reports - Fix corruption issues in bcache with chained bio's, and further make it clear that the chained IO handler is simply a marker, it's not code meant to be executed - Kill old code dealing with synchronous IO polling in the block layer, that has been dead for a long time. Only async polling is supported these days - Fix a lockdep issue in tag_set management, moving it to RCU - Fix an issue with ublks bio_vec iteration - Don't unconditionally enforce blocking issue of ublk control commands, allow some of them with non-blocking issue as they do not block * tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: blk-mq-dma: always initialize dma state blk-mq: delete task running check in blk_hctx_poll() block: fix cached zone reports on devices with native zone append block: Use RCU in blk_mq_[un]quiesce_tagset() instead of set->tag_list_lock ublk: don't mutate struct bio_vec in iteration block: prohibit calls to bio_chain_endio bcache: fix improper use of bi_end_io ublk: allow non-blocking ctrl cmds in IO_URING_F_NONBLOCK issue	2025-12-12 22:04:18 +12:00
Linus Torvalds	d358e52546	Merge tag 'for-6.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mikulas Patocka: - convert crypto_shash users to direct crypto library use with simpler and faster code and reduced stack usage (Eric Biggers): - the dm-verity SHA-256 conversion also teaches it to do two-way interleaved hashing for added performance - dm-crypt MD5 conversion (used for Loop-AES compatibility) - added document for for takeover/reshape raid1 -> raid5 examples (Heinz Mauelshagen) - fix dm-vdo kerneldoc warnings (Matthew Sakai) - various random fixes and cleanups * tag 'for-6.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (29 commits) dm pcache: fix segment info indexing dm pcache: fix cache info indexing dm-pcache: advance slot index before writing slot dm raid: add documentation for takeover/reshape raid1 -> raid5 table line examples dm log-writes: Add missing set_freezable() for freezable kthread dm-raid: fix possible NULL dereference with undefined raid type dm-snapshot: fix 'scheduling while atomic' on real-time kernels dm: ignore discard return value MAINTAINERS: add Benjamin Marzinski as a device mapper maintainer dm-mpath: Simplify the setup_scsi_dh code dm vdo: fix kerneldoc warnings dm-bufio: align write boundary on physical block size dm-crypt: enable DM_TARGET_ATOMIC_WRITES dm: test for REQ_ATOMIC in dm_accept_partial_bio() dm-verity: remove useless mempool dm-verity: disable recursive forward error correction dm-ebs: Mark full buffer dirty even on partial write dm mpath: enable DM_TARGET_ATOMIC_WRITES dm verity fec: Expose corrected block count via status dm: Don't warn if IMA_DISABLE_HTABLE is not enabled ...	2025-12-11 12:13:29 +09:00
Li Chen	13ea55ea20	dm pcache: fix segment info indexing Segment info indexing also used sizeof(struct) instead of the 4K metadata stride, so info_index could point between slots and subsequent writes would advance incorrectly. Derive info_index from the pointer returned by the segment meta search using PCACHE_SEG_INFO_SIZE and advance to the next slot for future updates. Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Zheng Gu <cengku@gmail.com> Cc: stable@vger.kernel.org # 6.18	2025-12-10 19:28:23 +01:00
Li Chen	ee76331783	dm pcache: fix cache info indexing The on-media cache_info index used sizeof(struct) instead of the 4K metadata stride, so gc_percent updates from dmsetup message were written between slots and lost after reboot. Use PCACHE_CACHE_INFO_SIZE in get_cache_info_addr() and align info_index with the slot returned by pcache_meta_find_latest(). Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Zheng Gu <cengku@gmail.com> Cc: stable@vger.kernel.org # 6.18	2025-12-10 19:28:23 +01:00
Dongsheng Yang	ebbb90344a	dm-pcache: advance slot index before writing slot In dm-pcache, in order to ensure crash-consistency, a dual-copy scheme is used to alternately update metadata, and there is a slot index that records the current slot. However, in the write path the current implementation writes directly to the current slot indexed by slot index, and then advances the slot — which ends up overwriting the existing slot, violating the crash-consistency guarantee. This patch fixes that behavior, preventing metadata from being overwritten incorrectly. In addition, this patch add a missing pmem_wmb() after memcpy_flushcache(). Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Zheng Gu <cengku@gmail.com> Cc: stable@vger.kernel.org # 6.18	2025-12-10 19:28:23 +01:00
Haotian Zhang	ab08f9c8b3	dm log-writes: Add missing set_freezable() for freezable kthread The log_writes_kthread() calls try_to_freeze() but lacks set_freezable(), rendering the freeze attempt ineffective since kernel threads are non-freezable by default. This prevents proper thread suspension during system suspend/hibernate. Add set_freezable() to explicitly mark the thread as freezable. Fixes: `0e9cebe724` ("dm: add log writes target") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-12-10 19:28:22 +01:00
Alexey Simakov	2f6cfd6d7c	dm-raid: fix possible NULL dereference with undefined raid type rs->raid_type is assigned from get_raid_type_by_ll(), which may return NULL. This NULL value could be dereferenced later in the condition 'if (!(rs_is_raid10(rs) && rt_is_raid0(rs->raid_type)))'. Add a fail-fast check to return early with an error if raid_type is NULL, similar to other uses of this function. Found by Linux Verification Center (linuxtesting.org) with Svace. Fixes: `33e53f0685` ("dm raid: introduce extended superblock and new raid types to support takeover/reshaping") Signed-off-by: Alexey Simakov <bigalex934@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-12-10 19:28:22 +01:00
Mikulas Patocka	8581b19eb2	dm-snapshot: fix 'scheduling while atomic' on real-time kernels There is reported 'scheduling while atomic' bug when using dm-snapshot on real-time kernels. The reason for the bug is that the hlist_bl code does preempt_disable() when taking the lock and the kernel attempts to take other spinlocks while holding the hlist_bl lock. Fix this by converting a hlist_bl spinlock into a regular spinlock. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Jiping Ma <jiping.ma2@windriver.com>	2025-12-10 19:28:22 +01:00
Chaitanya Kulkarni	f4412c7d5a	dm: ignore discard return value __blkdev_issue_discard() always returns 0, making all error checking at call sites dead code. For dm-thin change issue_discard() return type to void, in passdown_double_checking_shared_status() remove the r assignment from return value of the issue_discard(), for end_discard() hardcode value of r to 0 that matches only value returned from __blkdev_issue_discard(). Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-12-10 19:28:22 +01:00
Benjamin Marzinski	27f204c215	dm-mpath: Simplify the setup_scsi_dh code There's no point to the MPATHF_RETAIN_ATTACHED_HW_HANDLER flag any more. The way setup_scsi_dh() worked, if that flag wasn't set, it would attempt to attach any passed in hardware handler. This would always fail if a different hardware handler was attached, which caused setup_scsi_dh() to rerun as if the flag was set. So the code would already retain any attached handler, because attaching a different one would always fail. Also, the code had a bug. If attached_handler_name was NULL but there was a scsi device handler attached (because either scsi_dh_attached_handler_name failed() to allocate a name, a handler got attached after it was called) the code would loop endlessly. Instead, ignore MPATHF_RETAIN_ATTACHED_HW_HANDLER, and always free the passed in handler if *attached_handler_name is set. This simplifies the code, and avoids the endless loop bug, while keeping the functionality the same. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-12-10 19:28:22 +01:00
Matthew Sakai	4efe85b0c4	dm vdo: fix kerneldoc warnings Fix kerneldoc warnings across the dm-vdo target. Also remove some unhelpful or inaccurate doc comments, and fix some format inconsistencies that did not produce warnings. No functional changes. Suggested-by: Sunday Adelodun <adelodunolaoluwa@yahoo.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-12-10 19:28:22 +01:00
Mikulas Patocka	d0ac06ae53	dm-bufio: align write boundary on physical block size There may be devices with physical block size larger than 4k. If dm-bufio sends I/O that is not aligned on physical block size, performance is degraded. The 4k minimum alignment limit is there because some SSDs report logical and physical block size 512 despite having 4k internally - so dm-bufio shouldn't send I/Os not aligned on 4k boundary, because they perform badly (the SSD does read-modify-write for them). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: stable@vger.kernel.org	2025-12-10 19:28:22 +01:00
Mikulas Patocka	ce51c6963a	dm-crypt: enable DM_TARGET_ATOMIC_WRITES Allow handling of bios with REQ_ATOMIC flag set. Don't split these bios and fail them if they overrun the hard limit "BIO_MAX_VECS << PAGE_SHIFT". In order to simplify the code, this commit joins the logic that avoids splitting emulated zone append bios with the logic that avoids splitting atomic write bios. Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: John Garry <john.g.garry@oracle.com>	2025-12-10 19:28:22 +01:00
Mikulas Patocka	de67c139b3	dm: test for REQ_ATOMIC in dm_accept_partial_bio() Any bio with REQ_ATOMIC flag set should never be split or partially completed, so BUG_ON() on this scenario in dm_accept_partial_bio() (whose intent is to allow partial completions). Also, we must reject atomic bio to targets that don't support them, otherwise this BUG could be triggered by stray bios that have the REQ_ATOMIC set. Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: John Garry <john.g.garry@oracle.com>	2025-12-10 19:28:12 +01:00
Mikulas Patocka	b9dd1f71e6	dm-verity: remove useless mempool v->fec->extra_pool has zero reserved entries, so we can remove it and use the kernel cache directly. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Eric Biggers <ebiggers@kernel.org>	2025-12-10 19:28:12 +01:00
Mikulas Patocka	d9f3e47d3f	dm-verity: disable recursive forward error correction There are two problems with the recursive correction: 1. It may cause denial-of-service. In fec_read_bufs, there is a loop that has 253 iterations. For each iteration, we may call verity_hash_for_block recursively. There is a limit of 4 nested recursions - that means that there may be at most 253^4 (4 billion) iterations. Red Hat QE team actually created an image that pushes dm-verity to this limit - and this image just makes the udev-worker process get stuck in the 'D' state. 2. It doesn't work. In fec_read_bufs we store data into the variable "fio->bufs", but fio bufs is shared between recursive invocations, if "verity_hash_for_block" invoked correction recursively, it would overwrite partially filled fio->bufs. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Guangwu Zhang <guazhang@redhat.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Eric Biggers <ebiggers@kernel.org>	2025-12-10 19:27:59 +01:00
Shida Zhang	53280e3984	bcache: fix improper use of bi_end_io Don't call bio->bi_end_io() directly. Use the bio_endio() helper function instead, which handles completion more safely and uniformly. Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-12-09 10:20:03 -07:00
Benjamin Marzinski	fd81bc5cca	scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name() If scsi_dh_attached_handler_name() fails to allocate the handler name, dm-multipath (its only caller) assumes there is no attached device handler, and sets the device up incorrectly. Return an error pointer instead, so multipath can distinguish between failure, success where there is no attached device handler, or when the path device is not a SCSI device at all. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Link: https://patch.msgid.link/20251206010015.1595225-1-bmarzins@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-12-08 22:04:38 -05:00
Linus Torvalds	cc25df3e2e	Merge tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Fix head insertion for mq-deadline, a regression from when priority support was added - Series simplifying and improving the ublk user copy code - Various ublk related cleanups - Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the request is punted to a thread for handling - Merge and then later revert loop dio nowait support, as it ended up causing excessive stack usage for when the inline issue code needs to dip back into the full file system code - Improve auto integrity code, making it less deadlock prone - Speedup polled IO handling, but manually managing the hctx lookups - Fixes for blk-throttle for SSD devices - Small series with fixes for the S390 dasd driver - Add support for caching zones, avoiding unnecessary report zone queries - MD pull requests via Yu: - fix null-ptr-dereference regression for dm-raid0 - fix IO hang for raid5 when array is broken with IO inflight - remove legacy 1s delay to speed up system shutdown - change maintainer's email address - data can be lost if array is created with different lbs devices, fix this problem and record lbs of the array in metadata - fix rcu protection for md_thread - fix mddev kobject lifetime regression - enable atomic writes for md-linear - some cleanups - bcache updates via Coly - remove useless discard and cache device code - improve usage of per-cpu workqueues - Reorganize the IO scheduler switching code, fixing some lockdep reports as well - Improve the block layer P2P DMA support - Add support to the block tracing code for zoned devices - Segment calculation improves, and memory alignment flexibility improvements - Set of prep and cleanups patches for ublk batching support. The actual batching hasn't been added yet, but helps shrink down the workload of getting that patchset ready for 6.20 - Fix for how the ps3 block driver handles segments offsets - Improve how block plugging handles batch tag allocations - nbd fixes for use-after-free of the configuration on device clear/put - Set of improvements and fixes for zloop - Add Damien as maintainer of the block zoned device code handling - Various other fixes and cleanups * tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) block/rnbd: correct all kernel-doc complaints blk-mq: use queue_hctx in blk_mq_map_queue_type md: remove legacy 1s delay in md_notify_reboot md/raid5: fix IO hang when array is broken with IO inflight md: warn about updating super block failure md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid sbitmap: fix all kernel-doc warnings ublk: add helper of __ublk_fetch() ublk: pass const pointer to ublk_queue_is_zoned() ublk: refactor auto buffer register in ublk_dispatch_req() ublk: add `union ublk_io_buf` with improved naming ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() kfifo: add kfifo_alloc_node() helper for NUMA awareness blk-mq: fix potential uaf for 'queue_hw_ctx' blk-mq: use array manage hctx map instead of xarray ublk: prevent invalid access with DEBUG s390/dasd: Use scnprintf() instead of sprintf() s390/dasd: Move device name formatting into separate function s390/dasd: Remove unnecessary debugfs_create() return checks s390/dasd: Fix gendisk parent after copy pair swap ...	2025-12-03 19:26:18 -08:00
Linus Torvalds	7f8d5f70ff	Merge tag 'core-core-2025-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core irq cleanup from Thomas Gleixner: "Tree wide cleanup of the remaining users of in_irq() which got replaced by in_hardirq() and marked deprecated in 2020" * tag 'core-core-2025-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide: Remove in_irq()	2025-12-02 10:18:49 -08:00
Tarun Sahu	fdd0c6a649	md: remove legacy 1s delay in md_notify_reboot During system shutdown, the md driver registered notifier function (md_notify_reboot) currently imposes a hardcoded one-second delay. This delay was introduced approximately 23 years ago and was likely necessary for the hardware generation of that time. Proposing this patch to make sure there are no known devices that need this delay. Link: https://lore.kernel.org/linux-raid/20251121191422.2758555-1-tarunsahu@google.com Signed-off-by: Tarun Sahu <tarunsahu@google.com> Reviewed-by: Yu Kuai<yukuai@fnnas.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-30 09:42:28 +08:00
Yu Kuai	a913d1f6a7	md/raid5: fix IO hang when array is broken with IO inflight Following test can cause IO hang: mdadm -CvR /dev/md0 -l10 -n4 /dev/sd[abcd] --assume-clean --chunk=64K --bitmap=none sleep 5 echo 1 > /sys/block/sda/device/delete echo 1 > /sys/block/sdb/device/delete echo 1 > /sys/block/sdc/device/delete echo 1 > /sys/block/sdd/device/delete dd if=/dev/md0 of=/dev/null bs=8k count=1 iflag=direct Root cause: 1) all disks removed, however all rdevs in the array is still in sync, IO will be issued normally. 2) IO failure from sda, and set badblocks failed, sda will be faulty and MD_SB_CHANGING_PENDING will be set. 3) error recovery try to recover this IO from other disks, IO will be issued to sdb, sdc, and sdd. 4) IO failure from sdb, and set badblocks failed again, now array is broken and will become read-only. 5) IO failure from sdc and sdd, however, stripe can't be handled anymore because MD_SB_CHANGING_PENDING is set: handle_stripe handle_stripe if (test_bit MD_SB_CHANGING_PENDING) set_bit STRIPE_HANDLE goto finish // skip handling failed stripe release_stripe if (test_bit STRIPE_HANDLE) list_add_tail conf->hand_list 6) later raid5d can't handle failed stripe as well: raid5d md_check_recovery md_update_sb if (!md_is_rdwr()) // can't clear pending bit return if (test_bit MD_SB_CHANGING_PENDING) break; // can't handle failed stripe Since MD_SB_CHANGING_PENDING can never be cleared for read-only array, fix this problem by skip this checking for read-only array. Link: https://lore.kernel.org/linux-raid/20251117085557.770572-3-yukuai@fnnas.com Fixes: `d87f064f58` ("md: never update metadata when array is read-only.") Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Li Nan <linan122@huawei.com>	2025-11-30 09:38:45 +08:00
Yu Kuai	8c9e376b9d	md: warn about updating super block failure Many personalities will handle IO error from daemon thread(like raid1d, raid10d, raid5d), and sb will require to be clean before hanlding these failed IO. However update sb can fail, for example array is broken by IO failure, or user config sysfs api array_state. This patch adds warning if updating sb failed first, in case this will be related to IO hang. Link: https://lore.kernel.org/linux-raid/20251117085557.770572-2-yukuai@fnnas.com Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Li Nan <linan122@huawei.com>	2025-11-30 09:38:22 +08:00
Yu Kuai	46f21952c4	md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid Commit `2107457e31` ("md/raid0: Move queue limit setup before r0conf initialization") dereference mddev->gendisk unconditionally, which is NULL for dm-raid. Fix this problem by reverting to old codes for dm-raid. Link: https://lore.kernel.org/linux-raid/20251116021816.107648-1-yukuai@fnnas.com Fixes: `2107457e31` ("md/raid0: Move queue limit setup before r0conf initialization") Reported-and-tested-by: Changhui Zhong <czhong@redhat.com> Closes: https://lore.kernel.org/all/CAGVVp+VqVnvGeneUoTbYvBv2cw6GwQRrR3B-iQ-_9rVfyumoKA@mail.gmail.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Li Nan <linan122@huawei.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>	2025-11-30 09:36:50 +08:00
Mikulas Patocka	fe680d8c74	dm-verity: fix unreliable memory allocation GFP_NOWAIT allocation may fail anytime. It needs to be changed to GFP_NOIO. There's no need to handle an error because mempool_alloc with GFP_NOIO can't fail. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Eric Biggers <ebiggers@kernel.org>	2025-11-21 12:51:41 +01:00
Mikulas Patocka	a612d24e85	dm: fix failure when empty flush's bi_sector points beyond the device end An empty flush bio can have arbitrary bi_sector. The commit `2b1c6d7a89` introduced a regression that device mapper would fail an empty flush bio with -EIO if the sector pointed beyond the end of the device. The commit introduced an optimization, that optimization would pass flushes to __split_and_process_bio and __split_and_process_bio is not prepared to handle empty bios. Fix this bug by passing only non-empty flushes to __split_and_process_bio - non-empty flushes must have valid bi_sector. Empty bios will go through __send_empty_flush, as they did before the optimization. This problem can be reproduced by running the lvm2 test: make check_local T=lvconvert-thin.sh LVM_TEST_PREFER_BRD=0 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: `2b1c6d7a89` ("dm: optimize REQ_PREFLUSH with data when using the linear target") Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org>	2025-11-20 19:50:42 +01:00
Li Chen	a6ee8422b4	dm-pcache: zero cache_info before default init pcache_meta_find_latest() leaves whatever it last copied into the caller’s buffer even when it returns NULL. For cache_info_init(), that meant cache->cache_info could still contain CRC-bad garbage when no valid metadata exists, leading later initialization paths to read bogus flags. Explicitly memset cache->cache_info in cache_info_init_default() so new-cache paths start from a clean slate. The default sequence number assignment becomes redundant with this reset, so it drops out. Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Reviewed-by: Zheng Gu <cengku@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-11-18 19:02:08 +01:00
Li Chen	840b80af74	dm-pcache: reuse meta_addr in pcache_meta_find_latest pcache_meta_find_latest() already computes the metadata address as meta_addr. Reuse that instead of recomputing. Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-11-18 19:01:57 +01:00
Li Chen	341d14bd69	dm-pcache: allow built-in build and rename flush helper CONFIG_BCACHE is tristate, so dm-pcache can also be built-in. Switch the Makefile to use obj-$(CONFIG_DM_PCACHE) so the target can be linked into vmlinux instead of always being a loadable module. Also rename cache_flush() to pcache_cache_flush() to avoid a global symbol clash with sunrpc/cache.c's cache_flush(). Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-11-18 19:01:47 +01:00
Uladzislau Rezki (Sony)	7fa3e7d114	dm-ebs: Mark full buffer dirty even on partial write When performing a read-modify-write(RMW) operation, any modification to a buffered block must cause the entire buffer to be marked dirty. Marking only a subrange as dirty is incorrect because the underlying device block size(ubs) defines the minimum read/write granularity. A lower device can perform I/O only on regions which are fully aligned and sized to ubs. This change ensures that write-back operations always occur in full ubs-sized chunks, matching the intended emulation semantics of the EBS target. As for user space visible impact, submitting sub-ubs and misaligned I/O for devices which are tuned to ubs sizes only, will reject such requests, therefore it can lead to losing data. Example: 1) Create a 8K nvme device in qemu by adding -device nvme,drive=drv0,serial=foo,logical_block_size=8192,physical_block_size=8192 2) Setup dm-ebs to emulate 512B to 8K mapping urezki@pc638:~/bin$ cat dmsetup.sh lower=/dev/nvme0n1 len=$(blockdev --getsz "$lower") echo "0 $len ebs $lower 0 1 16" \| dmsetup create nvme-8k urezki@pc638:~/bin$ offset 0, ebs=1 and ubs=16(in sectors). 3) Create an ext4 filesystem(default 4K block size) urezki@pc638:~/bin$ sudo mkfs.ext4 -F /dev/dm-0 mke2fs 1.47.0 (5-Feb-2023) Discarding device blocks: done Creating filesystem with 2072576 4k blocks and 518144 inodes Filesystem UUID: bd0b6ca6-0506-4e31-86da-8d22c9d50b63 Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632 Allocating group tables: done Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: mkfs.ext4: Input/output error while writing out and closing file system urezki@pc638:~/bin$ dmesg <snip> [ 1618.875449] buffer_io_error: 1028 callbacks suppressed [ 1618.875456] Buffer I/O error on dev dm-0, logical block 0, lost async page write [ 1618.875527] Buffer I/O error on dev dm-0, logical block 1, lost async page write [ 1618.875602] Buffer I/O error on dev dm-0, logical block 2, lost async page write [ 1618.875620] Buffer I/O error on dev dm-0, logical block 3, lost async page write [ 1618.875639] Buffer I/O error on dev dm-0, logical block 4, lost async page write [ 1618.894316] Buffer I/O error on dev dm-0, logical block 5, lost async page write [ 1618.894358] Buffer I/O error on dev dm-0, logical block 6, lost async page write [ 1618.894380] Buffer I/O error on dev dm-0, logical block 7, lost async page write [ 1618.894405] Buffer I/O error on dev dm-0, logical block 8, lost async page write [ 1618.894427] Buffer I/O error on dev dm-0, logical block 9, lost async page write <snip> Many I/O errors because the lower 8K device rejects sub-ubs/misaligned requests. with a patch: urezki@pc638:~/bin$ sudo mkfs.ext4 -F /dev/dm-0 mke2fs 1.47.0 (5-Feb-2023) Discarding device blocks: done Creating filesystem with 2072576 4k blocks and 518144 inodes Filesystem UUID: 9b54f44f-ef55-4bd4-9e40-c8b775a616ac Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632 Allocating group tables: done Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done urezki@pc638:~/bin$ sudo mount /dev/dm-0 /mnt/ urezki@pc638:~/bin$ ls -al /mnt/ total 24 drwxr-xr-x 3 root root 4096 Oct 17 15:13 . drwxr-xr-x 19 root root 4096 Jul 10 19:42 .. drwx------ 2 root root 16384 Oct 17 15:13 lost+found urezki@pc638:~/bin$ After this change: mkfs completes; mount succeeds. Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org	2025-11-18 18:24:26 +01:00
John Garry	61c73e8de9	dm mpath: enable DM_TARGET_ATOMIC_WRITES Both the bio- and rq-based paths have no problem supporting REQ_ATOMIC, so enable DM_TARGET_ATOMIC_WRITES. Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-11-18 18:24:26 +01:00
Shubhankar Mishra	ae97648e14	dm verity fec: Expose corrected block count via status Enhance visibility into dm-verity Forward Error Correction (FEC) activity. While FEC can correct on-disk corruptions, the number of successful correction events is not readily exposed through a standard interface. This change integrates FEC statistics into the verity target's .status handler for STATUSTYPE_INFO. The info output now includes count of corrected block by FEC. The counter is a per-device instance atomic64_t, maintained within the struct dm_verity_fec, tracking blocks successfully repaired by FEC on this specific device instance since it was created. This approach aligns with the standard Device Mapper mechanism for targets to report runtime information, as used by other targets like dm-integrity. This patch also updates Documentation/admin-guide/device-mapper/verity.rst to reflect the new status information. Tested: Induced single-bit errors on a block device protected by dm-verity with FEC on android phone. Confirmed 'dmctl status <device>' on Android reports an incrementing 'fec_corrected_blocks' count after the corrupted blocks were accessed. Signed-off-by: Shubhankar Mishra <shubhankarm@google.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-11-18 18:24:26 +01:00
Jon Hunter	c82faa8934	dm: Don't warn if IMA_DISABLE_HTABLE is not enabled Commit `f1cd6cb24b` ("dm ima: add a warning in dm_init if duplicate ima events are not measured") added a warning message if CONFIG_IMA is enabled but CONFIG_IMA_DISABLE_HTABLE is not to inform users. When enabling CONFIG_IMA, CONFIG_IMA_DISABLE_HTABLE is disabled by default and so warning is seen. Therefore, it seems more appropriate to make this an INFO level message than warning. If this truly is a warning, then maybe CONFIG_IMA_DISABLE_HTABLE should default to y if CONFIG_IMA is enabled. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-11-18 18:24:25 +01:00
Gustavo A. R. Silva	699122b590	bcache: Avoid -Wflex-array-member-not-at-end warning -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the new TRAILING_OVERLAP() helper to fix the following warning: drivers/md/bcache/bset.h:330:27: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] This helper creates a union between a flexible-array member (FAM) and a set of MEMBERS that would otherwise follow it. This overlays the trailing MEMBER struct btree_iter_set stack_data[MAX_BSETS]; onto the FAM struct btree_iter::data[], while keeping the FAM and the start of MEMBER aligned. The static_assert() ensures this alignment remains, and it's intentionally placed immediately after the corresponding structures --no blank line in between. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Coly Li <colyli@fnnas.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Marco Crivellari	c0c8082142	bcache: WQ_PERCPU added to alloc_workqueue users Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This patch continues the effort to refactor worqueue APIs, which has begun with the change introducing new workqueues and a new alloc_workqueue flag: commit `128ea9f6cc` ("workqueue: Add system_percpu_wq and system_dfl_wq") commit `930c2ea566` ("workqueue: Add new WQ_PERCPU flag") This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Coly Li <colyli@fnnas.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Marco Crivellari	fd82071814	bcache: replace use of system_wq with system_percpu_wq Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. This patch continues the effort to refactor worqueue APIs, which has begun with the change introducing new workqueues and a new alloc_workqueue flag: commit `128ea9f6cc` ("workqueue: Add system_percpu_wq and system_dfl_wq") commit `930c2ea566` ("workqueue: Add new WQ_PERCPU flag") system_wq should be the per-cpu workqueue, yet in this name nothing makes that clear, so replace system_wq with system_percpu_wq. The old wq (system_wq) will be kept for a few release cycles. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Coly Li <colyli@fnnas.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Qianfeng Rong	21194c44b6	bcache: remove redundant __GFP_NOWARN GFP_NOWAIT already includes __GFP_NOWARN, so let's remove the redundant __GFP_NOWARN. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Acked-by: Coly Li <colyli@fnnas.com> Acked-by: Coly Li <colyli@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Coly Li	70bc173ce0	bcache: reduce gc latency by processing less nodes and sleep less time When bcache device is busy for high I/O loads, there are two methods to reduce the garbage collection latency, - Process less nodes in eac loop of incremental garbage collection in btree_gc_recurse(). - Sleep less time between two full garbage collection in bch_btree_gc(). This patch introduces to hleper routines to provide different garbage collection nodes number and sleep intervel time. - btree_gc_min_nodes() If there is no front end I/O, return 128 nodes to process in each incremental loop, otherwise only 10 nodes are returned. Then front I/O is able to access the btree earlier. - btree_gc_sleep_ms() If there is no synchronized wait for bucket allocation, sleep 100 ms between two incremental GC loop. Othersize only sleep 10 ms before incremental GC loop. Then a faster GC may provide available buckets earlier, to avoid most of bcache working threads from being starved by buckets allocation. The idea is inspired by works from Mingzhe Zou and Robert Pang, but much simpler and the expected behavior is more predictable. Signed-off-by: Coly Li <colyli@fnnas.com> Signed-off-by: Robert Pang <robertpang@google.com> Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Coly Li	73a004f83c	bcache: drop discard sysfs interface Since discard code is removed, now the sysfs interface to enable discard is useless. This patch removes the corresponding sysfs entry, and remove bool variable 'discard' from struct cache as well. Signed-off-by: Coly Li <colyli@fnnas.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Coly Li	b4056afbd4	bcache: remove discard code from alloc.c Bcache allocator initially has no free space to allocate. Firstly it does a garbage collection which is triggered by a cache device write and fills free space into ca->free[] lists. The discard happens after the free bucket is handled by garbage collection added into one of the ca->free[] lists. But normally this bucket will be allocated out very soon to requester and filled data onto it. The discard hint on this bucket LBA range doesn't help SSD control to improve internal erasure performance, and waste extra CPU cycles to issue discard bios. This patch removes the almost-useless discard code from alloc.c. Signed-off-by: Coly Li <colyli@fnnas.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Coly Li	0c72e9fcc1	bcache: get rid of discard code from journal In bcache journal there is discard functionality but almost useless in reality. Because discard happens after a journal bucket is reclaimed, and the reclaimed bucket is allocated for new journaling immediately. There is no time for underlying SSD to use the discard hint for internal data management. The discard code in bcache journal doesn't bring any performance optimization and wastes CPU cycles for issuing discard bios. Therefore this patch gits rid of it from journal.c and journal.h. Signed-off-by: Coly Li <colyli@fnnas.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:18:06 -07:00
Damien Le Moal	7b2038b1b1	dm: fix zone reset all operation processing dm_zone_get_reset_bitmap() is used to generate a bitmap of the zones of a zoned device target when a REQ_OP_ZONE_RESET_ALL request is being processed. This bitmap is built by executing a zone report with a report callback set to the function dm_zone_need_reset_cb() in struct dm_report_zones_args. However, the cb callback pointer is not anymore the same as the callback specified by callers of the blkdev_report_zones() function. Rather, this is a DM internal callback and report zones callback functions from blkdev_report_zones() are passed using struct blk_report_zones_args, introduced with commit db9aed869f34 ("block: introduce disk_report_zone()"). This commit changed the DM main report zones callback handler function dm_report_zones_cb() to call the new disk_report_zone() so that callback functions from blkdev_report_zones() are executed, and this change resulted in the DM internal dm_zone_need_reset_cb() callback function to not be executed anymore, turning any REQ_OP_ZONE_RESET_ALL request into a no-op. Fix this by calling in dm_report_zones_cb() the DM internal cb function specified in struct dm_report_zones_args. Fixes: db9aed869f34 ("block: introduce disk_report_zone()"). Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 09:10:04 -07:00
Li Nan	62ed1b5822	md: allow configuring logical block size Previously, raid array used the maximum logical block size (LBS) of all member disks. Adding a larger LBS disk at runtime could unexpectedly increase RAID's LBS, risking corruption of existing partitions. This can be reproduced by: ``` # LBS of sd[de] is 512 bytes, sdf is 4096 bytes. mdadm -CRq /dev/md0 -l1 -n3 /dev/sd[de] missing --assume-clean # LBS is 512 cat /sys/block/md0/queue/logical_block_size # create partition md0p1 parted -s /dev/md0 mklabel gpt mkpart primary 1MiB 100% lsblk \| grep md0p1 # LBS becomes 4096 after adding sdf mdadm --add -q /dev/md0 /dev/sdf cat /sys/block/md0/queue/logical_block_size # partition lost partprobe /dev/md0 lsblk \| grep md0p1 ``` Simply restricting larger-LBS disks is inflexible. In some scenarios, only disks with 512 bytes LBS are available currently, but later, disks with 4KB LBS may be added to the array. Making LBS configurable is the best way to solve this scenario. After this patch, the raid will: - store LBS in disk metadata - add a read-write sysfs 'mdX/logical_block_size' Future mdadm should support setting LBS via metadata field during RAID creation and the new sysfs. Though the kernel allows runtime LBS changes, users should avoid modifying it after creating partitions or filesystems to prevent compatibility issues. Only 1.x metadata supports configurable LBS. 0.90 metadata inits all fields to default values at auto-detect. Supporting 0.90 would require more extensive changes and no such use case has been observed. Note that many RAID paths rely on PAGE_SIZE alignment, including for metadata I/O. A larger LBS than PAGE_SIZE will result in metadata read/write failures. So this config should be prevented. Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-6-linan666@huaweicloud.com Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-11 11:20:15 +08:00
Li Nan	9c47127a80	md: add check_new_feature module parameter Raid checks if pad3 is zero when loading superblock from disk. Arrays created with new features may fail to assemble on old kernels as pad3 is used. Add module parameter check_new_feature to bypass this check. Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-5-linan666@huaweicloud.com Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-11 11:19:54 +08:00
Li Nan	2107457e31	md/raid0: Move queue limit setup before r0conf initialization Prepare for making logical blocksize configurable. This change has no impact until logical block size becomes configurable. Move raid0_set_limits() before create_strip_zones(). It is safe as fields modified in create_strip_zones() do not involve mddev configuration, and rdev modifications there are not used in raid0_set_limits(). 'blksize' in create_strip_zones() fetches mddev's logical block size, which is already the maximum aross all rdevs, so the later max() can be removed. Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-4-linan666@huaweicloud.com Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-11 11:19:27 +08:00
Li Nan	381a3ce1c0	md: init bioset in mddev_init IO operations may be needed before md_run(), such as updating metadata after writing sysfs. Without bioset, this triggers a NULL pointer dereference as below: BUG: kernel NULL pointer dereference, address: 0000000000000020 Call Trace: md_update_sb+0x658/0xe00 new_level_store+0xc5/0x120 md_attr_store+0xc9/0x1e0 sysfs_kf_write+0x6f/0xa0 kernfs_fop_write_iter+0x141/0x2a0 vfs_write+0x1fc/0x5a0 ksys_write+0x79/0x180 __x64_sys_write+0x1d/0x30 x64_sys_call+0x2818/0x2880 do_syscall_64+0xa9/0x580 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Reproducer ``` mdadm -CR /dev/md0 -l1 -n2 /dev/sd[cd] echo inactive > /sys/block/md0/md/array_state echo 10 > /sys/block/md0/md/new_level ``` mddev_init() can only be called once per mddev, no need to test if bioset has been initialized anymore. Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-3-linan666@huaweicloud.com Fixes: `d981ed8419` ("md: Add new_level sysfs interface") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-11 11:19:10 +08:00
Li Nan	0ce112d917	md: delete md_redundancy_group when array is becoming inactive 'md_redundancy_group' are created in md_run() and deleted in del_gendisk(), but these are not paired. Writing inactive/active to sysfs array_state can trigger md_run() multiple times without del_gendisk(), leading to duplicate creation as below: sysfs: cannot create duplicate filename '/devices/virtual/block/md0/md/sync_action' Call Trace: dump_stack_lvl+0x9f/0x120 dump_stack+0x14/0x20 sysfs_warn_dup+0x96/0xc0 sysfs_add_file_mode_ns+0x19c/0x1b0 internal_create_group+0x213/0x830 sysfs_create_group+0x17/0x20 md_run+0x856/0xe60 ? __x64_sys_openat+0x23/0x30 do_md_run+0x26/0x1d0 array_state_store+0x559/0x760 md_attr_store+0xc9/0x1e0 sysfs_kf_write+0x6f/0xa0 kernfs_fop_write_iter+0x141/0x2a0 vfs_write+0x1fc/0x5a0 ksys_write+0x79/0x180 __x64_sys_write+0x1d/0x30 x64_sys_call+0x2818/0x2880 do_syscall_64+0xa9/0x580 entry_SYSCALL_64_after_hwframe+0x4b/0x53 md: cannot register extra attributes for md0 Creation of it depends on 'pers', its lifecycle cannot be aligned with gendisk. So fix this issue by triggering 'md_redundancy_group' deletion when the array is becoming inactive. Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-2-linan666@huaweicloud.com Fixes: `790abe4d77` ("md: remove/add redundancy group only in level change") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-11 11:18:51 +08:00
Li Nan	6c6b66f65e	md: prevent adding disks with larger logical_block_size to active arrays When adding a disk to a md array, avoid updating the array's logical_block_size to match the new disk. This prevents accidental partition table loss that renders the array unusable. The later patch will introduce a way to configure the array's logical_block_size. The issue was introduced before Linux 2.6.12-rc2. Link: https://lore.kernel.org/linux-raid/20250918115759.334067-2-linan666@huaweicloud.com/ Fixes: d2e45eace8 ("[PATCH] Fix raid "bio too big" failures") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-11 11:17:33 +08:00
Huiwen He	a811db3919	md/raid5: remove redundant __GFP_NOWARN The __GFP_NOWARN flag was included in GFP_NOWAIT since commit `16f5dfbc85` ("gfp: include __GFP_NOWARN in GFP_NOWAIT"). So remove the redundant __GFP_NOWARN flag. Link: https://lore.kernel.org/linux-raid/20251102152540.871568-1-hehuiwen@kylinos.cn Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Reviewed-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Yu Kuai <yukuai@fnnas.com>	2025-11-08 17:49:36 +08:00

1 2 3 4 5 ...

8741 Commits