linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 18:22:00 -04:00

Author	SHA1	Message	Date
Jan Kara	0c90eed1b9	ext4: fix deadlock on inode reallocation Currently there is a race in ext4 when reallocating freed inode resulting in a deadlock: Task1 Task2 ext4_evict_inode() handle = ext4_journal_start(); ... if (IS_SYNC(inode)) handle->h_sync = 1; ext4_free_inode() ext4_new_inode() handle = ext4_journal_start() finds the bit in inode bitmap already clear insert_inode_locked() waits for inode to be removed from the hash. ext4_journal_stop(handle) jbd2_journal_stop(handle) jbd2_log_wait_commit(journal, tid); - deadlocks waiting for transaction handle Task2 holds Fix the problem by removing inode from the hash already in ext4_clear_inode() by which time all IO for the inode is done so reuse is already fine but we are still before possibly blocking on transaction commit. Reported-by: "Lai, Yi" <yi1.lai@linux.intel.com> Link: https://lore.kernel.org/all/abNvb2PcrKj1FBeC@ly-workstation Fixes: `88ec797c46` ("fs: make insert_inode_locked() wait for inode destruction") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260320090428.24899-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:37:48 -04:00
Jiayuan Chen	d15e4b0a41	ext4: fix use-after-free in update_super_work when racing with umount Commit `b98535d091` ("ext4: fix bug_on in start_this_handle during umount filesystem") moved ext4_unregister_sysfs() before flushing s_sb_upd_work to prevent new error work from being queued via /proc/fs/ext4/xx/mb_groups reads during unmount. However, this introduced a use-after-free because update_super_work calls ext4_notify_error_sysfs() -> sysfs_notify() which accesses the kobject's kernfs_node after it has been freed by kobject_del() in ext4_unregister_sysfs(): update_super_work ext4_put_super ----------------- -------------- ext4_unregister_sysfs(sb) kobject_del(&sbi->s_kobj) __kobject_del() sysfs_remove_dir() kobj->sd = NULL sysfs_put(sd) kernfs_put() // RCU free ext4_notify_error_sysfs(sbi) sysfs_notify(&sbi->s_kobj) kn = kobj->sd // stale pointer kernfs_get(kn) // UAF on freed kernfs_node ext4_journal_destroy() flush_work(&sbi->s_sb_upd_work) Instead of reordering the teardown sequence, fix this by making ext4_notify_error_sysfs() detect that sysfs has already been torn down by checking s_kobj.state_in_sysfs, and skipping the sysfs_notify() call in that case. A dedicated mutex (s_error_notify_mutex) serializes ext4_notify_error_sysfs() against kobject_del() in ext4_unregister_sysfs() to prevent TOCTOU races where the kobject could be deleted between the state_in_sysfs check and the sysfs_notify() call. Fixes: `b98535d091` ("ext4: fix bug_on in start_this_handle during umount filesystem") Cc: Jiayuan Chen <jiayuan.chen@linux.dev> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260319120336.157873-1-jiayuan.chen@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:37:39 -04:00
Zqiang	496bb99b7e	ext4: fix the might_sleep() warnings in kvfree() Use the kvfree() in the RCU read critical section can trigger the following warnings: EXT4-fs (vdb): unmounting filesystem cd983e5b-3c83-4f5a-a136-17b00eb9d018. WARNING: suspicious RCU usage ./include/linux/rcupdate.h:409 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 Call Trace: <TASK> dump_stack_lvl+0xbb/0xd0 dump_stack+0x14/0x20 lockdep_rcu_suspicious+0x15a/0x1b0 __might_resched+0x375/0x4d0 ? put_object.part.0+0x2c/0x50 __might_sleep+0x108/0x160 vfree+0x58/0x910 ? ext4_group_desc_free+0x27/0x270 kvfree+0x23/0x40 ext4_group_desc_free+0x111/0x270 ext4_put_super+0x3c8/0xd40 generic_shutdown_super+0x14c/0x4a0 ? __pfx_shrinker_free+0x10/0x10 kill_block_super+0x40/0x90 ext4_kill_sb+0x6d/0xb0 deactivate_locked_super+0xb4/0x180 deactivate_super+0x7e/0xa0 cleanup_mnt+0x296/0x3e0 __cleanup_mnt+0x16/0x20 task_work_run+0x157/0x250 ? __pfx_task_work_run+0x10/0x10 ? exit_to_user_mode_loop+0x6a/0x550 exit_to_user_mode_loop+0x102/0x550 do_syscall_64+0x44a/0x500 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> BUG: sleeping function called from invalid context at mm/vmalloc.c:3441 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 556, name: umount preempt_count: 1, expected: 0 CPU: 3 UID: 0 PID: 556 Comm: umount Call Trace: <TASK> dump_stack_lvl+0xbb/0xd0 dump_stack+0x14/0x20 __might_resched+0x275/0x4d0 ? put_object.part.0+0x2c/0x50 __might_sleep+0x108/0x160 vfree+0x58/0x910 ? ext4_group_desc_free+0x27/0x270 kvfree+0x23/0x40 ext4_group_desc_free+0x111/0x270 ext4_put_super+0x3c8/0xd40 generic_shutdown_super+0x14c/0x4a0 ? __pfx_shrinker_free+0x10/0x10 kill_block_super+0x40/0x90 ext4_kill_sb+0x6d/0xb0 deactivate_locked_super+0xb4/0x180 deactivate_super+0x7e/0xa0 cleanup_mnt+0x296/0x3e0 __cleanup_mnt+0x16/0x20 task_work_run+0x157/0x250 ? __pfx_task_work_run+0x10/0x10 ? exit_to_user_mode_loop+0x6a/0x550 exit_to_user_mode_loop+0x102/0x550 do_syscall_64+0x44a/0x500 entry_SYSCALL_64_after_hwframe+0x77/0x7f The above scenarios occur in initialization failures and teardown paths, there are no parallel operations on the resources released by kvfree(), this commit therefore remove rcu_read_lock/unlock() and use rcu_access_pointer() instead of rcu_dereference() operations. Fixes: `7c990728b9` ("ext4: fix potential race between s_flex_groups online resizing and access") Fixes: `df3da4ea5a` ("ext4: fix potential race between s_group_info online resizing and access") Signed-off-by: Zqiang <qiang.zhang@linux.dev> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Link: https://patch.msgid.link/20260319094545.19291-1-qiang.zhang@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:36:20 -04:00
Helen Koike	3822743dc2	ext4: reject mount if bigalloc with s_first_data_block != 0 bigalloc with s_first_data_block != 0 is not supported, reject mounting it. Signed-off-by: Helen Koike <koike@igalia.com> Suggested-by: Theodore Ts'o <tytso@mit.edu> Reported-by: syzbot+b73703b873a33d8eb8f6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b73703b873a33d8eb8f6 Link: https://patch.msgid.link/20260317142325.135074-1-koike@igalia.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:36:11 -04:00
Ye Bin	9e1b14320b	ext4: fix extents-test.c is not compiled when EXT4_KUNIT_TESTS=M Now, only EXT4_KUNIT_TESTS=Y testcase will be compiled in 'extents.c'. To solve this issue, the ext4 test code needs to be decoupled. The 'extents-test' module is compiled into 'ext4-test' module. Fixes: `cb1e0c1d1f` ("ext4: kunit tests for extent splitting and conversion") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260314075258.1317579-4-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2026-03-27 23:36:06 -04:00
Ye Bin	519b76ac0b	ext4: fix mballoc-test.c is not compiled when EXT4_KUNIT_TESTS=M Now, only EXT4_KUNIT_TESTS=Y testcase will be compiled in 'mballoc.c'. To solve this issue, the ext4 test code needs to be decoupled. The ext4 test module is compiled into a separate module. Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Closes: https://patchwork.kernel.org/project/cifs-client/patch/20260118091313.1988168-2-chenxiaosong.chenxiaosong@linux.dev/ Fixes: `7c9fa399a3` ("ext4: add first unit test for ext4_mb_new_blocks_simple in mballoc") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260314075258.1317579-3-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2026-03-27 23:36:02 -04:00
Ye Bin	49504a5125	ext4: introduce EXPORT_SYMBOL_FOR_EXT4_TEST() helper Introduce EXPORT_SYMBOL_FOR_EXT4_TEST() helper for kuint test. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260314075258.1317579-2-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2026-03-27 23:34:19 -04:00
Milos Nikic	bac3190a8e	jbd2: gracefully abort on checkpointing state corruptions This patch targets two internal state machine invariants in checkpoint.c residing inside functions that natively return integer error codes. - In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE and a graceful journal abort, returning -EFSCORRUPTED. - In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for an unexpected buffer_jwrite state. If the warning triggers, we explicitly drop the just-taken get_bh() reference and call __flush_batch() to safely clean up any previously queued buffers in the j_chkpt_bhs array, preventing a memory leak before returning -EFSCORRUPTED. Signed-off-by: Milos Nikic <nikic.milos@gmail.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260311041548.159424-1-nikic.milos@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:34:09 -04:00
Edward Adam Davis	5422fe71d2	ext4: avoid infinite loops caused by residual data On the mkdir/mknod path, when mapping logical blocks to physical blocks, if inserting a new extent into the extent tree fails (in this example, because the file system disabled the huge file feature when marking the inode as dirty), ext4_ext_map_blocks() only calls ext4_free_blocks() to reclaim the physical block without deleting the corresponding data in the extent tree. This causes subsequent mkdir operations to reference the previously reclaimed physical block number again, even though this physical block is already being used by the xattr block. Therefore, a situation arises where both the directory and xattr are using the same buffer head block in memory simultaneously. The above causes ext4_xattr_block_set() to enter an infinite loop about "inserted" and cannot release the inode lock, ultimately leading to the 143s blocking problem mentioned in [1]. If the metadata is corrupted, then trying to remove some extent space can do even more harm. Also in case EXT4_GET_BLOCKS_DELALLOC_RESERVE was passed, remove space wrongly update quota information. Jan Kara suggests distinguishing between two cases: 1) The error is ENOSPC or EDQUOT - in this case the filesystem is fully consistent and we must maintain its consistency including all the accounting. However these errors can happen only early before we've inserted the extent into the extent tree. So current code works correctly for this case. 2) Some other error - this means metadata is corrupted. We should strive to do as few modifications as possible to limit damage. So I'd just skip freeing of allocated blocks. [1] INFO: task syz.0.17:5995 blocked for more than 143 seconds. Call Trace: inode_lock_nested include/linux/fs.h:1073 [inline] __start_dirop fs/namei.c:2923 [inline] start_dirop fs/namei.c:2934 [inline] Reported-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1659aaaaa8d9d11265d7 Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com Reported-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=512459401510e2a9a39f Tested-by: syzbot+1659aaaaa8d9d11265d7@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: syzbot+512459401510e2a9a39f@syzkaller.appspotmail.com Link: https://patch.msgid.link/tencent_43696283A68450B761D76866C6F360E36705@qq.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:33:56 -04:00
Tejas Bharambe	2acb5c12eb	ext4: validate p_idx bounds in ext4_ext_correct_indexes ext4_ext_correct_indexes() walks up the extent tree correcting index entries when the first extent in a leaf is modified. Before accessing path[k].p_idx->ei_block, there is no validation that p_idx falls within the valid range of index entries for that level. If the on-disk extent header contains a corrupted or crafted eh_entries value, p_idx can point past the end of the allocated buffer, causing a slab-out-of-bounds read. Fix this by validating path[k].p_idx against EXT_LAST_INDEX() at both access sites: before the while loop and inside it. Return -EFSCORRUPTED if the index pointer is out of range, consistent with how other bounds violations are handled in the ext4 extent tree code. Reported-by: syzbot+04c4e65cab786a2e5b7e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=04c4e65cab786a2e5b7e Signed-off-by: Tejas Bharambe <tejas.bharambe@outlook.com> Link: https://patch.msgid.link/JH0PR06MB66326016F9B6AD24097D232B897CA@JH0PR06MB6632.apcprd06.prod.outlook.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:33:46 -04:00
Ye Bin	73bf12adbe	ext4: test if inode's all dirty pages are submitted to disk The commit `aa373cf550` ("writeback: stop background/kupdate works from livelocking other works") introduced an issue where unmounting a filesystem in a multi-logical-partition scenario could lead to batch file data loss. This problem was not fixed until the commit `d92109891f` ("fs/writeback: bail out if there is no more inodes for IO and queued once"). It took considerable time to identify the root cause. Additionally, in actual production environments, we frequently encountered file data loss after normal system reboots. Therefore, we are adding a check in the inode release flow to verify whether all dirty pages have been flushed to disk, in order to determine whether the data loss is caused by a logic issue in the filesystem code. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260303012242.3206465-1-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:33:34 -04:00
Ojaswin Mujoo	c4a48e9eee	ext4: minor fix for ext4_split_extent_zeroout() We missed storing the error which triggerd smatch warning: fs/ext4/extents.c:3369 ext4_split_extent_zeroout() warn: duplicate zero check 'err' (previous on line 3363) fs/ext4/extents.c 3361 3362 err = ext4_ext_get_access(handle, inode, path + depth); 3363 if (err) 3364 return err; 3365 3366 ext4_ext_mark_initialized(ex); 3367 3368 ext4_ext_dirty(handle, inode, path + depth); --> 3369 if (err) 3370 return err; 3371 3372 return 0; 3373 } Fix it by correctly storing the err value from ext4_ext_dirty(). Link: https://lore.kernel.org/all/aYXvVgPnKltX79KE@stanley.mountain/ Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: `a985e07c26` ("ext4: refactor zeroout path and handle all cases") Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://patch.msgid.link/20260302143811.605174-1-ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:33:22 -04:00
Ye Bin	46066e3a06	ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal() There's issue as follows: ... EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117 EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117 EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117 EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 206 at logical offset 0 with max blocks 1 with error 117 EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2243 at logical offset 0 with max blocks 1 with error 117 EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost EXT4-fs (mmcblk0p1): Delayed block allocation failed for inode 2239 at logical offset 0 with max blocks 1 with error 117 EXT4-fs (mmcblk0p1): This should not happen!! Data will be lost EXT4-fs (mmcblk0p1): error count since last fsck: 1 EXT4-fs (mmcblk0p1): initial error at time 1765597433: ext4_mb_generate_buddy:760 EXT4-fs (mmcblk0p1): last error at time 1765597433: ext4_mb_generate_buddy:760 ... According to the log analysis, blocks are always requested from the corrupted block group. This may happen as follows: ext4_mb_find_by_goal ext4_mb_load_buddy ext4_mb_load_buddy_gfp ext4_mb_init_cache ext4_read_block_bitmap_nowait ext4_wait_block_bitmap ext4_validate_block_bitmap if (!grp \|\| EXT4_MB_GRP_BBITMAP_CORRUPT(grp)) return -EFSCORRUPTED; // There's no logs. if (err) return err; // Will return error ext4_lock_group(ac->ac_sb, group); if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) // Unreachable goto out; After commit `9008a58e5d` ("ext4: make the bitmap read routines return real error codes") merged, Commit `163a203ddb` ("ext4: mark block group as corrupt on block bitmap error") is no real solution for allocating blocks from corrupted block groups. This is because if 'EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)' is true, then 'ext4_mb_load_buddy()' may return an error. This means that the block allocation will fail. Therefore, check block group if corrupted when ext4_mb_load_buddy() returns error. Fixes: `163a203ddb` ("ext4: mark block group as corrupt on block bitmap error") Fixes: `9008a58e5d` ("ext4: make the bitmap read routines return real error codes") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260302134619.3145520-1-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:33:08 -04:00
Ritesh Harjani (IBM)	afe376d2c1	ext4: kunit: extents-test: lix percpu_counters list corruption commit `82f80e2e3b` ("ext4: add extent status cache support to kunit tests"), added ext4_es_register_shrinker() in extents_kunit_init() function but failed to add the unregister shrinker routine in extents_kunit_exit(). This could cause the following percpu_counters list corruption bug. ok 1 split unwrit extent to 2 extents and convert 1st half writ slab kmalloc-4k start c0000002007ff000 pointer offset 1448 size 4096 list_add corruption. next->prev should be prev (c000000004bc9e60), but was 0000000000000000. (next=c0000002007ff5a8). ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:29! cpu 0x2: Vector: 700 (Program Check) at [c000000241927a30] pc: c000000000f26ed0: __list_add_valid_or_report+0x120/0x164 lr: c000000000f26ecc: __list_add_valid_or_report+0x11c/0x164 sp: c000000241927cd0 msr: 800000000282b033 current = 0xc000000241215200 paca = 0xc0000003fffff300 irqmask: 0x03 irq_happened: 0x09 pid = 258, comm = kunit_try_catch kernel BUG at lib/list_debug.c:29! enter ? for help __percpu_counter_init_many+0x148/0x184 ext4_es_register_shrinker+0x74/0x23c extents_kunit_init+0x100/0x308 kunit_try_run_case+0x78/0x1f8 kunit_generic_run_threadfn_adapter+0x40/0x70 kthread+0x190/0x1a0 start_kernel_thread+0x14/0x18 2:mon> This happens because: extents_kunit_init(test N): ext4_es_register_shrinker(sbi) percpu_counters_init() x 4; // this adds 4 list nodes to global percpu_counters list list_add(&fbc->list, &percpu_counters); shrinker_register(); extents_kunit_exit(test N): kfree(sbi); // frees sbi w/o removing those 4 list nodes. // So, those list node now becomes dangling pointers extents_kunit_init(test N+1): kzalloc_obj(ext4_sb_info) // allocator returns same page, but zeroed. ext4_es_register_shrinker(sbi) percpu_counters_init() list_add(&fbc->list, &percpu_counters); __list_add_valid(new, prev, next); next->prev != prev // list corruption bug detected, since next->prev = NULL Fixes: `82f80e2e3b` ("ext4: add extent status cache support to kunit tests") Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/5bb9041471dab8ce870c191c19cbe4df57473be8.1772381213.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:32:23 -04:00
Li Chen	1aec30021e	ext4: publish jinode after initialization ext4_inode_attach_jinode() publishes ei->jinode to concurrent users. It used to set ei->jinode before jbd2_journal_init_jbd_inode(), allowing a reader to observe a non-NULL jinode with i_vfs_inode still unset. The fast commit flush path can then pass this jinode to jbd2_wait_inode_data(), which dereferences i_vfs_inode->i_mapping and may crash. Below is the crash I observe: ``` BUG: unable to handle page fault for address: 000000010beb47f4 PGD 110e51067 P4D 110e51067 PUD 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 1 UID: 0 PID: 4850 Comm: fc_fsync_bench_ Not tainted 6.18.0-00764-g795a690c06a5 #1 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-2-2 04/01/2014 RIP: 0010:xas_find_marked+0x3d/0x2e0 Code: e0 03 48 83 f8 02 0f 84 f0 01 00 00 48 8b 47 08 48 89 c3 48 39 c6 0f 82 fd 01 00 00 48 85 c9 74 3d 48 83 f9 03 77 63 4c 8b 0f <49> 8b 71 08 48 c7 47 18 00 00 00 00 48 89 f1 83 e1 03 48 83 f9 02 RSP: 0018:ffffbbee806e7bf0 EFLAGS: 00010246 RAX: 000000000010beb4 RBX: 000000000010beb4 RCX: 0000000000000003 RDX: 0000000000000001 RSI: 0000002000300000 RDI: ffffbbee806e7c10 RBP: 0000000000000001 R08: 0000002000300000 R09: 000000010beb47ec R10: ffff9ea494590090 R11: 0000000000000000 R12: 0000002000300000 R13: ffffbbee806e7c90 R14: ffff9ea494513788 R15: ffffbbee806e7c88 FS: 00007fc2f9e3e6c0(0000) GS:ffff9ea6b1444000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000010beb47f4 CR3: 0000000119ac5000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> filemap_get_folios_tag+0x87/0x2a0 __filemap_fdatawait_range+0x5f/0xd0 ? srso_alias_return_thunk+0x5/0xfbef5 ? __schedule+0x3e7/0x10c0 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? preempt_count_sub+0x5f/0x80 ? srso_alias_return_thunk+0x5/0xfbef5 ? cap_safe_nice+0x37/0x70 ? srso_alias_return_thunk+0x5/0xfbef5 ? preempt_count_sub+0x5f/0x80 ? srso_alias_return_thunk+0x5/0xfbef5 filemap_fdatawait_range_keep_errors+0x12/0x40 ext4_fc_commit+0x697/0x8b0 ? ext4_file_write_iter+0x64b/0x950 ? srso_alias_return_thunk+0x5/0xfbef5 ? preempt_count_sub+0x5f/0x80 ? srso_alias_return_thunk+0x5/0xfbef5 ? vfs_write+0x356/0x480 ? srso_alias_return_thunk+0x5/0xfbef5 ? preempt_count_sub+0x5f/0x80 ext4_sync_file+0xf7/0x370 do_fsync+0x3b/0x80 ? syscall_trace_enter+0x108/0x1d0 __x64_sys_fdatasync+0x16/0x20 do_syscall_64+0x62/0x2c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... ``` Fix this by initializing the jbd2_inode first. Use smp_wmb() and WRITE_ONCE() to publish ei->jinode after initialization. Readers use READ_ONCE() to fetch the pointer. Fixes: `a361293f5f` ("jbd2: Fix oops in jbd2_journal_file_inode()") Cc: stable@vger.kernel.org Signed-off-by: Li Chen <me@linux.beauty> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260225082617.147957-1-me@linux.beauty Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:32:07 -04:00
Yuto Ohnuki	356227096e	ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio Replace BUG_ON() with proper error handling when inline data size exceeds PAGE_SIZE. This prevents kernel panic and allows the system to continue running while properly reporting the filesystem corruption. The error is logged via ext4_error_inode(), the buffer head is released to prevent memory leak, and -EFSCORRUPTED is returned to indicate filesystem corruption. Signed-off-by: Yuto Ohnuki <ytohnuki@amazon.com> Link: https://patch.msgid.link/20260223123345.14838-2-ytohnuki@amazon.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:31:52 -04:00
Jan Kara	1308255bbf	ext4: fix fsync(2) for nojournal mode When inode metadata is changed, we sometimes just call ext4_mark_inode_dirty() to track modified metadata. This copies inode metadata into block buffer which is enough when we are journalling metadata. However when we are running in nojournal mode we currently fail to write the dirtied inode buffer during fsync(2) because the inode is not marked as dirty. Use explicit ext4_write_inode() call to make sure the inode table buffer is written to the disk. This is a band aid solution but proper solution requires a much larger rewrite including changes in metadata bh tracking infrastructure. Reported-by: Free Ekanayaka <free.ekanayaka@gmail.com> Link: https://lore.kernel.org/all/87il8nhxdm.fsf@x1.mail-host-address-is-not-set/ CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20260216164848.3074-4-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:31:43 -04:00
Jan Kara	bd060afa7c	ext4: make recently_deleted() properly work with lazy itable initialization recently_deleted() checks whether inode has been used in the near past. However this can give false positive result when inode table is not initialized yet and we are in fact comparing to random garbage (or stale itable block of a filesystem before mkfs). Ultimately this results in uninitialized inodes being skipped during inode allocation and possibly they are never initialized and thus e2fsck complains. Verify if the inode has been initialized before checking for dtime. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20260216164848.3074-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:30:37 -04:00
Simon Weber	b1d682f199	ext4: fix journal credit check when setting fscrypt context Fix an issue arising when ext4 features has_journal, ea_inode, and encrypt are activated simultaneously, leading to ENOSPC when creating an encrypted file. Fix by passing XATTR_CREATE flag to xattr_set_handle function if a handle is specified, i.e., when the function is called in the control flow of creating a new inode. This aligns the number of jbd2 credits set_handle checks for with the number allocated for creating a new inode. ext4_set_context must not be called with a non-null handle (fs_data) if fscrypt context xattr is not guaranteed to not exist yet. The only other usage of this function currently is when handling the ioctl FS_IOC_SET_ENCRYPTION_POLICY, which calls it with fs_data=NULL. Fixes: `c1a5d5f6ab` ("ext4: improve journal credit handling in set xattr paths") Co-developed-by: Anthony Durrer <anthonydev@fastmail.com> Signed-off-by: Anthony Durrer <anthonydev@fastmail.com> Signed-off-by: Simon Weber <simon.weber.39@gmail.com> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20260207100148.724275-4-simon.weber.39@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:30:25 -04:00
Deepanshu Kartikey	ed9356a30e	ext4: convert inline data to extents when truncate exceeds inline size Add a check in ext4_setattr() to convert files from inline data storage to extent-based storage when truncate() grows the file size beyond the inline capacity. This prevents the filesystem from entering an inconsistent state where the inline data flag is set but the file size exceeds what can be stored inline. Without this fix, the following sequence causes a kernel BUG_ON(): 1. Mount filesystem with inode that has inline flag set and small size 2. truncate(file, 50MB) - grows size but inline flag remains set 3. sendfile() attempts to write data 4. ext4_write_inline_data() hits BUG_ON(write_size > inline_capacity) The crash occurs because ext4_write_inline_data() expects inline storage to accommodate the write, but the actual inline capacity (~60 bytes for i_block + ~96 bytes for xattrs) is far smaller than the file size and write request. The fix checks if the new size from setattr exceeds the inode's actual inline capacity (EXT4_I(inode)->i_inline_size) and converts the file to extent-based storage before proceeding with the size change. This addresses the root cause by ensuring the inline data flag and file size remain consistent during truncate operations. Reported-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7de5fe447862fc37576f Tested-by: syzbot+7de5fe447862fc37576f@syzkaller.appspotmail.com Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com> Link: https://patch.msgid.link/20260207043607.1175976-1-kartikey406@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:29:54 -04:00
Jan Kara	f4a2b42e78	ext4: fix stale xarray tags after writeback There are cases where ext4_bio_write_page() gets called for a page which has no buffers to submit. This happens e.g. when the part of the file is actually a hole, when we cannot allocate blocks due to being called from jbd2, or in data=journal mode when checkpointing writes the buffers earlier. In these cases we just return from ext4_bio_write_page() however if the page didn't need redirtying, we will leave stale DIRTY and/or TOWRITE tags in xarray because those get cleared only in __folio_start_writeback(). As a result we can leave these tags set in mappings even after a final sync on filesystem that's getting remounted read-only or that's being frozen. Various assertions can then get upset when writeback is started on such filesystems (Gerald reported assertion in ext4_journal_check_start() firing). Fix the problem by cycling the page through writeback state even if we decide nothing needs to be written for it so that xarray tags get properly updated. This is slightly silly (we could update the xarray tags directly) but I don't think a special helper messing with xarray tags is really worth it in this relatively rare corner case. Reported-by: Gerald Yang <gerald.yang@canonical.com> Link: https://lore.kernel.org/all/20260128074515.2028982-1-gerald.yang@canonical.com Fixes: `dff4ac75ee` ("ext4: move keep_towrite handling to ext4_bio_write_page()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260205092223.21287-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:29:39 -04:00
Zhang Yi	84e21e3fb8	ext4: do not check fast symlink during orphan recovery Commit '5f920d5d6083 ("ext4: verify fast symlink length")' causes the generic/475 test to fail during orphan cleanup of zero-length symlinks. generic/475 84s ... _check_generic_filesystem: filesystem on /dev/vde is inconsistent The fsck reports are provided below: Deleted inode 9686 has zero dtime. Deleted inode 158230 has zero dtime. ... Inode bitmap differences: -9686 -158230 Orphan file (inode 12) block 13 is not clean. Failed to initialize orphan file. In ext4_symlink(), a newly created symlink can be added to the orphan list due to ENOSPC. Its data has not been initialized, and its size is zero. Therefore, we need to disregard the length check of the symbolic link when cleaning up orphan inodes. Instead, we should ensure that the nlink count is zero. Fixes: `5f920d5d60` ("ext4: verify fast symlink length") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260131091156.1733648-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org	2026-03-27 23:14:25 -04:00
Theodore Ts'o	82c4f23d27	Update MAINTAINERS file to add reviewers for ext4 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>	2026-03-27 23:14:20 -04:00
Linus Torvalds	be762d8b6d	Merge tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - PMBus driver fixes: - Add mutex protection for regulator operations - Fix reading from "write-only" attributes - Mark lowest/average/highest/rated attributes as read-only - isl68137: Add mutex protection for AVS enable sysfs attributes - ina233: Fix error handling and sign extension when reading shunt voltage - adm1177: Fix sysfs ABI violation and current unit conversion - peci: Fix off-by-one in cputemp_is_visible(), and crit_hyst returning delta instead of absolute temperature * tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (pmbus/core) Protect regulator operations with mutex hwmon: (pmbus) Introduce the concept of "write-only" attributes hwmon: (pmbus) Mark lowest/average/highest/rated attributes as read-only hwmon: (adm1177) fix sysfs ABI violation and current unit conversion hwmon: (peci/cputemp) Fix off-by-one in cputemp_is_visible() hwmon: (peci/cputemp) Fix crit_hyst returning delta instead of absolute temperature hwmon: (pmbus/isl68137) Add mutex protection for AVS enable sysfs attributes hwmon: (pmbus/ina233) Fix error handling and sign extension in shunt voltage read	2026-03-27 20:02:34 -07:00
Linus Torvalds	afb54c1404	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Driver (and enclosure) only fixes. Most are obvious. The big change is in the tcm_loop driver to add command draining to error handling (the lack of which was causing hangs with the potential for double use crashes)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: file: Use kzalloc_flex for aio_cmd scsi: scsi_transport_sas: Fix the maximum channel scanning issue scsi: target: tcm_loop: Drain commands in target_reset handler scsi: ibmvfc: Fix OOB access in ibmvfc_discover_targets_done() scsi: ses: Handle positive SCSI error from ses_recv_diag()	2026-03-27 19:58:22 -07:00
Linus Torvalds	26df51adf3	Merge tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Weekly fixes, still a bit busy, but the usual suspects amdgpu and i915/xe have a bunch of small fixes, and otherwise it's just a few minor driver fixes. loognsoon: - update MAINTAINERS shmem: - fault handler fix syncobj: - fix GFP flags amdgpu: - DSC fix - Module parameter parsing fix - PASID reuse fix - drm_edid leak fix - SMU 13.x fixes - SMU 14.x fix - Fence fix in amdgpu_amdkfd_submit_ib() - LVDS fixes - GPU page fault fix for non-4K pages amdkfd: - Ordering fix in kfd_ioctl_create_process() i915/display: - DP tunnel error handling fix - Spurious GMBUS timeout fix - Unlink NV12 planes earlier - Order OP vs. timeout correctly in __wait_for() xe: - Fix UAF in SRIOV migration restore - Updates to HW W/a - VMBind remap fix ivpu: - poweroff fix mediatek: - fix register ordering" * tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel: (25 commits) MAINTAINERS: Update GPU driver maintainer information drm/xe: always keep track of remap prev/next drm/syncobj: Fix xa_alloc allocation flags drm/amd/display: Fix DCE LVDS handling drm/amdgpu: Handle GPU page faults correctly on non-4K page systems drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14 drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process drm/amd/display: check if ext_caps is valid in BL setup drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib drm/xe: Implement recent spec updates to Wa_16025250150 accel/ivpu: Add disable clock relinquish workaround for NVL-A0 drm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic state drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13 drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6 drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6 drm/amd/display: Fix drm_edid leak in amdgpu_dm drm/amdgpu: prevent immediate PASID reuse case drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3) drm/amd/display: Do not skip unrelated mode changes in DSC validation drm/xe/pf: Fix use-after-free in migration restore ...	2026-03-27 17:21:37 -07:00
Vasily Gorbik	0738d395aa	s390/entry: Scrub r12 register on kernel entry Before commit `f33f2d4c7c` ("s390/bp: remove TIF_ISOLATE_BP"), all entry handlers loaded r12 with the current task pointer (lg %r12,__LC_CURRENT) for use by the BPENTER/BPEXIT macros. That commit removed TIF_ISOLATE_BP, dropping both the branch prediction macros and the r12 load, but did not add r12 to the register clearing sequence. Add the missing xgr %r12,%r12 to make the register scrub consistent across all entry points. Fixes: `f33f2d4c7c` ("s390/bp: remove TIF_ISOLATE_BP") Cc: stable@kernel.org Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2026-03-28 00:43:39 +01:00
Greg Kroah-Hartman	48b8814e25	s390/syscalls: Add spectre boundary for syscall dispatch table The s390 syscall number is directly controlled by userspace, but does not have an array_index_nospec() boundary to prevent access past the syscall function pointer tables. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Fixes: `56e62a7370` ("s390: convert to generic entry") Cc: stable@kernel.org Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/2026032404-sterling-swoosh-43e6@gregkh Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2026-03-28 00:43:39 +01:00
Vasily Gorbik	c5c0a268b3	s390/barrier: Make array_index_mask_nospec() __always_inline Mark array_index_mask_nospec() as __always_inline to guarantee the mitigation is emitted inline regardless of compiler inlining decisions. Fixes: `e2dd833389` ("s390: add optimized array_index_mask_nospec") Cc: stable@kernel.org Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2026-03-28 00:43:24 +01:00
Linus Torvalds	335c9017e3	Merge tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "There are two core fixes here. One is from Johan dealing with an issue introduced by a devm_ API usage update causing things to be freed earlier than they had earlier when we fail to register a device, another from Danilo avoids unlocked acccess to data by converting to use a driver core API. We also have a few relatively minor driver specific fixes" * tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-fsl-lpspi: fix teardown order issue (UAF) spi: fix use-after-free on managed registration failure spi: use generic driver_override infrastructure spi: meson-spicc: Fix double-put in remove path spi: sn-f-ospi: Use devm_mutex_init() to simplify code spi: sn-f-ospi: Fix resource leak in f_ospi_probe()	2026-03-27 16:38:55 -07:00
Linus Torvalds	cd0bbd5a66	Merge tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "A fix from Alice for the rust bindings, they didn't handle the stub implementation of the C API used when CONFIG_REGULATOR is disabled leading to undefined behaviour" * tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: rust: regulator: do not assume that regulator_get() returns non-null	2026-03-27 16:36:23 -07:00
Linus Torvalds	30052002e6	Merge tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "A fix from Andy Shevchenko for an issue with caching of page selector registers which are located inside the page they are switching" * tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: Synchronize cache for the page selector	2026-03-27 16:34:25 -07:00
Linus Torvalds	dd09eb4433	Merge tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm Pull tsm fix from Dan Williams: - Fix a VMM controlled buffer length used to emit TDX attestation reports * tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: virt: tdx-guest: Fix handling of host controlled 'quote' buffer length	2026-03-27 16:19:51 -07:00
Linus Torvalds	faf44e54f6	Merge tag 'vfio-v7.0-rc6' of https://github.com/awilliam/linux-vfio Pull VFIO fix from Alex Williamson: - Fix double-free and reference count underflow if dma-buf file allocation fails (Alex Williamson) * tag 'vfio-v7.0-rc6' of https://github.com/awilliam/linux-vfio: vfio/pci: Fix double free in dma-buf feature	2026-03-27 15:59:30 -07:00
Linus Torvalds	56bea42415	Merge tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "Fix a potential buffer overrun issue introduced by the previous fix for EFI boot services region reservations on x86" * tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size	2026-03-27 15:55:25 -07:00
Linus Torvalds	a361474ba3	Merge tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix missing NULL checks for kstrdup(), workaround LS2K/LS7A GPU DMA hang bug, emit GNU_EH_FRAME for vDSO correctly, and fix some KVM-related bugs" * tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Fix base address calculation in kvm_eiointc_regs_access() LoongArch: KVM: Handle the case that EIOINTC's coremap is empty LoongArch: KVM: Make kvm_get_vcpu_by_cpuid() more robust LoongArch: vDSO: Emit GNU_EH_FRAME correctly LoongArch: Workaround LS2K/LS7A GPU DMA hang bug LoongArch: Fix missing NULL checks for kstrdup()	2026-03-27 15:39:41 -07:00
Linus Torvalds	196ef74abd	Merge tag 'io_uring-7.0-20260327' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: "Just two small fixes, both fixing regressions added in the fdinfo code in 6.19 with the SQE mixed size support" * tag 'io_uring-7.0-20260327' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring/fdinfo: fix OOB read in SQE_MIXED wrap check io_uring/fdinfo: fix SQE_MIXED SQE displaying	2026-03-27 15:35:38 -07:00
Dave Airlie	5ba61d8a25	Merge tag 'mediatek-drm-fixes-20260323' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20260323 1. dsi: Store driver data before invoking mipi_dsi_host_register Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patch.msgid.link/20260323160135.39609-1-chunkuang.hu@kernel.org	2026-03-28 08:05:36 +10:00
Sean Christopherson	df83746075	KVM: x86/mmu: Only WARN in direct MMUs when overwriting shadow-present SPTE Adjust KVM's sanity check against overwriting a shadow-present SPTE with a another SPTE with a different target PFN to only apply to direct MMUs, i.e. only to MMUs without shadowed gPTEs. While it's impossible for KVM to overwrite a shadow-present SPTE in response to a guest write, writes from outside the scope of KVM, e.g. from host userspace, aren't detected by KVM's write tracking and so can break KVM's shadow paging rules. ------------[ cut here ]------------ pfn != spte_to_pfn(*sptep) WARNING: arch/x86/kvm/mmu/mmu.c:3069 at mmu_set_spte+0x1e4/0x440 [kvm], CPU#0: vmx_ept_stale_r/872 Modules linked in: kvm_intel kvm irqbypass CPU: 0 UID: 1000 PID: 872 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:mmu_set_spte+0x1e4/0x440 [kvm] Call Trace: <TASK> ept_page_fault+0x535/0x7f0 [kvm] kvm_mmu_do_page_fault+0xee/0x1f0 [kvm] kvm_mmu_page_fault+0x8d/0x620 [kvm] vmx_handle_exit+0x18c/0x5a0 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm] kvm_vcpu_ioctl+0x2d5/0x980 [kvm] __x64_sys_ioctl+0x8a/0xd0 do_syscall_64+0xb5/0x730 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> ---[ end trace 0000000000000000 ]--- Fixes: `11d4517511` ("KVM: x86/mmu: Warn if PFN changes on shadow-present SPTE in shadow MMU") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-03-27 22:33:33 +01:00
Sean Christopherson	aad885e774	KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE When installing an emulated MMIO SPTE, do so after dropping/zapping the existing SPTE (if it's shadow-present). While commit `a54aa15c6b` was right about it being impossible to convert a shadow-present SPTE to an MMIO SPTE due to a _guest_ write, it failed to account for writes to guest memory that are outside the scope of KVM. E.g. if host userspace modifies a shadowed gPTE to switch from a memslot to emulted MMIO and then the guest hits a relevant page fault, KVM will install the MMIO SPTE without first zapping the shadow-present SPTE. ------------[ cut here ]------------ is_shadow_present_pte(*sptep) WARNING: arch/x86/kvm/mmu/mmu.c:484 at mark_mmio_spte+0xb2/0xc0 [kvm], CPU#0: vmx_ept_stale_r/4292 Modules linked in: kvm_intel kvm irqbypass CPU: 0 UID: 1000 PID: 4292 Comm: vmx_ept_stale_r Not tainted 7.0.0-rc2-eafebd2d2ab0-sink-vm #319 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:mark_mmio_spte+0xb2/0xc0 [kvm] Call Trace: <TASK> mmu_set_spte+0x237/0x440 [kvm] ept_page_fault+0x535/0x7f0 [kvm] kvm_mmu_do_page_fault+0xee/0x1f0 [kvm] kvm_mmu_page_fault+0x8d/0x620 [kvm] vmx_handle_exit+0x18c/0x5a0 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xc55/0x1c20 [kvm] kvm_vcpu_ioctl+0x2d5/0x980 [kvm] __x64_sys_ioctl+0x8a/0xd0 do_syscall_64+0xb5/0x730 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x47fa3f </TASK> ---[ end trace 0000000000000000 ]--- Reported-by: Alexander Bulekov <bkov@amazon.com> Debugged-by: Alexander Bulekov <bkov@amazon.com> Suggested-by: Fred Griffoul <fgriffo@amazon.co.uk> Fixes: `a54aa15c6b` ("KVM: x86/mmu: Handle MMIO SPTEs directly in mmu_set_spte()") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-03-27 22:33:33 +01:00
Paolo Bonzini	6c6ba54895	Merge tag 'kvm-s390-master-7.0-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: More memory management fixes Lots of small and not-so-small fixes for the newly rewritten gmap, mostly affecting the handling of nested guests.	2026-03-27 22:30:44 +01:00
Linus Torvalds	7df48e3631	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: - Quite a few irdma bug fixes, several user triggerable - Fix a 0 SMAC header in ionic - Tolerate FW errors for RAAS in bng_re - Don't UAF in efa when printing error events - Better handle pool exhaustion in the new bvec paths * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/irdma: Harden depth calculation functions RDMA/irdma: Return EINVAL for invalid arp index error RDMA/irdma: Fix deadlock during netdev reset with active connections RDMA/irdma: Remove reset check from irdma_modify_qp_to_err() RDMA/irdma: Clean up unnecessary dereference of event->cm_node RDMA/irdma: Remove a NOP wait_event() in irdma_modify_qp_roce() RDMA/irdma: Update ibqp state to error if QP is already in error state RDMA/irdma: Initialize free_qp completion before using it RDMA/efa: Fix possible deadlock RDMA/rw: Fix MR pool exhaustion in bvec RDMA READ path RDMA/rw: Fall back to direct SGE on MR pool exhaustion RDMA/efa: Fix use of completion ctx after free RDMA/bng_re: Fix silent failure in HWRM version query RDMA/ionic: Preserve and set Ethernet source MAC after ib_ud_header_init() RDMA/irdma: Fix double free related to rereg_user_mr	2026-03-27 13:30:04 -07:00
Linus Torvalds	8af4fad545	Merge tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Remove power-off from pwrctrl drivers since this is now done directly by the PCI controller drivers (Chen-Yu Tsai) - Fix pwrctrl device node leak (Felix Gu) - Document a TLP header decoder for AER log messages (Lukas Wunner) * tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: Documentation: PCI: Document PCIe TLP Header decoder for AER messages PCI/pwrctrl: Fix pci_pwrctrl_is_required() device node leak PCI/pwrctrl: Do not power off on pwrctrl device removal	2026-03-27 13:25:58 -07:00
Linus Torvalds	83ce1c753f	Merge tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "This became slightly big partly due to my time off in the last week. But all changes are about device-specific fixes, so it should be safely applicable. ASoC: - Fix double free in sma1307 - Fix uninitialized variables in simple-card-utils/imx-card - Address clock leaks and error propagation in ADAU1372 - Add DMI quirks and ACP/SDW support for ASUS - Fix Intel CATPT DMA mask - Fix SOF topology parsing - Fix DT bindings for RK3576 SPDIF, STM32 SAI and WCD934x HD-audio: - Quirks for Lenovo, ASUS, and various HP models, as well as a speaker pop fix on Star Labs StarFighter - Revert MSI X870E Tomahawk denylist again USB-Audio: - Fix distorted audio on Focusrite Scarlett 2i2/2i4 1st Gen - Add iface reset quirk for AB17X - Update Qualcomm USB audio Kconfig dependencies and license Misc: - Fix minor compile warnings for firewire and asihpi drivers" * tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits) Revert "ALSA: hda/intel: Add MSI X870E Tomahawk to denylist" ALSA: usb-audio: Add iface reset and delay quirk for AB17X USB Audio ALSA: hda/realtek: add HP Laptop 15-fd0xxx mute LED quirk ALSA: usb-audio: Exclude Scarlett 2i4 1st Gen from SKIP_IFACE_SETUP ALSA: hda/realtek: Add mute LED quirk for HP Pavilion 15-eg0xxx ALSA: hda/realtek - Fixed Speaker Mute LED for HP EliteBoard G1a platform ASoC: SOF: ipc4-topology: Allow bytes controls without initial payload ASoC: adau1372: Fix clock leak on PLL lock failure ASoC: adau1372: Fix unchecked clk_prepare_enable() return value ASoC: SDCA: fix finding wrong entity ASoC: SDCA: remove the max count of initialization table ASoC: codecs: wcd934x: fix typo in dt parsing ASoC: dt-bindings: stm32: Fix incorrect compatible string in stm32h7-sai match ASoC: Intel: catpt: Fix the device initialization ASoC: amd: acp: add ASUS HN7306EA quirk for legacy SDW machine ASoC: SOF: topology: reject invalid vendor array size in token parser ASoC: tas2781: Add null check for calibration data ALSA: asihpi: avoid write overflow check warning ASoC: fsl: imx-card: initialize playback_only and capture_only ASoC: simple-card-utils: Check value of is_playback_only and is_capture_only ...	2026-03-27 13:16:40 -07:00
Linus Torvalds	f44c65111e	Merge tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - uvcvideo may cause OOPS when out of memory - remove a deadlock in the ccs driver * tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: ccs: Avoid deadlock in ccs_init_state() media: uvcvideo: Fix bug in error path of uvc_alloc_urb_buffers	2026-03-27 13:10:49 -07:00
Linus Torvalds	0b8bf3b64e	Merge tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl fix from Joel Granados: "Fix uninitialized variable error when writing to a sysctl bitmap Removed the possibility of returning an unjustified -EINVAL when writing to a sysctl bitmap" * tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: fix uninitialized variable in proc_do_large_bitmap	2026-03-27 13:04:34 -07:00
Linus Torvalds	3577cfd738	Merge tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: "This includes a few important bug fixes, and some code refactoring that was necessary for one of the fixes" * tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: remove file_path tracepoint data xfs: don't irele after failing to iget in xfs_attri_recover_work xfs: remove redundant validation in xlog_recover_attri_commit_pass2 xfs: fix ri_total validation in xlog_recover_attri_commit_pass2 xfs: close crash window in attr dabtree inactivation xfs: factor out xfs_attr3_leaf_init xfs: factor out xfs_attr3_node_entry_remove xfs: only assert new size for datafork during truncate extents xfs: annotate struct xfs_attr_list_context with __counted_by_ptr xfs: cleanup buftarg handling in XFS_IOC_VERIFY_MEDIA xfs: scrub: unlock dquot before early return in quota scrub xfs: refactor xfsaild_push loop into helper xfs: save ailp before dropping the AIL lock in push callbacks xfs: avoid dereferencing log items after push callbacks xfs: stop reclaim before pushing AIL during unmount	2026-03-27 12:22:45 -07:00
Luo Haiyang	1f98857322	tracing: Fix potential deadlock in cpu hotplug with osnoise The following sequence may leads deadlock in cpu hotplug: task1 task2 task3 ----- ----- ----- mutex_lock(&interface_lock) [CPU GOING OFFLINE] cpus_write_lock(); osnoise_cpu_die(); kthread_stop(task3); wait_for_completion(); osnoise_sleep(); mutex_lock(&interface_lock); cpus_read_lock(); [DEAD LOCK] Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock). Cc: stable@vger.kernel.org Cc: <mathieu.desnoyers@efficios.com> Cc: <zhang.run@zte.com.cn> Cc: <yang.tao172@zte.com.cn> Cc: <ran.xiaokai@zte.com.cn> Fixes: `bce29ac9ce` ("trace: Add osnoise tracer") Link: https://patch.msgid.link/20260326141953414bVSj33dAYktqp9Oiyizq8@zte.com.cn Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-27 15:18:06 -04:00
Sachin Mokashi	ebbe5d957e	ASoC: Intel: ehl_rt5660: remove unused macro definitions DUAL_CHANNEL and NAME_SIZE macros are not being used (anymore) but the macros are still defined. Remove them to clean up dead code. Signed-off-by: Sachin Mokashi <sachin.mokashi@intel.com> Link: https://patch.msgid.link/20260324163400.1276247-1-sachin.mokashi@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-03-27 19:13:56 +00:00
Shuming Fan	ae00200acb	ASoC: SDCA: fix the register to ctl value conversion for Q7.8 format The division calculation should be implemented using signed integer format. This patch changes mc->shift from an unsigned type to a signed integer during the calculation. Fixes: `501efdcb3b` ("ASoC: SDCA: Pull the Q7.8 volume helpers out of soc-ops") Signed-off-by: Shuming Fan <shumingf@realtek.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20260327082331.2277498-1-shumingf@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-03-27 19:12:33 +00:00

1 2 3 4 5 ...

1429592 Commits