linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-10 22:59:03 -04:00

Author	SHA1	Message	Date
Linus Torvalds	b10c31b70b	Merge tag 'for-6.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix delayed inode tracking in xarray, eviction can race with insertion and leave behind a disconnected inode - on systems with large page (64K) and small block size (4K) fix compression read that can return partially filled folio - slightly relax compression option format for backward compatibility, allow to specify level for LZO although there's only one - fix simple quota accounting of compressed extents - validate minimum device size in 'device add' - update maintainers' entry * tag 'for-6.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: don't allow adding block device of less than 1 MB MAINTAINERS: update btrfs entry btrfs: fix subvolume deletion lockup caused by inodes xarray race btrfs: fix corruption reading compressed range when block size is smaller than page size btrfs: accept and ignore compression level for lzo btrfs: fix squota compressed stats leak	2025-09-11 08:01:18 -07:00
Linus Torvalds	4f553c1e2c	Merge tag 'mm-hotfixes-stable-2025-09-10-20-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "20 hotfixes. 15 are cc:stable and the remainder address post-6.16 issues or aren't considered necessary for -stable kernels. 14 of these fixes are for MM. This includes - kexec fixes from Breno for a recently introduced use-uninitialized bug - DAMON fixes from Quanmin Yan to avoid div-by-zero crashes which can occur if the operator uses poorly-chosen insmod parameters and misc singleton fixes" * tag 'mm-hotfixes-stable-2025-09-10-20-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: add tree entry to numa memblocks and emulation block mm/damon/sysfs: fix use-after-free in state_show() proc: fix type confusion in pde_set_flags() compiler-clang.h: define __SANITIZE_*__ macros only when undefined mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc() ocfs2: fix recursive semaphore deadlock in fiemap call mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory mm/mremap: fix regression in vrm->new_addr check percpu: fix race on alloc failed warning limit mm/memory-failure: fix redundant updates for already poisoned pages s390: kexec: initialize kexec_buf struct riscv: kexec: initialize kexec_buf struct arm64: kexec: initialize kexec_buf struct in load_other_segments() mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters() mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters() mm/damon/core: set quota->charged_from to jiffies at first charge window mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range() init/main.c: fix boot time tracing crash mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range() mm/khugepaged: fix the address passed to notifier on testing young	2025-09-10 21:19:34 -07:00
Linus Torvalds	7aac71907b	Merge tag 'nfs-for-6.17-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client fixes from Trond Myklebust: "Stable patches: - Revert "SUNRPC: Don't allow waiting for exiting tasks" as it is breaking ltp tests Bugfixes: - Another set of fixes to the tracking of NFSv4 server capabilities when crossing filesystem boundaries - Localio fix to restore credentials and prevent triggering a BUG_ON() - Fix to prevent flapping of the localio on/off trigger - Protections against 'eof page pollution' as demonstrated in xfstests generic/363 - Series of patches to ensure correct ordering of O_DIRECT i/o and truncate, fallocate and copy functions - Fix a NULL pointer check in flexfiles reads that regresses 6.17 - Correct a typo that breaks flexfiles layout segment processing" * tag 'nfs-for-6.17-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4/flexfiles: Fix layout merge mirror check. SUNRPC: call xs_sock_process_cmsg for all cmsg Revert "SUNRPC: Don't allow waiting for exiting tasks" NFS: Fix the marking of the folio as up to date NFS: nfs_invalidate_folio() must observe the offset and size arguments NFSv4.2: Serialise O_DIRECT i/o and copy range NFSv4.2: Serialise O_DIRECT i/o and clone range NFSv4.2: Serialise O_DIRECT i/o and fallocate() NFS: Serialise O_DIRECT i/o and truncate() NFSv4.2: Protect copy offload and clone against 'eof page pollution' NFS: Protect against 'eof page pollution' flexfiles/pNFS: fix NULL checks on result of ff_layout_choose_ds_for_read nfs/localio: avoid bouncing LOCALIO if nfs_client_is_local() nfs/localio: restore creds before releasing pageio data NFSv4: Clear the NFS_CAP_XATTR flag if not supported by the server NFSv4: Clear NFS_CAP_OPEN_XOR and NFS_CAP_DELEGTIME if not supported NFSv4: Clear the NFS_CAP_FS_LOCATIONS flag if it is not set NFSv4: Don't clear capabilities that won't be reset	2025-09-10 12:38:41 -07:00
wangzijie	0ce9398aa0	proc: fix type confusion in pde_set_flags() Commit `2ce3d282bd` ("proc: fix missing pde_set_flags() for net proc files") missed a key part in the definition of proc_dir_entry: union { const struct proc_ops proc_ops; const struct file_operations proc_dir_ops; }; So dereference of ->proc_ops assumes it is a proc_ops structure results in type confusion and make NULL check for 'proc_ops' not work for proc dir. Add !S_ISDIR(dp->mode) test before calling pde_set_flags() to fix it. Link: https://lkml.kernel.org/r/20250904135715.3972782-1-wangzijie1@honor.com Fixes: `2ce3d282bd` ("proc: fix missing pde_set_flags() for net proc files") Signed-off-by: wangzijie <wangzijie1@honor.com> Reported-by: Brad Spengler <spender@grsecurity.net> Closes: https://lore.kernel.org/all/20250903065758.3678537-1-wangzijie1@honor.com/ Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-08 23:45:12 -07:00
Mark Tinguely	04100f775c	ocfs2: fix recursive semaphore deadlock in fiemap call syzbot detected a OCFS2 hang due to a recursive semaphore on a FS_IOC_FIEMAP of the extent list on a specially crafted mmap file. context_switch kernel/sched/core.c:5357 [inline] __schedule+0x1798/0x4cc0 kernel/sched/core.c:6961 __schedule_loop kernel/sched/core.c:7043 [inline] schedule+0x165/0x360 kernel/sched/core.c:7058 schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7115 rwsem_down_write_slowpath+0x872/0xfe0 kernel/locking/rwsem.c:1185 __down_write_common kernel/locking/rwsem.c:1317 [inline] __down_write kernel/locking/rwsem.c:1326 [inline] down_write+0x1ab/0x1f0 kernel/locking/rwsem.c:1591 ocfs2_page_mkwrite+0x2ff/0xc40 fs/ocfs2/mmap.c:142 do_page_mkwrite+0x14d/0x310 mm/memory.c:3361 wp_page_shared mm/memory.c:3762 [inline] do_wp_page+0x268d/0x5800 mm/memory.c:3981 handle_pte_fault mm/memory.c:6068 [inline] __handle_mm_fault+0x1033/0x5440 mm/memory.c:6195 handle_mm_fault+0x40a/0x8e0 mm/memory.c:6364 do_user_addr_fault+0x764/0x1390 arch/x86/mm/fault.c:1387 handle_page_fault arch/x86/mm/fault.c:1476 [inline] exc_page_fault+0x76/0xf0 arch/x86/mm/fault.c:1532 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623 RIP: 0010:copy_user_generic arch/x86/include/asm/uaccess_64.h:126 [inline] RIP: 0010:raw_copy_to_user arch/x86/include/asm/uaccess_64.h:147 [inline] RIP: 0010:_inline_copy_to_user include/linux/uaccess.h:197 [inline] RIP: 0010:_copy_to_user+0x85/0xb0 lib/usercopy.c:26 Code: e8 00 bc f7 fc 4d 39 fc 72 3d 4d 39 ec 77 38 e8 91 b9 f7 fc 4c 89 f7 89 de e8 47 25 5b fd 0f 01 cb 4c 89 ff 48 89 d9 4c 89 f6 <f3> a4 0f 1f 00 48 89 cb 0f 01 ca 48 89 d8 5b 41 5c 41 5d 41 5e 41 RSP: 0018:ffffc9000403f950 EFLAGS: 00050256 RAX: ffffffff84c7f101 RBX: 0000000000000038 RCX: 0000000000000038 RDX: 0000000000000000 RSI: ffffc9000403f9e0 RDI: 0000200000000060 RBP: ffffc9000403fa90 R08: ffffc9000403fa17 R09: 1ffff92000807f42 R10: dffffc0000000000 R11: fffff52000807f43 R12: 0000200000000098 R13: 00007ffffffff000 R14: ffffc9000403f9e0 R15: 0000200000000060 copy_to_user include/linux/uaccess.h:225 [inline] fiemap_fill_next_extent+0x1c0/0x390 fs/ioctl.c:145 ocfs2_fiemap+0x888/0xc90 fs/ocfs2/extent_map.c:806 ioctl_fiemap fs/ioctl.c:220 [inline] do_vfs_ioctl+0x1173/0x1430 fs/ioctl.c:532 __do_sys_ioctl fs/ioctl.c:596 [inline] __se_sys_ioctl+0x82/0x170 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f5f13850fd9 RSP: 002b:00007ffe3b3518b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000200000000000 RCX: 00007f5f13850fd9 RDX: 0000200000000040 RSI: 00000000c020660b RDI: 0000000000000004 RBP: 6165627472616568 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3b3518f0 R13: 00007ffe3b351b18 R14: 431bde82d7b634db R15: 00007f5f1389a03b ocfs2_fiemap() takes a read lock of the ip_alloc_sem semaphore (since v2.6.22-527-g7307de80510a) and calls fiemap_fill_next_extent() to read the extent list of this running mmap executable. The user supplied buffer to hold the fiemap information page faults calling ocfs2_page_mkwrite() which will take a write lock (since v2.6.27-38-g00dc417fa3e7) of the same semaphore. This recursive semaphore will hold filesystem locks and causes a hang of the fileystem. The ip_alloc_sem protects the inode extent list and size. Release the read semphore before calling fiemap_fill_next_extent() in ocfs2_fiemap() and ocfs2_fiemap_inline(). This does an unnecessary semaphore lock/unlock on the last extent but simplifies the error path. Link: https://lkml.kernel.org/r/61d1a62b-2631-4f12-81e2-cd689914360b@oracle.com Fixes: `00dc417fa3` ("ocfs2: fiemap support") Signed-off-by: Mark Tinguely <mark.tinguely@oracle.com> Reported-by: syzbot+541dcc6ee768f77103e7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=541dcc6ee768f77103e7 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-08 23:45:11 -07:00
Jonathan Curley	dd2fa82473	NFSv4/flexfiles: Fix layout merge mirror check. Typo in ff_lseg_match_mirrors makes the diff ineffective. This results in merge happening all the time. Merge happening all the time is problematic because it marks lsegs invalid. Marking lsegs invalid causes all outstanding IO to get restarted with EAGAIN and connections to get closed. Closing connections constantly triggers race conditions in the RDMA implementation... Fixes: `660d1eb223` ("pNFS/flexfile: Don't merge layout segments if the mirrors don't match") Signed-off-by: Jonathan Curley <jcurley@purestorage.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-08 14:37:55 -04:00
Linus Torvalds	f777d1112e	Merge tag 'vfs-6.17-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "fuse: - Prevent opening of non-regular backing files. Fuse doesn't support non-regular files anyway. - Check whether copy_file_range() returns a larger size than requested. - Prevent overflow in copy_file_range() as fuse currently only supports 32-bit sized copies. - Cache the blocksize value if the server returned a new value as inode->i_blkbits isn't modified directly anymore. - Fix i_blkbits handling for iomap partial writes. By default i_blkbits is set to PAGE_SIZE which causes iomap to mark the whole folio as uptodate even on a partial write. But fuseblk filesystems support choosing a blocksize smaller than PAGE_SIZE risking data corruption. Simply enforce PAGE_SIZE as blocksize for fuseblk's internal inode for now. - Prevent out-of-bounds acces in fuse_dev_write() when the number of bytes to be retrieved is truncated to the fc->max_pages limit. virtiofs: - Fix page faults for DAX page addresses. Misc: - Tighten file handle decoding from userns. Check that the decoded dentry itself has a valid idmapping in the user namespace. - Fix mount-notify selftests. - Fix some indentation errors. - Add an FMODE_ flag to indicate IOCB_HAS_METADATA availability. This will be moved to an FOP_* flag with a bit more rework needed for that to happen not suitable for a fix. - Don't silently ignore metadata for sync read/write. - Don't pointlessly log warning when reading coredump sysctls" * tag 'vfs-6.17-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fuse: virtio_fs: fix page fault for DAX page address selftests/fs/mount-notify: Fix compilation failure. fhandle: use more consistent rules for decoding file handle from userns fuse: Block access to folio overlimit fuse: fix fuseblk i_blkbits for iomap partial writes fuse: reflect cached blocksize if blocksize was changed fuse: prevent overflow in copy_file_range return value fuse: check if copy_file_range() returns larger than requested size fuse: do not allow mapping a non-regular backing file coredump: don't pointlessly check and spew warnings fs: fix indentation style block: don't silently ignore metadata for sync read/write fs: add a FMODE_ flag to indicate IOCB_HAS_METADATA availability Please enter a commit message to explain why this merge is necessary, especially if it merges an updated upstream into a topic branch.	2025-09-08 07:53:01 -07:00
Trond Myklebust	c12b6a7b12	NFS: Fix the marking of the folio as up to date Since all callers of nfs_page_group_covers_page() have already ensured that there is only one group member, all that is required is to check that the entire folio contains dirty data. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:26 -04:00
Trond Myklebust	b7b8574225	NFS: nfs_invalidate_folio() must observe the offset and size arguments If we're truncating part of the folio, then we need to write out the data on the part that is not covered by the cancellation. Fixes: `d47992f86b` ("mm: change invalidatepage prototype to accept length") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:26 -04:00
Trond Myklebust	ca247c8990	NFSv4.2: Serialise O_DIRECT i/o and copy range Ensure that all O_DIRECT reads and writes complete before copying a file range, so that the destination is up to date. Fixes: `a5864c999d` ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:25 -04:00
Trond Myklebust	c80ebeba11	NFSv4.2: Serialise O_DIRECT i/o and clone range Ensure that all O_DIRECT reads and writes complete before cloning a file range, so that both the source and destination are up to date. Fixes: `a5864c999d` ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:25 -04:00
Trond Myklebust	b93128f297	NFSv4.2: Serialise O_DIRECT i/o and fallocate() Ensure that all O_DIRECT reads and writes complete before calling fallocate so that we don't race w.r.t. attribute updates. Fixes: `99f2378322` ("NFSv4.2: Always flush out writes in nfs42_proc_fallocate()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:25 -04:00
Trond Myklebust	9eb90f4354	NFS: Serialise O_DIRECT i/o and truncate() Ensure that all O_DIRECT reads and writes are complete, and prevent the initiation of new i/o until the setattr operation that will truncate the file is complete. Fixes: `a5864c999d` ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:25 -04:00
Trond Myklebust	b2036bb651	NFSv4.2: Protect copy offload and clone against 'eof page pollution' The NFSv4.2 copy offload and clone functions can also end up extending the size of the destination file, so they too need to call nfs_truncate_last_folio(). Reported-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:25 -04:00
Trond Myklebust	b1817b18ff	NFS: Protect against 'eof page pollution' This commit fixes the failing xfstest 'generic/363'. When the user mmaps() an area that extends beyond the end of file, and proceeds to write data into the folio that straddles that eof, we're required to discard that folio data if the user calls some function that extends the file length. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:25 -04:00
Tigran Mkrtchyan	5a46d2339a	flexfiles/pNFS: fix NULL checks on result of ff_layout_choose_ds_for_read Recent commit `f06bedfa62` ("pNFS/flexfiles: don't attempt pnfs on fatal DS errors") has changed the error return type of ff_layout_choose_ds_for_read() from NULL to an error pointer. However, not all code paths have been updated to match the change. Thus, some non-NULL checks will accept error pointers as a valid return value. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: `f06bedfa62` ("pNFS/flexfiles: don't attempt pnfs on fatal DS errors") Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:25 -04:00
Mike Snitzer	d3684397ea	nfs/localio: avoid bouncing LOCALIO if nfs_client_is_local() Previously nfs_local_probe() was made to disable and then attempt to re-enable LOCALIO (via LOCALIO protocol handshake) if/when it was called and LOCALIO already enabled. Vague memory for _why_ this was the case is that this was useful if/when a local NFS server were to be restarted with a local NFS client connected to it. But as it happens this causes an absurd amount of LOCALIO flapping which has a side-effect of too much IO being needlessly sent to NFSD (using RPC over the loopback network interface). This is the definition of "serious performance loss" (that negates the point of having LOCALIO). So remove this mis-optimization for re-enabling LOCALIO if/when an NFS server is restarted (which is an extremely rare thing to do). Will revisit testing that scenario again but in the meantime this patch restores the full benefit of LOCALIO. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-06 16:51:21 -04:00
Scott Mayhew	992203a1fb	nfs/localio: restore creds before releasing pageio data Otherwise if the nfsd filecache code releases the nfsd_file immediately, it can trigger the BUG_ON(cred == current->cred) in __put_cred() when it puts the nfsd_file->nf_file->f-cred. Fixes: `b9f5dd57f4` ("nfs/localio: use dedicated workqueues for filesystem read and write") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/r/20250807164938.2395136-1-smayhew@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-09-05 15:23:49 -04:00
Linus Torvalds	c2f3b108c0	Merge tag '6.17-RC4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - Fix two potential NULL pointer references - Two debugging improvements (to help debug recent issues) a new tracepoint, and minor improvement to DebugData - Trivial comment cleanup * tag '6.17-RC4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: prevent NULL pointer dereference in UTF16 conversion smb: client: show negotiated cipher in DebugData smb: client: add new tracepoint to trace lease break notification smb: client: fix spellings in comments smb: client: Fix NULL pointer dereference in cifs_debug_dirs_proc_show()	2025-09-05 11:14:23 -07:00
Mark Harmstone	3d1267475b	btrfs: don't allow adding block device of less than 1 MB Commit 15ae0410c37a79 ("btrfs-progs: add error handling for device_get_partition_size_fd_stat()") in btrfs-progs inadvertently changed it so that if the BLKGETSIZE64 ioctl on a block device returned a size of 0, this was no longer seen as an error condition. Unfortunately this is how disconnected NBD devices behave, meaning that with btrfs-progs 6.16 it's now possible to add a device you can't remove: # btrfs device add /dev/nbd0 /root/temp # btrfs device remove /dev/nbd0 /root/temp ERROR: error removing device '/dev/nbd0': Invalid argument This check should always have been done kernel-side anyway, so add a check in btrfs_init_new_device() that the new device doesn't have a size less than BTRFS_DEVICE_RANGE_RESERVED (i.e. 1 MB). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-05 19:52:10 +02:00
Haiyue Wang	e1bf212d06	fuse: virtio_fs: fix page fault for DAX page address The commit `ced17ee32a` ("Revert "virtio: reject shm region if length is zero"") exposes the following DAX page fault bug (this fix the failure that getting shm region alway returns false because of zero length): The commit `21aa65bf82` ("mm: remove callers of pfn_t functionality") handles the DAX physical page address incorrectly: the removed macro 'phys_to_pfn_t()' should be replaced with 'PHYS_PFN()'. [ 1.390321] BUG: unable to handle page fault for address: ffffd3fb40000008 [ 1.390875] #PF: supervisor read access in kernel mode [ 1.391257] #PF: error_code(0x0000) - not-present page [ 1.391509] PGD 0 P4D 0 [ 1.391626] Oops: Oops: 0000 [#1] SMP NOPTI [ 1.391806] CPU: 6 UID: 1000 PID: 162 Comm: weston Not tainted 6.17.0-rc3-WSL2-STABLE #2 PREEMPT(none) [ 1.392361] RIP: 0010:dax_to_folio+0x14/0x60 [ 1.392653] Code: 52 c9 c3 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 c1 ef 05 48 c1 e7 06 48 03 3d 34 b5 31 01 <48> 8b 57 08 48 89 f8 f6 c2 01 75 2b 66 90 c3 cc cc cc cc f7 c7 ff [ 1.393727] RSP: 0000:ffffaf7d04407aa8 EFLAGS: 00010086 [ 1.394003] RAX: 000000a000000000 RBX: ffffaf7d04407bb0 RCX: 0000000000000000 [ 1.394524] RDX: ffffd17b40000008 RSI: 0000000000000083 RDI: ffffd3fb40000000 [ 1.394967] RBP: 0000000000000011 R08: 000000a000000000 R09: 0000000000000000 [ 1.395400] R10: 0000000000001000 R11: ffffaf7d04407c10 R12: 0000000000000000 [ 1.395806] R13: ffffa020557be9c0 R14: 0000014000000001 R15: 0000725970e94000 [ 1.396268] FS: 000072596d6d2ec0(0000) GS:ffffa0222dc59000(0000) knlGS:0000000000000000 [ 1.396715] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.397100] CR2: ffffd3fb40000008 CR3: 000000011579c005 CR4: 0000000000372ef0 [ 1.397518] Call Trace: [ 1.397663] <TASK> [ 1.397900] dax_insert_entry+0x13b/0x390 [ 1.398179] dax_fault_iter+0x2a5/0x6c0 [ 1.398443] dax_iomap_pte_fault+0x193/0x3c0 [ 1.398750] __fuse_dax_fault+0x8b/0x270 [ 1.398997] ? vm_mmap_pgoff+0x161/0x210 [ 1.399175] __do_fault+0x30/0x180 [ 1.399360] do_fault+0xc4/0x550 [ 1.399547] __handle_mm_fault+0x8e3/0xf50 [ 1.399731] ? do_syscall_64+0x72/0x1e0 [ 1.399958] handle_mm_fault+0x192/0x2f0 [ 1.400204] do_user_addr_fault+0x20e/0x700 [ 1.400418] exc_page_fault+0x66/0x150 [ 1.400602] asm_exc_page_fault+0x26/0x30 [ 1.400831] RIP: 0033:0x72596d1bf703 [ 1.401076] Code: 31 f6 45 31 e4 48 8d 15 b3 73 00 00 e8 06 03 00 00 8b 83 68 01 00 00 e9 8e fa ff ff 0f 1f 00 48 8b 44 24 08 4c 89 ee 48 89 df <c7> 00 21 43 34 12 e8 72 09 00 00 e9 6a fa ff ff 0f 1f 44 00 00 e8 [ 1.402172] RSP: 002b:00007ffc350f6dc0 EFLAGS: 00010202 [ 1.402488] RAX: 0000725970e94000 RBX: 00005b7c642c2560 RCX: 0000725970d359a7 [ 1.402898] RDX: 0000000000000003 RSI: 00007ffc350f6dc0 RDI: 00005b7c642c2560 [ 1.403284] RBP: 00007ffc350f6e90 R08: 000000000000000d R09: 0000000000000000 [ 1.403634] R10: 00007ffc350f6dd8 R11: 0000000000000246 R12: 0000000000000001 [ 1.404078] R13: 00007ffc350f6dc0 R14: 0000725970e29ce0 R15: 0000000000000003 [ 1.404450] </TASK> [ 1.404570] Modules linked in: [ 1.404821] CR2: ffffd3fb40000008 [ 1.405029] ---[ end trace 0000000000000000 ]--- [ 1.405323] RIP: 0010:dax_to_folio+0x14/0x60 [ 1.405556] Code: 52 c9 c3 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 c1 ef 05 48 c1 e7 06 48 03 3d 34 b5 31 01 <48> 8b 57 08 48 89 f8 f6 c2 01 75 2b 66 90 c3 cc cc cc cc f7 c7 ff [ 1.406639] RSP: 0000:ffffaf7d04407aa8 EFLAGS: 00010086 [ 1.406910] RAX: 000000a000000000 RBX: ffffaf7d04407bb0 RCX: 0000000000000000 [ 1.407379] RDX: ffffd17b40000008 RSI: 0000000000000083 RDI: ffffd3fb40000000 [ 1.407800] RBP: 0000000000000011 R08: 000000a000000000 R09: 0000000000000000 [ 1.408246] R10: 0000000000001000 R11: ffffaf7d04407c10 R12: 0000000000000000 [ 1.408666] R13: ffffa020557be9c0 R14: 0000014000000001 R15: 0000725970e94000 [ 1.409170] FS: 000072596d6d2ec0(0000) GS:ffffa0222dc59000(0000) knlGS:0000000000000000 [ 1.409608] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.409977] CR2: ffffd3fb40000008 CR3: 000000011579c005 CR4: 0000000000372ef0 [ 1.410437] Kernel panic - not syncing: Fatal exception [ 1.410857] Kernel Offset: 0xc000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Fixes: `21aa65bf82` ("mm: remove callers of pfn_t functionality") Signed-off-by: Haiyue Wang <haiyuewa@163.com> Link: https://lore.kernel.org/20250904120339.972-1-haiyuewa@163.com Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-05 15:56:30 +02:00
Makar Semyonov	70bccd9855	cifs: prevent NULL pointer dereference in UTF16 conversion There can be a NULL pointer dereference bug here. NULL is passed to __cifs_sfu_make_node without checks, which passes it unchecked to cifs_strndup_to_utf16, which in turn passes it to cifs_local_to_utf16_bytes where '*from' is dereferenced, causing a crash. This patch adds a check for NULL 'src' in cifs_strndup_to_utf16 and returns NULL early to prevent dereferencing NULL pointer. Found by Linux Verification Center (linuxtesting.org) with SVACE Signed-off-by: Makar Semyonov <m.semenov@tssltd.ru> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-09-04 11:43:31 -05:00
Linus Torvalds	08b06c30a4	Merge tag 'v6.17-rc4-ksmbd-fix' of git://git.samba.org/ksmbd Pull smb server fix from Steve French: - fix handling filenames with ":" (colon) in them * tag 'v6.17-rc4-ksmbd-fix' of git://git.samba.org/ksmbd: ksmbd: allow a filename to contain colons on SMB3.1.1 posix extensions	2025-09-03 20:44:15 -07:00
Bharath SM	91be128b49	smb: client: show negotiated cipher in DebugData Print the negotiated encryption cipher type in DebugData Signed-off-by: Bharath SM <bharathsm@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-09-02 20:38:00 -05:00
Bharath SM	72595cb6da	smb: client: add new tracepoint to trace lease break notification Add smb3_lease_break_enter to trace lease break notifications, recording lease state, flags, epoch, and lease key. Align smb3_lease_not_found to use the same payload and print format. Signed-off-by: Bharath SM <bharathsm@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-09-02 20:37:44 -05:00
Bharath SM	0c3813d855	smb: client: fix spellings in comments correct spellings in comments Signed-off-by: Bharath SM <bharathsm@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-09-02 20:37:17 -05:00
Linus Torvalds	8026aed072	Merge tag 'mm-hotfixes-stable-2025-09-01-17-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 13 are cc:stable and the remainder address post-6.16 issues or aren't considered necessary for -stable kernels. 11 of these fixes are for MM. This includes a three-patch series from Harry Yoo which fixes an intermittent boot failure which can occur on x86 systems. And a two-patch series from Alexander Gordeev which fixes a KASAN crash on S390 systems" * tag 'mm-hotfixes-stable-2025-09-01-17-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: fix possible deadlock in kmemleak x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings() mm: introduce and use {pgd,p4d}_populate_kernel() mm: move page table sync declarations to linux/pgtable.h proc: fix missing pde_set_flags() for net proc files mm: fix accounting of memmap pages mm/damon/core: prevent unnecessary overflow in damos_set_effective_quota() kexec: add KEXEC_FILE_NO_CMA as a legal flag kasan: fix GCC mem-intrinsic prefix with sw tags mm/kasan: avoid lazy MMU mode hazards mm/kasan: fix vmalloc shadow memory (de-)population races kunit: kasan_test: disable fortify string checker on kasan_strings() test selftests/mm: fix FORCE_READ to read input value correctly mm/userfaultfd: fix kmap_local LIFO ordering for CONFIG_HIGHPTE ocfs2: prevent release journal inode after journal shutdown rust: mm: mark VmaNew as transparent of_numa: fix uninitialized memory nodes causing kernel panic	2025-09-02 13:18:00 -07:00
Linus Torvalds	e3c94a539e	Merge tag 'for-6.17-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix a few races related to inode link count - fix inode leak on failure to add link to inode - move transaction aborts closer to where they happen * tag 'for-6.17-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: avoid load/store tearing races when checking if an inode was logged btrfs: fix race between setting last_dir_index_offset and inode logging btrfs: fix race between logging inode and checking if it was logged before btrfs: simplify error handling logic for btrfs_link() btrfs: fix inode leak on failure to add link to inode btrfs: abort transaction on failure to add link to inode	2025-09-02 13:13:22 -07:00
Omar Sandoval	f6a6c28005	btrfs: fix subvolume deletion lockup caused by inodes xarray race There is a race condition between inode eviction and inode caching that can cause a live struct btrfs_inode to be missing from the root->inodes xarray. Specifically, there is a window during evict() between the inode being unhashed and deleted from the xarray. If btrfs_iget() is called for the same inode in that window, it will be recreated and inserted into the xarray, but then eviction will delete the new entry, leaving nothing in the xarray: Thread 1 Thread 2 --------------------------------------------------------------- evict() remove_inode_hash() btrfs_iget_path() btrfs_iget_locked() btrfs_read_locked_inode() btrfs_add_inode_to_root() destroy_inode() btrfs_destroy_inode() btrfs_del_inode_from_root() __xa_erase In turn, this can cause issues for subvolume deletion. Specifically, if an inode is in this lost state, and all other inodes are evicted, then btrfs_del_inode_from_root() will call btrfs_add_dead_root() prematurely. If the lost inode has a delayed_node attached to it, then when btrfs_clean_one_deleted_snapshot() calls btrfs_kill_all_delayed_nodes(), it will loop forever because the delayed_nodes xarray will never become empty (unless memory pressure forces the inode out). We saw this manifest as soft lockups in production. Fix it by only deleting the xarray entry if it matches the given inode (using __xa_cmpxchg()). Fixes: `310b2f5d5a` ("btrfs: use an xarray to track open inodes in a root") Cc: stable@vger.kernel.org # 6.11+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Co-authored-by: Leo Martins <loemra.dev@gmail.com> Signed-off-by: Leo Martins <loemra.dev@gmail.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-02 20:45:30 +02:00
Qu Wenruo	9786531399	btrfs: fix corruption reading compressed range when block size is smaller than page size [BUG] With 64K page size (aarch64 with 64K page size config) and 4K btrfs block size, the following workload can easily lead to a corrupted read: mkfs.btrfs -f -s 4k $dev > /dev/null mount -o compress $dev $mnt xfs_io -f -c "pwrite -S 0xff 0 64k" $mnt/base > /dev/null echo "correct result:" od -Ad -t x1 $mnt/base xfs_io -f -c "reflink $mnt/base 32k 0 32k" \ -c "reflink $mnt/base 0 32k 32k" \ -c "pwrite -S 0xff 60k 4k" $mnt/new > /dev/null echo "incorrect result:" od -Ad -t x1 $mnt/new umount $mnt This shows the following result: correct result: 0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff * 0065536 incorrect result: 0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff * 0032768 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0061440 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff * 0065536 Notice the zero in the range [32K, 60K), which is incorrect. [CAUSE] With extra trace printk, it shows the following events during od: (some unrelated info removed like CPU and context) od-3457 btrfs_do_readpage: enter r/i=5/258 folio=0(65536) prev_em_start=0000000000000000 The "r/i" is indicating the root and inode number. In our case the file "new" is using ino 258 from fs tree (root 5). Here notice the @prev_em_start pointer is NULL. This means the btrfs_do_readpage() is called from btrfs_read_folio(), not from btrfs_readahead(). od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=0 got em start=0 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=4096 got em start=0 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=8192 got em start=0 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=12288 got em start=0 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=16384 got em start=0 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=20480 got em start=0 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=24576 got em start=0 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=28672 got em start=0 len=32768 These above 32K blocks will be read from the first half of the compressed data extent. od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=32768 got em start=32768 len=32768 Note here there is no btrfs_submit_compressed_read() call. Which is incorrect now. Although both extent maps at 0 and 32K are pointing to the same compressed data, their offsets are different thus can not be merged into the same read. So this means the compressed data read merge check is doing something wrong. od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=36864 got em start=32768 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=40960 got em start=32768 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=45056 got em start=32768 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=49152 got em start=32768 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=53248 got em start=32768 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=57344 got em start=32768 len=32768 od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=61440 skip uptodate od-3457 btrfs_submit_compressed_read: cb orig_bio: file off=0 len=61440 The function btrfs_submit_compressed_read() is only called at the end of folio read. The compressed bio will only have an extent map of range [0, 32K), but the original bio passed in is for the whole 64K folio. This will cause the decompression part to only fill the first 32K, leaving the rest untouched (aka, filled with zero). This incorrect compressed read merge leads to the above data corruption. There were similar problems that happened in the past, commit `808f80b467` ("Btrfs: update fix for read corruption of compressed and shared extents") is doing pretty much the same fix for readahead. But that's back to 2015, where btrfs still only supports bs (block size) == ps (page size) cases. This means btrfs_do_readpage() only needs to handle a folio which contains exactly one block. Only btrfs_readahead() can lead to a read covering multiple blocks. Thus only btrfs_readahead() passes a non-NULL @prev_em_start pointer. With v5.15 kernel btrfs introduced bs < ps support. This breaks the above assumption that a folio can only contain one block. Now btrfs_read_folio() can also read multiple blocks in one go. But btrfs_read_folio() doesn't pass a @prev_em_start pointer, thus the existing bio force submission check will never be triggered. In theory, this can also happen for btrfs with large folios, but since large folio is still experimental, we don't need to bother it, thus only bs < ps support is affected for now. [FIX] Instead of passing @prev_em_start to do the proper compressed extent check, introduce one new member, btrfs_bio_ctrl::last_em_start, so that the existing bio force submission logic will always be triggered. CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-02 20:45:25 +02:00
Calvin Owens	6db1df415d	btrfs: accept and ignore compression level for lzo The compression level is meaningless for lzo, but before commit `3f093ccb95` ("btrfs: harden parsing of compression mount options"), it was silently ignored if passed. After that commit, passing a level with lzo fails to mount: BTRFS error: unrecognized compression value lzo:1 It seems reasonable for users to expect that lzo would permit a numeric level option, as all the other algos do, even though the kernel's implementation of LZO currently only supports a single level. Because it has always worked to pass a level, it seems likely to me that users in the real world are relying on doing so. This patch restores the old behavior, giving "lzo:N" the same semantics as all of the other compression algos. To be clear, silly variants like "lzo:one", "lzo:the_first_option", or "lzo:armageddon" also used to work. This isn't meant to suggest that any possible mis-interpretation of mount options that once worked must continue to work forever. This is an exceptional case where it makes sense to preserve compatibility, both because the mis-interpretation is reasonable, and because nothing tangible is sacrificed. Finally update btrfs_show_options() to ignore the level of LZO, as it is only the default level without any extra meaning. Fixes: `3f093ccb95` ("btrfs: harden parsing of compression mount options") Reviewed-by: Daniel Vacek <neelx@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Calvin Owens <calvin@wbinvd.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-02 20:45:19 +02:00
Boris Burkov	de134cb54c	btrfs: fix squota compressed stats leak The following workload on a squota enabled fs: btrfs subvol create mnt/subvol # ensure subvol extents get accounted sync btrfs qgroup create 1/1 mnt btrfs qgroup assign mnt/subvol 1/1 mnt btrfs qgroup delete mnt/subvol # make the cleaner thread run btrfs filesystem sync mnt sleep 1 btrfs filesystem sync mnt btrfs qgroup destroy 1/1 mnt will fail with EBUSY. The reason is that 1/1 does the quick accounting when we assign subvol to it, gaining its exclusive usage as excl and excl_cmpr. But then when we delete subvol, the decrement happens via record_squota_delta() which does not update excl_cmpr, as squotas does not make any distinction between compressed and normal extents. Thus, we increment excl_cmpr but never decrement it, and are unable to delete 1/1. The two possible fixes are to make squota always mirror excl and excl_cmpr or to make the fast accounting separately track the plain and cmpr numbers. The latter felt cleaner to me so that is what I opted for. Fixes: `1e0e9d5771` ("btrfs: add helper for recording simple quota deltas") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-02 20:45:15 +02:00
Christian Brauner	e23654f5b1	Merge tag 'fuse-fixes-6.17-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse into vfs.fixes fuse fixes for 6.17-rc5 * tag 'fuse-fixes-6.17-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (6 commits) fuse: Block access to folio overlimit fuse: fix fuseblk i_blkbits for iomap partial writes fuse: reflect cached blocksize if blocksize was changed fuse: prevent overflow in copy_file_range return value fuse: check if copy_file_range() returns larger than requested size fuse: do not allow mapping a non-regular backing file Link: https://lore.kernel.org/CAJfpeguEVMMyw_zCb+hbOuSxdE2Z3Raw=SJsq=Y56Ae6dn2W3g@mail.gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-01 12:48:28 +02:00
Philipp Kerling	b5ee94ac65	ksmbd: allow a filename to contain colons on SMB3.1.1 posix extensions If the client sends SMB2_CREATE_POSIX_CONTEXT to ksmbd, allow the filename to contain a colon (':'). This requires disabling the support for Alternate Data Streams (ADS), which are denoted by a colon-separated suffix to the filename on Windows. This should not be an issue, since this concept is not known to POSIX anyway and the client has to explicitly request a POSIX context to get this behavior. Link: https://lore.kernel.org/all/f9401718e2be2ab22058b45a6817db912784ef61.camel@rx2.rx-server.de/ Signed-off-by: Philipp Kerling <pkerling@casix.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-31 17:48:38 -05:00
Wang Zhaolong	6976c7a69d	smb: client: Fix NULL pointer dereference in cifs_debug_dirs_proc_show() Reading /proc/fs/cifs/open_dirs may hit a NULL dereference when tcon->cfids is NULL. Add NULL check before accessing cfids to prevent the crash. Reproduction: - Mount CIFS share - cat /proc/fs/cifs/open_dirs Fixes: `844e5c0eb1` ("smb3 client: add way to show directory leases for improved debugging") Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-31 17:46:57 -05:00
Trond Myklebust	4fb2b677fc	NFSv4: Clear the NFS_CAP_XATTR flag if not supported by the server nfs_server_set_fsinfo() shouldn't assume that NFS_CAP_XATTR is unset on entry to the function. Fixes: `b78ef845c3` ("NFSv4.2: query the server for extended attribute support") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-08-29 12:56:43 -04:00
Trond Myklebust	b3ac334360	NFSv4: Clear NFS_CAP_OPEN_XOR and NFS_CAP_DELEGTIME if not supported _nfs4_server_capabilities() should clear capabilities that are not supported by the server. Fixes: `d2a00cceb9` ("NFSv4: Detect support for OPEN4_SHARE_ACCESS_WANT_OPEN_XOR_DELEGATION") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-08-29 12:56:43 -04:00
Trond Myklebust	dd5a8621b8	NFSv4: Clear the NFS_CAP_FS_LOCATIONS flag if it is not set _nfs4_server_capabilities() is expected to clear any flags that are not supported by the server. Fixes: `8a59bb93b7` ("NFSv4 store server support for fs_location attribute") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-08-29 12:56:43 -04:00
Trond Myklebust	31f1a960ad	NFSv4: Don't clear capabilities that won't be reset Don't clear the capabilities that are not going to get reset by the call to _nfs4_server_capabilities(). Reported-by: Scott Haiden <scott.b.haiden@gmail.com> Fixes: `b01f21cacd` ("NFS: Fix the setting of capabilities when automounting a new filesystem") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-08-29 12:56:27 -04:00
Linus Torvalds	fb679c832b	Merge tag 'efi-fixes-for-v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Assorted fixes for the OP-TEE based pseudo-EFI variable store - Fix for an OOB access when looking up the same non-existing efivarfs entry multiple times in parallel * tag 'efi-fixes-for-v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare efi: stmm: Drop unneeded null pointer check efi: stmm: Drop unused EFI error from setup_mm_hdr arguments efi: stmm: Do not return EFI_OUT_OF_RESOURCES on internal errors efi: stmm: Fix incorrect buffer allocation method	2025-08-29 09:15:46 -07:00
Linus Torvalds	2575e638e2	Merge tag 'v6.17-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - Fix possible refcount leak in compound operations - Fix remap_file_range() return code mapping, found by generic/157 * tag 'v6.17-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: fs/smb: Fix inconsistent refcnt update smb3 client: fix return code mapping of remap_file_range	2025-08-29 08:51:34 -07:00
Linus Torvalds	469447200a	Merge tag 'xfs-fixes-6.17-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: "The highlight I'd like to point here is related to the XFS_RT Kconfig, which has been updated to be enabled by default now if CONFIG_BLK_DEV_ZONED is enabled. This also contains a few fixes for zoned devices support in XFS, specially related to swapon requests in inodes belonging to the zoned FS. A null-ptr dereference fix in the xattr data, due to a mishandling of medium errors generated by block devices is also included" * tag 'xfs-fixes-6.17-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: do not propagate ENODATA disk errors into xattr code xfs: reject swapon for inodes on a zoned file system earlier xfs: kick off inodegc when failing to reserve zoned blocks xfs: remove xfs_last_used_zone xfs: Default XFS_RT to Y if CONFIG_BLK_DEV_ZONED is enabled	2025-08-29 08:09:34 -07:00
Amir Goldstein	bb585591eb	fhandle: use more consistent rules for decoding file handle from userns Commit `620c266f39` ("fhandle: relax open_by_handle_at() permission checks") relaxed the coditions for decoding a file handle from non init userns. The conditions are that that decoded dentry is accessible from the user provided mountfd (or to fs root) and that all the ancestors along the path have a valid id mapping in the userns. These conditions are intentionally more strict than the condition that the decoded dentry should be "lookable" by path from the mountfd. For example, the path /home/amir/dir/subdir is lookable by path from unpriv userns of user amir, because /home perms is 755, but the owner of /home does not have a valid id mapping in unpriv userns of user amir. The current code did not check that the decoded dentry itself has a valid id mapping in the userns. There is no security risk in that, because that final open still performs the needed permission checks, but this is inconsistent with the checks performed on the ancestors, so the behavior can be a bit confusing. Add the check for the decoded dentry itself, so that the entire path, including the last component has a valid id mapping in the userns. Fixes: `620c266f39` ("fhandle: relax open_by_handle_at() permission checks") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/20250827194309.1259650-1-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-08-29 09:48:31 +02:00
Li Nan	a6358f8cf6	efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare Observed on kernel 6.6 (present on master as well): BUG: KASAN: slab-out-of-bounds in memcmp+0x98/0xd0 Call trace: kasan_check_range+0xe8/0x190 __asan_loadN+0x1c/0x28 memcmp+0x98/0xd0 efivarfs_d_compare+0x68/0xd8 __d_lookup_rcu_op_compare+0x178/0x218 __d_lookup_rcu+0x1f8/0x228 d_alloc_parallel+0x150/0x648 lookup_open.isra.0+0x5f0/0x8d0 open_last_lookups+0x264/0x828 path_openat+0x130/0x3f8 do_filp_open+0x114/0x248 do_sys_openat2+0x340/0x3c0 __arm64_sys_openat+0x120/0x1a0 If dentry->d_name.len < EFI_VARIABLE_GUID_LEN , 'guid' can become negative, leadings to oob. The issue can be triggered by parallel lookups using invalid filename: T1 T2 lookup_open ->lookup simple_lookup d_add // invalid dentry is added to hash list lookup_open d_alloc_parallel __d_lookup_rcu __d_lookup_rcu_op_compare hlist_bl_for_each_entry_rcu // invalid dentry can be retrieved ->d_compare efivarfs_d_compare // oob Fix it by checking 'guid' before cmp. Fixes: `da27a24383` ("efivarfs: guid part of filenames are case-insensitive") Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2025-08-28 08:39:49 +02:00
wangzijie	2ce3d282bd	proc: fix missing pde_set_flags() for net proc files To avoid potential UAF issues during module removal races, we use pde_set_flags() to save proc_ops flags in PDE itself before proc_register(), and then use pde_has_proc_() helpers instead of directly dereferencing pde->proc_ops->. However, the pde_set_flags() call was missing when creating net related proc files. This omission caused incorrect behavior which FMODE_LSEEK was being cleared inappropriately in proc_reg_open() for net proc files. Lars reported it in this link[1]. Fix this by ensuring pde_set_flags() is called when register proc entry, and add NULL check for proc_ops in pde_set_flags(). [wangzijie1@honor.com: stash pde->proc_ops in a local const variable, per Christian] Link: https://lkml.kernel.org/r/20250821105806.1453833-1-wangzijie1@honor.com Link: https://lkml.kernel.org/r/20250818123102.959595-1-wangzijie1@honor.com Link: https://lore.kernel.org/all/20250815195616.64497967@chagall.paradoxon.rec/ [1] Fixes: `ff7ec8dc1b` ("proc: use the same treatment to check proc_lseek as ones for proc_read_iter et.al") Signed-off-by: wangzijie <wangzijie1@honor.com> Reported-by: Lars Wendler <polynomial-c@gmx.de> Tested-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Petr Vaněk <pv@excello.cz> Tested by: Lars Wendler <polynomial-c@gmx.de> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Kirill A. Shutemov <k.shutemov@gmail.com> Cc: wangzijie <wangzijie1@honor.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-08-27 22:45:44 -07:00
Edward Adam Davis	f46e8ef8bb	ocfs2: prevent release journal inode after journal shutdown Before calling ocfs2_delete_osb(), ocfs2_journal_shutdown() has already been executed in ocfs2_dismount_volume(), so osb->journal must be NULL. Therefore, the following calltrace will inevitably fail when it reaches jbd2_journal_release_jbd_inode(). ocfs2_dismount_volume()-> ocfs2_delete_osb()-> ocfs2_free_slot_info()-> __ocfs2_free_slot_info()-> evict()-> ocfs2_evict_inode()-> ocfs2_clear_inode()-> jbd2_journal_release_jbd_inode(osb->journal->j_journal, Adding osb->journal checks will prevent null-ptr-deref during the above execution path. Link: https://lkml.kernel.org/r/tencent_357489BEAEE4AED74CBD67D246DBD2C4C606@qq.com Fixes: `da5e7c8782` ("ocfs2: cleanup journal init and shutdown") Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reported-by: syzbot+47d8cb2f2cc1517e515a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=47d8cb2f2cc1517e515a Tested-by: syzbot+47d8cb2f2cc1517e515a@syzkaller.appspotmail.com Reviewed-by: Mark Tinguely <mark.tinguely@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-08-27 22:45:41 -07:00
Shuhao Fu	ab529e6ca1	fs/smb: Fix inconsistent refcnt update A possible inconsistent update of refcount was identified in `smb2_compound_op`. Such inconsistent update could lead to possible resource leaks. Why it is a possible bug: 1. In the comment section of the function, it clearly states that the reference to `cfile` should be dropped after calling this function. 2. Every control flow path would check and drop the reference to `cfile`, except the patched one. 3. Existing callers would not handle refcount update of `cfile` if -ENOMEM is returned. To fix the bug, an extra goto label "out" is added, to make sure that the cleanup logic would always be respected. As the problem is caused by the allocation failure of `vars`, the cleanup logic between label "finished" and "out" can be safely ignored. According to the definition of function `is_replayable_error`, the error code of "-ENOMEM" is not recoverable. Therefore, the replay logic also gets ignored. Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-27 14:59:06 -05:00
Edward Adam Davis	9d81ba6d49	fuse: Block access to folio overlimit syz reported a slab-out-of-bounds Write in fuse_dev_do_write. When the number of bytes to be retrieved is truncated to the upper limit by fc->max_pages and there is an offset, the oob is triggered. Add a loop termination condition to prevent overruns. Fixes: `3568a95693` ("fuse: support large folios for retrieves") Reported-by: syzbot+2d215d165f9354b9c4ea@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2d215d165f9354b9c4ea Tested-by: syzbot+2d215d165f9354b9c4ea@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reviewed-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2025-08-27 14:29:43 +02:00
Joanne Koong	bd24d2108e	fuse: fix fuseblk i_blkbits for iomap partial writes On regular fuse filesystems, i_blkbits is set to PAGE_SHIFT which means any iomap partial writes will mark the entire folio as uptodate. However fuseblk filesystems work differently and allow the blocksize to be less than the page size. As such, this may lead to data corruption if fuseblk sets its blocksize to less than the page size, uses the writeback cache, and does a partial write, then a read and the read happens before the write has undergone writeback, since the folio will not be marked uptodate from the partial write so the read will read in the entire folio from disk, which will overwrite the partial write. The long-term solution for this, which will also be needed for fuse to enable large folios with the writeback cache on, is to have fuse also use iomap for folio reads, but until that is done, the cleanest workaround is to use the page size for fuseblk's internal kernel inode blksize/blkbits values while maintaining current behavior for stat(). This was verified using ntfs-3g: $ sudo mkfs.ntfs -f -c 512 /dev/vdd1 $ sudo ntfs-3g /dev/vdd1 ~/fuseblk $ stat ~/fuseblk/hi.txt IO Block: 512 Fixes: `a4c9ab1d49` ("fuse: use iomap for buffered writes") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2025-08-26 12:43:31 +02:00
Joanne Koong	7956994650	fuse: reflect cached blocksize if blocksize was changed As pointed out by Miklos[1], in the fuse_update_get_attr() path, the attributes returned to stat may be cached values instead of fresh ones fetched from the server. In the case where the server returned a modified blocksize value, we need to cache it and reflect it back to stat if values are not re-fetched since we now no longer directly change inode->i_blkbits. Link: https://lore.kernel.org/linux-fsdevel/CAJfpeguCOxeVX88_zPd1hqziB_C+tmfuDhZP5qO2nKmnb-dTUA@mail.gmail.com/ [1] Fixes: `542ede096e` ("fuse: keep inode->i_blkbits constant") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2025-08-26 12:43:31 +02:00

1 2 3 4 5 ...

100840 Commits