linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 11:25:22 -04:00

Author	SHA1	Message	Date
Gao Xiang	2a13fc417f	erofs: consolidate z_erofs_extent_lookback() The initial m.delta[0] also needs to be checked against zero. In addition, also drop the redundant logic that errors out for lcn == 0 / m.delta[0] == 1 case. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-10-22 07:54:31 +08:00
Gao Xiang	e13d315ae0	erofs: avoid infinite loops due to corrupted subpage compact indexes Robert reported an infinite loop observed by two crafted images. The root cause is that `clusterofs` can be larger than `lclustersize` for !NONHEAD `lclusters` in corrupted subpage compact indexes, e.g.: blocksize = lclustersize = 512 lcn = 6 clusterofs = 515 Move the corresponding check for full compress indexes to `z_erofs_load_lcluster_from_disk()` to also cover subpage compact compress indexes. It also fixes the position of `m->type >= Z_EROFS_LCLUSTER_TYPE_MAX` check, since it should be placed right after `z_erofs_load_{compact,full}_lcluster()`. Fixes: `8d2517aaee` ("erofs: fix up compacted indexes for block size < 4096") Fixes: `1a5223c182` ("erofs: do sanity check on m->type in z_erofs_load_compact_lcluster()") Reported-by: Robert Morris <rtm@csail.mit.edu> Closes: https://lore.kernel.org/r/35167.1760645886@localhost Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-10-22 07:54:11 +08:00
Deepanshu Kartikey	cec944dd32	hugetlbfs: move lock assertions after early returns in huge_pmd_unshare() When hugetlb_vmdelete_list() processes VMAs during truncate operations, it may encounter VMAs where huge_pmd_unshare() is called without the required shareable lock. This triggers an assertion failure in hugetlb_vma_assert_locked(). The previous fix in commit `dd83609b88` ("hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list") skipped entire VMAs without shareable locks to avoid the assertion. However, this prevented pages from being unmapped and freed, causing a regression in fallocate(PUNCH_HOLE) operations where pages were not freed immediately, as reported by Mark Brown. Instead of checking locks in the caller or skipping VMAs, move the lock assertions in huge_pmd_unshare() to after the early return checks. The assertions are only needed when actual PMD unsharing work will be performed. If the function returns early because sz != PMD_SIZE or the PMD is not shared, no locks are required and assertions should not fire. This approach reverts the VMA skipping logic from commit `dd83609b88` ("hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list") while moving the assertions to avoid the assertion failure, keeping all the logic within huge_pmd_unshare() itself and allowing page unmapping and freeing to proceed for all VMAs. Link: https://lkml.kernel.org/r/20251014113344.21194-1-kartikey406@gmail.com Fixes: `dd83609b88` ("hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list") Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: <syzbot+f26d7c75c26ec19790e7@syzkaller.appspotmail.com> Reported-by: Mark Brown <broonie@kernel.org> Closes: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7 Suggested-by: David Hildenbrand <david@redhat.com> Suggested-by: Oscar Salvador <osalvador@suse.de> Tested-by: <syzbot+f26d7c75c26ec19790e7@syzkaller.appspotmail.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-10-21 15:46:17 -07:00
Chuck Lever	3e7f011c25	Revert "NFSD: Remove the cap on number of operations per NFSv4 COMPOUND" I've found that pynfs COMP6 now leaves the connection or lease in a strange state, which causes CLOSE9 to hang indefinitely. I've dug into it a little, but I haven't been able to root-cause it yet. However, I bisected to commit `48aab1606f` ("NFSD: Remove the cap on number of operations per NFSv4 COMPOUND"). Tianshuo Han also reports a potential vulnerability when decoding an NFSv4 COMPOUND. An attacker can place an arbitrarily large op count in the COMPOUND header, which results in: [ 51.410584] nfsd: vmalloc error: size 1209533382144, exceeds total pages, mode:0xdc0(GFP_KERNEL\|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0 when NFSD attempts to allocate the COMPOUND op array. Let's restore the operation-per-COMPOUND limit, but increased to 200 for now. Reported-by: tianshuo han <hantianshuo233@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@vger.kernel.org Tested-by: Tianshuo Han <hantianshuo233@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-10-21 11:03:50 -04:00
Nathan Chancellor	29cdfb4950	nfsd: Avoid strlen conflict in nfsd4_encode_components_esc() There is an error building nfs4xdr.c with CONFIG_SUNRPC_DEBUG_TRACE=y and CONFIG_FORTIFY_SOURCE=n due to the local variable strlen conflicting with the function strlen(): In file included from include/linux/cpumask.h:11, from arch/x86/include/asm/paravirt.h:21, from arch/x86/include/asm/irqflags.h:102, from include/linux/irqflags.h:18, from include/linux/spinlock.h:59, from include/linux/mmzone.h:8, from include/linux/gfp.h:7, from include/linux/slab.h:16, from fs/nfsd/nfs4xdr.c:37: fs/nfsd/nfs4xdr.c: In function 'nfsd4_encode_components_esc': include/linux/kernel.h:321:46: error: called object 'strlen' is not a function or function pointer 321 \| __trace_puts(_THIS_IP_, str, strlen(str)); \ \| ^~~~~~ include/linux/kernel.h:265:17: note: in expansion of macro 'trace_puts' 265 \| trace_puts(fmt); \ \| ^~~~~~~~~~ include/linux/sunrpc/debug.h:34:41: note: in expansion of macro 'trace_printk' 34 \| # define __sunrpc_printk(fmt, ...) trace_printk(fmt, ##__VA_ARGS__) \| ^~~~~~~~~~~~ include/linux/sunrpc/debug.h:42:17: note: in expansion of macro '__sunrpc_printk' 42 \| __sunrpc_printk(fmt, ##__VA_ARGS__); \ \| ^~~~~~~~~~~~~~~ include/linux/sunrpc/debug.h:25:9: note: in expansion of macro 'dfprintk' 25 \| dfprintk(FACILITY, fmt, ##__VA_ARGS__) \| ^~~~~~~~ fs/nfsd/nfs4xdr.c:2646:9: note: in expansion of macro 'dprintk' 2646 \| dprintk("nfsd4_encode_components(%s)\n", components); \| ^~~~~~~ fs/nfsd/nfs4xdr.c:2643:13: note: declared here 2643 \| int strlen, count=0; \| ^~~~~~ This dprintk() instance is not particularly useful, so just remove it altogether to get rid of the immediate strlen() conflict. At the same time, eliminate the local strlen variable to avoid potential conflicts with strlen() in the future. Fixes: `ec7d8e68ef` ("sunrpc: add a Kconfig option to redirect dfprintk() output to trace buffer") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-10-21 11:03:19 -04:00
Chuck Lever	abb1f08a21	NFSD: Fix crash in nfsd4_read_release() When tracing is enabled, the trace_nfsd_read_done trace point crashes during the pynfs read.testNoFh test. Fixes: `15a8b55dbb` ("nfsd: call op_release, even when op_func returns an error") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-10-21 11:03:19 -04:00
Chuck Lever	4f76435fd5	NFSD: Define actions for the new time_deleg FATTR4 attributes NFSv4 clients won't send legitimate GETATTR requests for these new attributes because they are intended to be used only with CB_GETATTR and SETATTR. But NFSD has to do something besides crashing if it ever sees a GETATTR request that queries these attributes. RFC 8881 Section 18.7.3 states: > The server MUST return a value for each attribute that the client > requests if the attribute is supported by the server for the > target file system. If the server does not support a particular > attribute on the target file system, then it MUST NOT return the > attribute value and MUST NOT set the attribute bit in the result > bitmap. The server MUST return an error if it supports an > attribute on the target but cannot obtain its value. In that case, > no attribute values will be returned. Further, RFC 9754 Section 5 states: > These new attributes are invalid to be used with GETATTR, VERIFY, > and NVERIFY, and they can only be used with CB_GETATTR and SETATTR > by a client holding an appropriate delegation. Thus there does not appear to be a specific server response mandated by specification. Taking the guidance that querying these attributes via GETATTR is "invalid", NFSD will return nfserr_inval, failing the request entirely. Reported-by: Robert Morris <rtm@csail.mit.edu> Closes: https://lore.kernel.org/linux-nfs/7819419cf0cb50d8130dc6b747765d2b8febc88a.camel@kernel.org/T/#t Fixes: `51c0d4f7e3` ("nfsd: add support for FATTR4_OPEN_ARGUMENTS") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-10-21 11:03:19 -04:00
Christoph Hellwig	0f41997b1b	xfs: don't use __GFP_NOFAIL in xfs_init_fs_context With enough debug options enabled, struct xfs_mount is larger than 4k and thus NOFAIL allocations won't work for it. xfs_init_fs_context is early in the mount process, and if we really are out of memory there we'd better give up ASAP anyway. Fixes: `7b77b46a61` ("xfs: use kmem functions for struct xfs_mount") Reported-by: syzbot+359a67b608de1ef72f65@syzkaller.appspotmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-10-21 11:32:50 +02:00
Christoph Hellwig	ca3d643a97	xfs: cache open zone in inode->i_private The MRU cache for open zones is unfortunately still not ideal, as it can time out pretty easily when doing heavy I/O to hard disks using up most or all open zones. One option would be to just increase the timeout, but while looking into that I realized we're just better off caching it indefinitely as there is no real downside to that once we don't hold a reference to the cache open zone. So switch the open zone to RCU freeing, and then stash the last used open zone into inode->i_private. This helps to significantly reduce fragmentation by keeping I/O localized to zones for workloads that write using many open files to HDD. Fixes: `4e4d520755` ("xfs: add the zoned space allocator") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-10-21 11:32:50 +02:00
Christoph Hellwig	a8c861f401	xfs: avoid busy loops in GCD When GCD has no new work to handle, but read, write or reset commands are outstanding, it currently busy loops, which is a bit suboptimal, and can lead to softlockup warnings in case of stuck commands. Change the code so that the task state is only set to running when work is performed, which looks a bit tricky due to the design of the reading/writing/resetting lists that contain both in-flight and finished commands. Fixes: `080d01c41d` ("xfs: implement zoned garbage collection") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-10-21 11:32:50 +02:00
Geert Uytterhoeven	f5caeb3689	xfs: XFS_ONLINE_SCRUB_STATS should depend on DEBUG_FS Currently, XFS_ONLINE_SCRUB_STATS selects DEBUG_FS. However, DEBUG_FS is meant for debugging, and people may want to disable it on production systems. Since commit `0ff51a1fd7` ("xfs: enable online fsck by default in Kconfig")), XFS_ONLINE_SCRUB_STATS is enabled by default, forcing DEBUG_FS enabled too. Fix this by replacing the selection of DEBUG_FS by a dependency on DEBUG_FS, which is what most other options controlling the gathering and exposing of statistics do. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-10-21 09:52:59 +02:00
Damien Le Moal	b00bcb190e	xfs: do not tightly pack-write large files When using a zoned realtime device, tightly packing of data blocks belonging to multiple closed files into the same realtime group (RTG) is very efficient at improving write performance. This is especially true with SMR HDDs as this can reduce, and even suppress, disk head seeks. However, such tight packing does not make sense for large files that require at least a full RTG. If tight packing placement is applied for such files, the VM writeback thread switching between inodes result in the large files to be fragmented, thus increasing the garbage collection penalty later when the RTG needs to be reclaimed. This problem can be avoided with a simple heuristic: if the size of the inode being written back is at least equal to the RTG size, do not use tight-packing. Modify xfs_zoned_pack_tight() to always return false in this case. With this change, a multi-writer workload writing files of 256 MB on a file system backed by an SMR HDD with 256 MB zone size as a realtime device sees all files occupying exactly one RTG (i.e. one device zone), thus completely removing the heavy fragmentation observed without this change. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-10-21 09:49:39 +02:00
Damien Le Moal	914f377075	xfs: Improve CONFIG_XFS_RT Kconfig help Improve the description of the XFS_RT configuration option to document that this option is required for zoned block devices. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-10-21 09:48:31 +02:00
David Howells	5da6fb6356	cifs: Add a couple of missing smb3_rw_credits tracepoints Add missing smb3_rw_credits tracepoints to cifs_readv_callback() (for SMB1) to match those of SMB2/3. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-20 16:48:05 -05:00
Linus Torvalds	380cb5d353	Merge tag 'fsnotify_for_v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fixes from Jan Kara: - Stop-gap solution for a race between unmount of a filesystem with fsnotify marks and someone inspecting fdinfo of fsnotify group with those marks in procfs. A proper solution is in the works but it will get a while to settle. - Fix for non-decodable file handles (used by unprivileged apps using fanotify) * tag 'fsnotify_for_v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs/notify: call exportfs_encode_fid with s_umount expfs: Fix exportfs_can_encode_fh() for EXPORT_FH_FID	2025-10-20 09:35:13 -10:00
Babu Moger	19de7113bf	x86,fs/resctrl: Fix NULL pointer dereference with events force-disabled in mbm_event mode The following NULL pointer dereference is encountered on mount of resctrl fs after booting a system that supports assignable counters with the "rdt=!mbmtotal,!mbmlocal" kernel parameters: BUG: kernel NULL pointer dereference, address: 0000000000000008 RIP: 0010:mbm_cntr_get Call Trace: rdtgroup_assign_cntr_event rdtgroup_assign_cntrs rdt_get_tree Specifying the kernel parameter "rdt=!mbmtotal,!mbmlocal" effectively disables the legacy X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features and the MBM events they represent. This results in the per-domain MBM event related data structures to not be allocated during early initialization. resctrl fs initialization follows by implicitly enabling both MBM total and local events on a system that supports assignable counters (mbm_event mode), but this enabling occurs after the per-domain data structures have been created. After booting, resctrl fs assumes that an enabled event can access all its state. This results in NULL pointer dereference when resctrl attempts to access the un-allocated structures of an enabled event. Remove the late MBM event enabling from resctrl fs. This leaves a problem where the X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features may be disabled while assignable counter (mbm_event) mode is enabled without any events to support. Switching between the "default" and "mbm_event" mode without any events is not practical. Create a dependency between the X86_FEATURE_{CQM_MBM_TOTAL,CQM_MBM_LOCAL} and X86_FEATURE_ABMC (assignable counter) hardware features. An x86 system that supports assignable counters now requires support of X86_FEATURE_CQM_MBM_TOTAL or X86_FEATURE_CQM_MBM_LOCAL. This ensures all needed MBM related data structures are created before use and that it is only possible to switch between "default" and "mbm_event" mode when the same events are available in both modes. This dependency does not exist in the hardware but this usage of these feature settings work for known systems. [ bp: Massage commit message. ] Fixes: `13390861b4` ("x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details") Co-developed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://patch.msgid.link/a62e6ac063d0693475615edd213d5be5e55443e6.1760560934.git.babu.moger@amd.com	2025-10-20 18:06:31 +02:00
Stefan Metzmacher	e607ef686a	smb: client: allocate enough space for MR WRs and ib_drain_qp() The IB_WR_REG_MR and IB_WR_LOCAL_INV operations for smbdirect_mr_io structures should never fail because the submission or completion queues are too small. So we allocate more send_wr depending on the (local) max number of MRs. While there also add additional space for ib_drain_qp(). This should make sure ib_post_send() will never fail because the submission queue is full. Fixes: `f198186aa9` ("CIFS: SMBD: Establish SMB Direct connection") Fixes: `cc55f65dd3` ("smb: client: make use of common smbdirect_socket_parameters") Cc: stable@vger.kernel.org Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-19 20:59:38 -05:00
Linus Torvalds	847f242f7a	Merge tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat fixes from Namjae Jeon: - Fix out-of-bounds in FS_IOC_SETFSLABEL - Add validation for stream entry size to prevent infinite loop * tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix out-of-bounds in exfat_nls_to_ucs2() exfat: fix improper check of dentry.stream.valid_size	2025-10-18 07:23:59 -10:00
Linus Torvalds	2d07c6c209	Merge tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client fixes from Anna Schumaker: - Fix for FlexFiles mirror->dss allocation - Apply delay_retrans to async operations - Check if suid/sgid is cleared after a write when needed - Fix setting the state renewal timer for early mounts after a reboot * tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS4: Fix state renewals missing after boot NFS: check if suid/sgid was cleared after a write as needed NFS4: Apply delay_retrans to async operations NFSv4/flexfiles: fix to allocate mirror->dss before use	2025-10-18 07:18:48 -10:00
Linus Torvalds	4ccb3a8000	Merge tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: "smb client fixes, security and smbdirect improvements, and some minor cleanup: - Important OOB DFS fix - Fix various potential tcon refcount leaks - smbdirect (RDMA) fixes (following up from test event a few weeks ago): - Fixes to improve and simplify handling of memory lifetime of smbdirect_mr_io structures, when a connection gets disconnected - Make sure we really wait to reach SMBDIRECT_SOCKET_DISCONNECTED before destroying resources - Make sure the send/recv submission/completion queues are large enough to avoid ib_post_send() from failing under pressure - convert cifs.ko to use the recommended crypto libraries (instead of crypto_shash), this also can improve performance - Three small cleanup patches" * tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (24 commits) smb: client: Consolidate cmac(aes) shash allocation smb: client: Remove obsolete crypto_shash allocations smb: client: Use HMAC-MD5 library for NTLMv2 smb: client: Use MD5 library for SMB1 signature calculation smb: client: Use MD5 library for M-F symlink hashing smb: client: Use HMAC-SHA256 library for SMB2 signature calculation smb: client: Use HMAC-SHA256 library for key generation smb: client: Use SHA-512 library for SMB3.1.1 preauth hash cifs: parse_dfs_referrals: prevent oob on malformed input smb: client: Fix refcount leak for cifs_sb_tlink smb: client: let smbd_destroy() wait for SMBDIRECT_SOCKET_DISCONNECTED smb: move some duplicate definitions to common/cifsglob.h smb: client: let destroy_mr_list() keep smbdirect_mr_io memory if registered smb: client: let destroy_mr_list() call ib_dereg_mr() before ib_dma_unmap_sg() smb: client: call ib_dma_unmap_sg if mr->sgt.nents is not 0 smb: client: improve logic in smbd_deregister_mr() smb: client: improve logic in smbd_register_mr() smb: client: improve logic in allocate_mr_list() smb: client: let destroy_mr_list() remove locked from the list smb: client: let destroy_mr_list() call list_del(&mr->list) ...	2025-10-18 07:11:32 -10:00
Ting-Chang Hou	1fabe43b4e	btrfs: send: fix duplicated rmdir operations when using extrefs Commit `29d6d30f5c` ("Btrfs: send, don't send rmdir for same target multiple times") has fixed an issue that a send stream contained a rmdir operation for the same directory multiple times. After that fix we keep track of the last directory for which we sent a rmdir operation and compare with it before sending a rmdir for the parent inode of a deleted hardlink we are processing. But there is still a corner case that in between rmdir dir operations for the same inode we find deleted hardlinks for other parent inodes, so tracking just the last inode for which we sent a rmdir operation is not enough. Hardlinks of a file in the same directory are stored in the same INODE_REF item, but if the number of hardlinks is too large and can not fit in a leaf, we use INODE_EXTREF items to store them. The key of an INODE_EXTREF item is (inode_id, INODE_EXTREF, hash[name, parent ino]), so between two hardlinks for the same parent directory, we can find others for other parent directories. For example for the reproducer below we get the following (from a btrfs inspect-internal dump-tree output): item 0 key (259 INODE_EXTREF 2309449) itemoff 16257 itemsize 26 index 6925 parent 257 namelen 8 name: foo.6923 item 1 key (259 INODE_EXTREF 2311350) itemoff 16231 itemsize 26 index 6588 parent 258 namelen 8 name: foo.6587 item 2 key (259 INODE_EXTREF 2457395) itemoff 16205 itemsize 26 index 6611 parent 257 namelen 8 name: foo.6609 (...) So tracking the last directory's inode number does not work in this case since we process a link for parent inode 257, then for 258 and then back again for 257, and that second time we process a deleted link for 257 we think we have not yet sent a rmdir operation. Fix this by using a rbtree to keep track of all the directories for which we have already sent rmdir operations, and add those directories to the 'check_dirs' ref list in process_recorded_refs() only if the directory is not yet in the rbtree, otherwise skip it since it means we have already sent a rmdir operation for that directory. The following test script reproduces the problem: $ cat test.sh #!/bin/bash DEV=/dev/sdi MNT=/mnt/sdi mkfs.btrfs -f $DEV mount $DEV $MNT mkdir $MNT/a $MNT/b echo 123 > $MNT/a/foo for ((i = 1; i <= 1000; i++)); do ln $MNT/a/foo $MNT/a/foo.$i ln $MNT/a/foo $MNT/b/foo.$i done btrfs subvolume snapshot -r $MNT $MNT/snap1 btrfs send $MNT/snap1 -f /tmp/base.send rm -r $MNT/a $MNT/b btrfs subvolume snapshot -r $MNT $MNT/snap2 btrfs send -p $MNT/snap1 $MNT/snap2 -f /tmp/incremental.send umount $MNT mkfs.btrfs -f $DEV mount $DEV $MNT btrfs receive $MNT -f /tmp/base.send btrfs receive $MNT -f /tmp/incremental.send rm -f /tmp/base.send /tmp/incremental.send umount $MNT When running it, it fails like this: $ ./test.sh (...) At subvol snap1 At snapshot snap2 ERROR: rmdir o257-9-0 failed: No such file or directory CC: <stable@vger.kernel.org> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Ting-Chang Hou <tchou@synology.com> [ Updated changelog ] Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-10-17 18:33:34 +02:00
Dewei Meng	17679ac6df	btrfs: directly free partially initialized fs_info in btrfs_check_leaked_roots() If fs_info->super_copy or fs_info->super_for_commit allocated failed in btrfs_get_tree_subvol(), then no need to call btrfs_free_fs_info(). Otherwise btrfs_check_leaked_roots() would access NULL pointer because fs_info->allocated_roots had not been initialised. syzkaller reported the following information: ------------[ cut here ]------------ BUG: unable to handle page fault for address: fffffffffffffbb0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 64c9067 P4D 64c9067 PUD 64cb067 PMD 0 Oops: Oops: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 1402 Comm: syz.1.35 Not tainted 6.15.8 #4 PREEMPT(lazy) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), (...) RIP: 0010:arch_atomic_read arch/x86/include/asm/atomic.h:23 [inline] RIP: 0010:raw_atomic_read include/linux/atomic/atomic-arch-fallback.h:457 [inline] RIP: 0010:atomic_read include/linux/atomic/atomic-instrumented.h:33 [inline] RIP: 0010:refcount_read include/linux/refcount.h:170 [inline] RIP: 0010:btrfs_check_leaked_roots+0x18f/0x2c0 fs/btrfs/disk-io.c:1230 [...] Call Trace: <TASK> btrfs_free_fs_info+0x310/0x410 fs/btrfs/disk-io.c:1280 btrfs_get_tree_subvol+0x592/0x6b0 fs/btrfs/super.c:2029 btrfs_get_tree+0x63/0x80 fs/btrfs/super.c:2097 vfs_get_tree+0x98/0x320 fs/super.c:1759 do_new_mount+0x357/0x660 fs/namespace.c:3899 path_mount+0x716/0x19c0 fs/namespace.c:4226 do_mount fs/namespace.c:4239 [inline] __do_sys_mount fs/namespace.c:4450 [inline] __se_sys_mount fs/namespace.c:4427 [inline] __x64_sys_mount+0x28c/0x310 fs/namespace.c:4427 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x92/0x180 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f032eaffa8d [...] Fixes: `3bb17a25bc` ("btrfs: add get_tree callback for new mount API") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Daniel Vacek <neelx@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Dewei Meng <mengdewei@cqsoftware.com.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-10-17 18:33:27 +02:00
Fernando Fernandez Mancera	c7fbb8218b	sysfs: check visibility before changing group attribute ownership Since commit `0c17270f9b` ("net: sysfs: Implement is_visible for phys_(port_id, port_name, switch_id)"), __dev_change_net_namespace() can hit WARN_ON() when trying to change owner of a file that isn't visible. See the trace below: WARNING: CPU: 6 PID: 2938 at net/core/dev.c:12410 __dev_change_net_namespace+0xb89/0xc30 CPU: 6 UID: 0 PID: 2938 Comm: incusd Not tainted 6.17.1-1-mainline #1 PREEMPT(full) 4b783b4a638669fb644857f484487d17cb45ed1f Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.07 02/19/2025 RIP: 0010:__dev_change_net_namespace+0xb89/0xc30 [...] Call Trace: <TASK> ? if6_seq_show+0x30/0x50 do_setlink.isra.0+0xc7/0x1270 ? __nla_validate_parse+0x5c/0xcc0 ? security_capable+0x94/0x1a0 rtnl_newlink+0x858/0xc20 ? update_curr+0x8e/0x1c0 ? update_entity_lag+0x71/0x80 ? sched_balance_newidle+0x358/0x450 ? psi_task_switch+0x113/0x2a0 ? __pfx_rtnl_newlink+0x10/0x10 rtnetlink_rcv_msg+0x346/0x3e0 ? sched_clock+0x10/0x30 ? __pfx_rtnetlink_rcv_msg+0x10/0x10 netlink_rcv_skb+0x59/0x110 netlink_unicast+0x285/0x3c0 ? __alloc_skb+0xdb/0x1a0 netlink_sendmsg+0x20d/0x430 ____sys_sendmsg+0x39f/0x3d0 ? import_iovec+0x2f/0x40 ___sys_sendmsg+0x99/0xe0 __sys_sendmsg+0x8a/0xf0 do_syscall_64+0x81/0x970 ? __sys_bind+0xe3/0x110 ? syscall_exit_work+0x143/0x1b0 ? do_syscall_64+0x244/0x970 ? sock_alloc_file+0x63/0xc0 ? syscall_exit_work+0x143/0x1b0 ? do_syscall_64+0x244/0x970 ? alloc_fd+0x12e/0x190 ? put_unused_fd+0x2a/0x70 ? do_sys_openat2+0xa2/0xe0 ? syscall_exit_work+0x143/0x1b0 ? do_syscall_64+0x244/0x970 ? exc_page_fault+0x7e/0x1a0 entry_SYSCALL_64_after_hwframe+0x76/0x7e [...] </TASK> Fix this by checking is_visible() before trying to touch the attribute. Fixes: `303a42769c` ("sysfs: add sysfs_group{s}_change_owner()") Fixes: `0c17270f9b` ("net: sysfs: Implement is_visible for phys_(port_id, port_name, switch_id)") Reported-by: Cynthia <cynthia@kosmx.dev> Closes: https://lore.kernel.org/netdev/01070199e22de7f8-28f711ab-d3f1-46d9-b9a0-048ab05eb09b-000000@eu-central-1.amazonses.com/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20251016101456.4087-1-fmancera@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-10-17 09:48:34 +02:00
Gao Xiang	a429b76114	erofs: fix crafted invalid cases for encoded extents Robert recently reported two corrupted images that can cause system crashes, which are related to the new encoded extents introduced in Linux 6.15: - The first one [1] has plen != 0 (e.g. plen == 0x2000000) but (plen & Z_EROFS_EXTENT_PLEN_MASK) == 0. It is used to represent special extents such as sparse extents (!EROFS_MAP_MAPPED), but previously only plen == 0 was handled; - The second one [2] has pa 0xffffffffffdcffed and plen 0xb4000, then "cur [0xfffffffffffff000] += bvec.bv_len [0x1000]" in "} while ((cur += bvec.bv_len) < end);" wraps around, causing an out-of-bound access of pcl->compressed_bvecs[] in z_erofs_submit_queue(). EROFS only supports 48-bit physical block addresses (up to 1EiB for 4k blocks), so add a sanity check to enforce this. Fixes: `1d191b4ca5` ("erofs: implement encoded extent metadata") Reported-by: Robert Morris <rtm@csail.mit.edu> Closes: https://lore.kernel.org/r/75022.1759355830@localhost [1] Closes: https://lore.kernel.org/r/80524.1760131149@localhost [2] Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-10-17 15:21:36 +08:00
Linus Torvalds	98ac9cc4b4	Merge tag 'f2fs-fix-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs fixes from Jaegeuk Kim: - fix soft lockupg caused by iput() added in `bc986b1d75` ("fs: stop accessing ->i_count directly in f2fs and gfs2") - fix a wrong block address map on multiple devices Link: https://lore.kernel.org/oe-lkp/202509301450.138b448f-lkp@intel.com [1] * tag 'f2fs-fix-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: f2fs: fix wrong block mapping for multi-devices f2fs: don't call iput() from f2fs_drop_inode()	2025-10-16 10:58:49 -07:00
Linus Torvalds	9f388a653c	Merge tag 'for-6.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - in tree-checker fix extref bounds check - reorder send context structure to avoid -Wflex-array-member-not-at-end warning - fix extent readahead length for compressed extents - fix memory leaks on error paths (qgroup assign ioctl, zone loading with raid stripe tree enabled) - fix how device specific mount options are applied, in particular the 'ssd' option will be set unexpectedly - fix tracking of relocation state when tasks are running and cancellation is attempted - adjust assertion condition for folios allocated for scrub - remove incorrect assertion checking for block group when populating free space tree * tag 'for-6.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: send: fix -Wflex-array-member-not-at-end warning in struct send_ctx btrfs: tree-checker: fix bounds check in check_inode_extref() btrfs: fix memory leaks when rejecting a non SINGLE data profile without an RST btrfs: fix incorrect readahead expansion length btrfs: do not assert we found block group item when creating free space tree btrfs: do not use folio_test_partial_kmap() in ASSERT()s btrfs: only set the device specific options after devices are opened btrfs: fix memory leak on duplicated memory in the qgroup assign ioctl btrfs: fix clearing of BTRFS_FS_RELOC_RUNNING if relocation already running	2025-10-16 10:22:38 -07:00
Linus Torvalds	05de41f3e2	Merge tag 'v6.18-rc1-smb-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: - Fix RPC hang due to locking bug - Fix for memory leak in read and refcount leak (in session setup) - Minor cleanup * tag 'v6.18-rc1-smb-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix recursive locking in RPC handle list access smb/server: fix possible refcount leak in smb2_sess_setup() smb/server: fix possible memory leak in smb2_read() smb: server: Use common error handling code in smb_direct_rdma_xmit()	2025-10-16 10:16:41 -07:00
Eric Biggers	3c15a6df61	smb: client: Consolidate cmac(aes) shash allocation Now that smb3_crypto_shash_allocate() and smb311_crypto_shash_allocate() are identical and only allocate "cmac(aes)", delete the latter and replace the call to it with the former. Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eric Biggers	2c09630d09	smb: client: Remove obsolete crypto_shash allocations Now that the SMB client accesses MD5, HMAC-MD5, HMAC-SHA256, and SHA-512 only via the library API and not via crypto_shash, allocating crypto_shash objects for these algorithms is no longer necessary. Remove all these allocations, their corresponding kconfig selections, and their corresponding module soft dependencies. Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eric Biggers	395a77b030	smb: client: Use HMAC-MD5 library for NTLMv2 For the HMAC-MD5 computations in NTLMv2, use the HMAC-MD5 library instead of a "hmac(md5)" crypto_shash. This is simpler and faster. With the library there's no need to allocate memory, no need to handle errors, and the HMAC-MD5 code is accessed directly without inefficient indirect calls and other unnecessary API overhead. To preserve the existing behavior of NTLMv2 support being disabled when the kernel is booted with "fips=1", make setup_ntlmv2_rsp() check fips_enabled itself. Previously it relied on the error from cifs_alloc_hash("hmac(md5)", &hmacmd5). Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eric Biggers	c04e55b257	smb: client: Use MD5 library for SMB1 signature calculation Convert cifs_calc_signature() to use the MD5 library instead of a "md5" crypto_shash. This is simpler and faster. With the library there's no need to allocate memory, no need to handle errors, and the MD5 code is accessed directly without inefficient indirect calls and other unnecessary API overhead. To preserve the existing behavior of MD5 signature support being disabled when the kernel is booted with "fips=1", make cifs_calc_signature() check fips_enabled itself. Previously it relied on the error from cifs_alloc_hash("md5", &server->secmech.md5). Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eric Biggers	ae04b1bb06	smb: client: Use MD5 library for M-F symlink hashing Convert parse_mf_symlink() and format_mf_symlink() to use the MD5 library instead of a "md5" crypto_shash. This is simpler and faster. With the library there's no need to allocate memory, no need to handle errors, and the MD5 code is accessed directly without inefficient indirect calls and other unnecessary API overhead. This also fixes an issue where these functions did not work on kernels booted in FIPS mode. The use of MD5 here is for data integrity rather than a security purpose, so it can use a non-FIPS-approved algorithm. Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eric Biggers	e05b3115e7	smb: client: Use HMAC-SHA256 library for SMB2 signature calculation Convert smb2_calc_signature() to use the HMAC-SHA256 library instead of a "hmac(sha256)" crypto_shash. This is simpler and faster. With the library there's no need to allocate memory, no need to handle errors, and the HMAC-SHA256 code is accessed directly without inefficient indirect calls and other unnecessary API overhead. To make this possible, make __cifs_calc_signature() support both the HMAC-SHA256 library and crypto_shash. (crypto_shash is still needed for HMAC-MD5 and AES-CMAC. A later commit will switch HMAC-MD5 from shash to the library. I'd like to eventually do the same for AES-CMAC, but it doesn't have a library API yet. So for now, shash is still needed.) Also remove the unnecessary 'sigptr' variable. For now smb3_crypto_shash_allocate() still allocates a "hmac(sha256)" crypto_shash. It will be removed in a later commit. Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eric Biggers	4b4c6fdb25	smb: client: Use HMAC-SHA256 library for key generation Convert generate_key() to use the HMAC-SHA256 library instead of a "hmac(sha256)" crypto_shash. This is simpler and faster. With the library there's no need to allocate memory, no need to handle errors, and the HMAC-SHA256 code is accessed directly without inefficient indirect calls and other unnecessary API overhead. Also remove the unnecessary 'hashptr' variable. For now smb3_crypto_shash_allocate() still allocates a "hmac(sha256)" crypto_shash. It will be removed in a later commit. Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eric Biggers	af5fea5141	smb: client: Use SHA-512 library for SMB3.1.1 preauth hash Convert smb311_update_preauth_hash() to use the SHA-512 library instead of a "sha512" crypto_shash. This is simpler and faster. With the library there's no need to allocate memory, no need to handle errors, and the SHA-512 code is accessed directly without inefficient indirect calls and other unnecessary API overhead. Remove the call to smb311_crypto_shash_allocate() from smb311_update_preauth_hash(), since it appears to have been needed only to allocate the "sha512" crypto_shash. (It also had the side effect of allocating the "cmac(aes)" crypto_shash, but that's also done in generate_key() which is where the AES-CMAC key is initialized.) For now the "sha512" crypto_shash is still being allocated elsewhere. It will be removed in a later commit. Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Eugene Korenevsky	6447b0e355	cifs: parse_dfs_referrals: prevent oob on malformed input Malicious SMB server can send invalid reply to FSCTL_DFS_GET_REFERRALS - reply smaller than sizeof(struct get_dfs_referral_rsp) - reply with number of referrals smaller than NumberOfReferrals in the header Processing of such replies will cause oob. Return -EINVAL error on such replies to prevent oob-s. Signed-off-by: Eugene Korenevsky <ekorenevsky@aliyun.com> Cc: stable@vger.kernel.org Suggested-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:10:28 -05:00
Shuhao Fu	c2b77f4220	smb: client: Fix refcount leak for cifs_sb_tlink Fix three refcount inconsistency issues related to `cifs_sb_tlink`. Comments for `cifs_sb_tlink` state that `cifs_put_tlink()` needs to be called after successful calls to `cifs_sb_tlink()`. Three calls fail to update refcount accordingly, leading to possible resource leaks. Fixes: `8ceb984379` ("CIFS: Move rename to ops struct") Fixes: `2f1afe2599` ("cifs: Use smb 2 - 3 and cifsacl mount options getacl functions") Fixes: `366ed846df` ("cifs: Use smb 2 - 3 and cifsacl mount options setacl function") Cc: stable@vger.kernel.org Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 22:09:46 -05:00
Linus Torvalds	7ea30958b3	Merge tag 'vfs-6.18-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Handle inode number mismatches in nsfs file handles - Update the comment to init_file() - Add documentation link for EBADF in the rust file code - Skip read lock assertion for read-only filesystems when using dax - Don't leak disconnected dentries during umount - Fix new coredump input pattern validation - Handle ENOIOCTLCMD conversion in vfs_fileattr_{g,s}et() correctly - Remove redundant IOCB_DIO_CALLER_COMP clearing in overlayfs * tag 'vfs-6.18-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ovl: remove redundant IOCB_DIO_CALLER_COMP clearing fs: return EOPNOTSUPP from file_setattr/file_getattr syscalls Revert "fs: make vfs_fileattr_[get\|set] return -EOPNOTSUPP" coredump: fix core_pattern input validation vfs: Don't leak disconnected dentries on umount dax: skip read lock assertion for read-only filesystems rust: file: add intra-doc link for 'EBADF' fs: update comment in init_file() nsfs: handle inode number mismatches gracefully in file handles	2025-10-15 15:12:58 -07:00
Deepanshu Kartikey	78a63493f8	ocfs2: clear extent cache after moving/defragmenting extents The extent map cache can become stale when extents are moved or defragmented, causing subsequent operations to see outdated extent flags. This triggers a BUG_ON in ocfs2_refcount_cal_cow_clusters(). The problem occurs when: 1. copy_file_range() creates a reflinked extent with OCFS2_EXT_REFCOUNTED 2. ioctl(FITRIM) triggers ocfs2_move_extents() 3. __ocfs2_move_extents_range() reads and caches the extent (flags=0x2) 4. ocfs2_move_extent()/ocfs2_defrag_extent() calls __ocfs2_move_extent() which clears OCFS2_EXT_REFCOUNTED flag on disk (flags=0x0) 5. The extent map cache is not invalidated after the move 6. Later write() operations read stale cached flags (0x2) but disk has updated flags (0x0), causing a mismatch 7. BUG_ON(!(rec->e_flags & OCFS2_EXT_REFCOUNTED)) triggers Fix by clearing the extent map cache after each extent move/defrag operation in __ocfs2_move_extents_range(). This ensures subsequent operations read fresh extent data from disk. Link: https://lore.kernel.org/all/20251009142917.517229-1-kartikey406@gmail.com/T/ Link: https://lkml.kernel.org/r/20251009154903.522339-1-kartikey406@gmail.com Fixes: `53069d4e76` ("Ocfs2/move_extents: move/defrag extents within a certain range.") Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: syzbot+6fdd8fa3380730a4b22c@syzkaller.appspotmail.com Tested-by: syzbot+6fdd8fa3380730a4b22c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=2959889e1f6e216585ce522f7e8bc002b46ad9e7 Reviewed-by: Mark Fasheh <mark@fasheh.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-10-15 13:24:33 -07:00
Stefan Metzmacher	d451a0e88e	smb: client: let smbd_destroy() wait for SMBDIRECT_SOCKET_DISCONNECTED We should wait for the rdma_cm to become SMBDIRECT_SOCKET_DISCONNECTED, it turns out that (at least running some xfstests e.g. cifs/001) often triggers the case where wait_event_interruptible() returns with -ERESTARTSYS instead of waiting for SMBDIRECT_SOCKET_DISCONNECTED to be reached. Or we are already in SMBDIRECT_SOCKET_DISCONNECTING and never wait for SMBDIRECT_SOCKET_DISCONNECTED. Fixes: `050b8c3740` ("smbd: Make upper layer decide when to destroy the transport") Fixes: `e8b3bfe9bc` ("cifs: smbd: Don't destroy transport on RDMA disconnect") Fixes: `b0aa92a229` ("smb: client: make sure smbd_disconnect_rdma_work() doesn't run after smbd_destroy() took over") Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 11:21:13 -05:00
Linus Torvalds	66f8e4df00	Merge tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: - Fix regression caused by removing CONFIG_EXT3_FS when testing some very old defconfigs - Avoid a BUG_ON when opening a file on a maliciously corrupted file system - Avoid mm warnings when freeing a very large orphan file metadata - Avoid a theoretical races between metadata writeback and checkpoints (it's very hard to hit in practice, since the race requires that the writeback take a very long time) * tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs ext4: free orphan info with kvfree ext4: detect invalid INLINE_DATA + EXTENTS flag combination ext4, doc: fix and improve directory hash tree description ext4: wait for ongoing I/O to complete before freeing blocks jbd2: ensure that all ongoing I/O complete before freeing blocks	2025-10-15 07:51:57 -07:00
ZhangGuoDong	d877470b59	smb: move some duplicate definitions to common/cifsglob.h In order to maintain the code more easily, move duplicate definitions to new common header file. Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 07:44:17 -05:00
Stefan Metzmacher	b0432201a1	smb: client: let destroy_mr_list() keep smbdirect_mr_io memory if registered If a smbdirect_mr_io structure if still visible to callers of smbd_register_mr() we can't free the related memory when the connection is disconnected! Otherwise smbd_deregister_mr() will crash. Now we use a mutex and refcounting in order to keep the memory around if the connection is disconnected. It means smbd_deregister_mr() can be called at any later time to free the memory, which is no longer referenced by nor referencing the connection. It also means smbd_destroy() no longer needs to wait for mr_io.used.count to become 0. Fixes: `050b8c3740` ("smbd: Make upper layer decide when to destroy the transport") Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 07:44:17 -05:00
Marios Makassikis	88f170814f	ksmbd: fix recursive locking in RPC handle list access Since commit `305853cce3` ("ksmbd: Fix race condition in RPC handle list access"), ksmbd_session_rpc_method() attempts to lock sess->rpc_lock. This causes hung connections / tasks when a client attempts to open a named pipe. Using Samba's rpcclient tool: $ rpcclient //192.168.1.254 -U user%password $ rpcclient $> srvinfo <connection hung here> Kernel side: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/0:0 state:D stack:0 pid:5021 tgid:5021 ppid:2 flags:0x00200000 Workqueue: ksmbd-io handle_ksmbd_work Call trace: __schedule from schedule+0x3c/0x58 schedule from schedule_preempt_disabled+0xc/0x10 schedule_preempt_disabled from rwsem_down_read_slowpath+0x1b0/0x1d8 rwsem_down_read_slowpath from down_read+0x28/0x30 down_read from ksmbd_session_rpc_method+0x18/0x3c ksmbd_session_rpc_method from ksmbd_rpc_open+0x34/0x68 ksmbd_rpc_open from ksmbd_session_rpc_open+0x194/0x228 ksmbd_session_rpc_open from create_smb2_pipe+0x8c/0x2c8 create_smb2_pipe from smb2_open+0x10c/0x27ac smb2_open from handle_ksmbd_work+0x238/0x3dc handle_ksmbd_work from process_scheduled_works+0x160/0x25c process_scheduled_works from worker_thread+0x16c/0x1e8 worker_thread from kthread+0xa8/0xb8 kthread from ret_from_fork+0x14/0x38 Exception stack(0x8529ffb0 to 0x8529fff8) The task deadlocks because the lock is already held: ksmbd_session_rpc_open down_write(&sess->rpc_lock) ksmbd_rpc_open ksmbd_session_rpc_method down_read(&sess->rpc_lock) <-- deadlock Adjust ksmbd_session_rpc_method() callers to take the lock when necessary. Fixes: `305853cce3` ("ksmbd: Fix race condition in RPC handle list access") Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 06:03:09 -05:00
ZhangGuoDong	379510a815	smb/server: fix possible refcount leak in smb2_sess_setup() Reference count of ksmbd_session will leak when session need reconnect. Fix this by adding the missing ksmbd_user_session_put(). Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 06:03:09 -05:00
ZhangGuoDong	6fced056d2	smb/server: fix possible memory leak in smb2_read() Memory leak occurs when ksmbd_vfs_read() fails. Fix this by adding the missing kvfree(). Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-10-15 06:03:09 -05:00
Jeongjun Park	2d8636119b	exfat: fix out-of-bounds in exfat_nls_to_ucs2() Since the len argument value passed to exfat_ioctl_set_volume_label() from exfat_nls_to_utf16() is passed 1 too large, an out-of-bounds read occurs when dereferencing p_cstring in exfat_nls_to_ucs2() later. And because of the NLS_NAME_OVERLEN macro, another error occurs when creating a file with a period at the end using utf8 and other iocharsets. So to avoid this, you should remove the code that uses NLS_NAME_OVERLEN macro and make the len argument value be the length of the label string, but with a maximum length of FSLABEL_MAX - 1. Reported-by: syzbot+98cc76a76de46b3714d4@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=98cc76a76de46b3714d4 Fixes: `d01579d590` ("exfat: Add support for FS_IOC_{GET,SET}FSLABEL") Suggested-by: Pali Rohár <pali@kernel.org> Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2025-10-15 17:53:20 +09:00
Jaehun Gou	82ebecdc74	exfat: fix improper check of dentry.stream.valid_size We found an infinite loop bug in the exFAT file system that can lead to a Denial-of-Service (DoS) condition. When a dentry in an exFAT filesystem is malformed, the following system calls — SYS_openat, SYS_ftruncate, and SYS_pwrite64 — can cause the kernel to hang. Root cause analysis shows that the size validation code in exfat_find() does not check whether dentry.stream.valid_size is negative. As a result, the system calls mentioned above can succeed and eventually trigger the DoS issue. This patch adds a check for negative dentry.stream.valid_size to prevent this vulnerability. Co-developed-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Seunghun Han <kkamagui@gmail.com> Co-developed-by: Jihoon Kwon <jimmyxyz010315@gmail.com> Signed-off-by: Jihoon Kwon <jimmyxyz010315@gmail.com> Signed-off-by: Jaehun Gou <p22gone@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2025-10-15 14:37:21 +09:00
Linus Torvalds	9b332cece9	Merge tag 'nfsd-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Fix a crasher reported by rtm@csail.mit.edu * tag 'nfsd-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Define a proc_layoutcommit for the FlexFiles layout type	2025-10-14 09:28:12 -07:00
Jaegeuk Kim	9d5c4f5c7a	f2fs: fix wrong block mapping for multi-devices Assuming the disk layout as below, disk0: 0 --- 0x00035abfff disk1: 0x00035ac000 --- 0x00037abfff disk2: 0x00037ac000 --- 0x00037ebfff and we want to read data from offset=13568 having len=128 across the block devices, we can illustrate the block addresses like below. 0 .. 0x00037ac000 ------------------- 0x00037ebfff, 0x00037ec000 ------- \| ^ ^ ^ \| fofs 0 13568 13568+128 \| ------------------------------------------------------ \| LBA 0x37e8aa9 0x37ebfa9 0x37ec029 --- map 0x3caa9 0x3ffa9 In this example, we should give the relative map of the target block device ranging from 0x3caa9 to 0x3ffa9 where the length should be calculated by 0x37ebfff + 1 - 0x37ebfa9. In the below equation, however, map->m_pblk was supposed to be the original address instead of the one from the target block address. - map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk); Cc: stable@vger.kernel.org Fixes: `71f2c82062` ("f2fs: multidevice: support direct IO") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-10-13 23:55:44 +00:00

1 2 3 4 5 ...

102029 Commits