linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-12-27 10:01:39 -05:00

Author	SHA1	Message	Date
Mike Snitzer	0b873de2c0	nfs/localio: remove 61 byte hole from needless ____cacheline_aligned struct nfs_local_kiocb used ____cacheline_aligned on its iters[] array and as the structure evolved it caused a 61 byte hole to form. Fix this by removing ____cacheline_aligned and reordering iters[] before iter_is_dio_aligned[]. Fixes: `6a218b9c31` ("nfs/localio: do not issue misaligned DIO out-of-order") Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-12-05 19:34:29 -05:00
Mike Snitzer	f50d0328d0	nfs/localio: remove alignment size checking in nfs_is_local_dio_possible This check to ensure dio_offset_align isn't larger than PAGE_SIZE is no longer relevant (older iterations of NFS Direct was allocating misaligned head and tail pages but no longer does, so this check isn't needed). Fixes: `c817248fc8` ("nfs/localio: add proper O_DIRECT support for READ and WRITE") Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-12-05 19:34:29 -05:00
Linus Torvalds	399ead3a6d	Merge tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: "Apart from the usual small churn, we have - initial SMP support (only kernel) - major vDSO cleanups (and fixes for 32-bit)" * tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (33 commits) um: Disable KASAN_INLINE when STATIC_LINK is selected um: Don't rename vmap to kernel_vmap um: drivers: virtio: use string choices helper um: Always set up AT_HWCAP and AT_PLATFORM x86/um: Remove FIXADDR_USER_START and FIXADDR_USE_END um: Remove __access_ok_vsyscall() um: Remove redundant range check from __access_ok_vsyscall() um: Remove fixaddr_user_init() x86/um: Drop gate area handling x86/um: Do not inherit vDSO from host um: Split out default elf_aux_hwcap x86/um: Move ELF_PLATFORM fallback to x86-specific code um: Split out default elf_aux_platform um: Avoid circular dependency on asm-offsets in pgtable.h um: Enable SMP support on x86 asm-generic: percpu: Add assembly guard um: vdso: Remove getcpu support on x86 um: Add initial SMP support um: Define timers on a per-CPU basis um: Determine sleep based on need_resched() ...	2025-12-05 16:30:56 -08:00
Christian Brauner	87c9e88ac4	ovl: pass original credentials, not mounter credentials during create When creating new files the security layer expects the original credentials to be passed. When cleaning up the code this was accidently changed to pass the mounter's credentials by relying on current->cred which is already overriden at this point. Pass the original credentials directly. Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Reported-by: Paul Moore <paul@paul-moore.com> Fixes: `e566bff963` ("ovl: port ovl_create_or_link() to new ovl_override_creator_creds") Link: https://lore.kernel.org/CAFqZXNvL1ciLXMhHrnoyBmQu1PAApH41LkSWEhrcvzAAbFij8Q@mail.gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Tested-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-12-05 16:16:20 -08:00
David Howells	9146c7e53f	cifs: Remove dead function prototypes Remove a bunch of dead function prototypes. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:54:09 -06:00
Linus Torvalds	4b9d25b4d3	Merge tag 'vfs-6.19-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix a type conversion bug in the ipc subsystem - Fix per-dentry timeout warning in autofs - Drop the fd conversion from sockets - Move assert from iput_not_last() to iput() - Fix reversed check in filesystems_freeze_callback() - Use proper uapi types for new struct delegation definitions * tag 'vfs-6.19-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: use UAPI types for new struct delegation definition mqueue: correct the type of ro to int Revert "net/socket: convert sock_map_fd() to FD_ADD()" autofs: fix per-dentry timeout warning fs: assert on I_FREEING not being set in iput() and iput_not_last() fs: PM: Fix reverse check in filesystems_freeze_callback()	2025-12-05 15:52:30 -08:00
Linus Torvalds	e40e023591	Merge tag 'exfat-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Fix a remount failure caused by differing process masks by inheriting the original mount options during the remount process - Fix a potential divide-by-zero error and system crash in exfat_allocate_bitmap that occurred when the readahead count was zero - Add validation for directory cluster bitmap bits to prevent directory and root cluster from being incorrectly zeroed out on corrupted images - Clear the post-EOF page cache when extending a file to prevent stale mmap data from becoming visible, addressing an generic/363 failure - Fix a reference count leak in exfat_find by properly releasing the dentry set in specific error paths * tag 'exfat-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix remount failure in different process environments exfat: fix divide-by-zero in exfat_allocate_bitmap exfat: validate the cluster bitmap bits of directory exfat: zero out post-EOF page cache on file extension exfat: fix refcount leak in exfat_find	2025-12-05 15:48:09 -08:00
ChenXiaoSong	d159702c94	smb/client: add two elements to smb2_error_map_table array Both status codes are mapped to -EIO. Now all status codes from common/smb2status.h are included in the smb2_error_map_table array(except for the first two zero definitions). Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:46:41 -06:00
ChenXiaoSong	523ecd9766	smb: rename to STATUS_SMB_NO_PREAUTH_INTEGRITY_HASH_OVERLAP See MS-SMB2 3.3.5.4. To keep the name consistent with the documentation. Additionally, move STATUS_INVALID_LOCK_RANGE to correct position in order. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:46:41 -06:00
ChenXiaoSong	bf80d1517d	smb/client: remove unused elements from smb2_error_map_table array STATUS_SUCCESS and STATUS_WAIT_0 are both zero, and since zero indicates success, they are not needed. Since smb2_print_status() has been removed, the last element in the array is no longer needed. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:46:41 -06:00
ChenXiaoSong	6c1eb31ecb	smb/client: reduce loop count in map_smb2_to_linux_error() by half The smb2_error_map_table array currently has 1743 elements. When searching for the last element and calling smb2_print_status(), 3486 comparisons are needed. The loop in smb2_print_status() is unnecessary, smb2_print_status() can be removed, and only iterate over the array once, printing the message when the target status code is found. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:46:41 -06:00
Paulo Alcantara	7ad785927d	smb: client: Add tracepoint for krb5 auth Add tracepoint to help debugging krb5 auth failures. Example: $ trace-cmd record -e smb3_kerberos_auth $ mount.cifs ... $ trace-cmd report mount.cifs-1667 [003] ..... 5810.668549: smb3_kerberos_auth: vers=2 host=w22-dc1.zelda.test ip=192.168.124.30:445 sec=krb5 uid=0 cruid=0 user=root pid=1667 upcall_target=app err=-126 Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:40:42 -06:00
Paulo Alcantara	a8fce7c807	smb: client: improve error message when creating SMB session When failing to create a new SMB session with 'sec=krb5' for example, the following error message isn't very useful CIFS: VFS: \\srv Send error in SessSetup = -126 Improve it by printing the following instead on dmesg CIFS: VFS: \\srv failed to create a new SMB session with Kerberos: -126 Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: Pierguido Lambri <plambri@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:40:38 -06:00
Paulo Alcantara	855982a52f	smb: client: relax session and tcon reconnect attempts When the client re-establishes connection to the server, it will queue a worker thread that will attempt to reconnect sessions and tcons on every two seconds, which is kinda overkill as it is a very common scenario when having expired passwords or KRB5 TGT tickets, or deleted shares. Use an exponential backoff strategy to handle session/tcon reconnect attempts in the worker thread to prevent the client from overloading the system when it is very unlikely to re-establish any session/tcon soon while client is idle. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:40:33 -06:00
Linus Torvalds	4b6b432128	Merge tag 'fuse-update-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: - Add mechanism for cleaning out unused, stale dentries; controlled via a module option (Luis Henriques) - Fix various bugs - Cleanups * tag 'fuse-update-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: Uninitialized variable in fuse_epoch_work() fuse: fix io-uring list corruption for terminated non-committed requests fuse: signal that a fuse inode should exhibit local fs behaviors fuse: Always flush the page cache before FOPEN_DIRECT_IO write fuse: Invalidate the page cache after FOPEN_DIRECT_IO write fuse: rename 'namelen' to 'namesize' fuse: use strscpy instead of strcpy fuse: refactor fuse_conn_put() to remove negative logic. fuse: new work queue to invalidate dentries from old epochs fuse: new work queue to periodically invalidate expired dentries dcache: export shrink_dentry_list() and add new helper d_dispose_if_unused() fuse: add WARN_ON and comment for RCU revalidate fuse: Fix whitespace for fuse_uring_args_to_ring() comment fuse: missing copy_finish in fuse-over-io-uring argument copies fuse: fix readahead reclaim deadlock	2025-12-05 15:25:13 -08:00
David Howells	4ae4dde6f3	cifs: Fix handling of a beyond-EOF DIO/unbuffered read over SMB2 If a DIO read or an unbuffered read request extends beyond the EOF, the server will return a short read and a status code indicating that EOF was hit, which gets translated to -ENODATA. Note that the client does not cap the request at i_size, but asks for the amount requested in case there's a race on the server with a third party. Now, on the client side, the request will get split into multiple subrequests if rsize is smaller than the full request size. A subrequest that starts before or at the EOF and returns short data up to the EOF will be correctly handled, with the NETFS_SREQ_HIT_EOF flag being set, indicating to netfslib that we can't read more. If a subrequest, however, starts after the EOF and not at it, HIT_EOF will not be flagged, its error will be set to -ENODATA and it will be abandoned. This will cause the request as a whole to fail with -ENODATA. Fix this by setting NETFS_SREQ_HIT_EOF on any subrequest that lies beyond the EOF marker. Fixes: `1da29f2c39` ("netfs, cifs: Fix handling of short DIO read") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:15:11 -06:00
Rajasi Mandal	ef529f655a	cifs: client: allow changing multichannel mount options on remount Previously, the client did not update a session's channel state when multichannel or max_channels mount options were changed via remount. This led to inconsistent behavior and prevented enabling or disabling multichannel support without a full unmount/remount cycle. Enable dynamic reconfiguration of multichannel and max_channels during remount by: - Introducing smb3_sync_ses_chan_max(), a centralized function for channel updates which synchronizes the session's channels with the updated configuration. - Replacing cifs_disable_secondary_channels() with cifs_decrease_secondary_channels(), which accepts a disable_mchan flag to support multichannel disable when the server stops supporting multichannel. - Updating remount logic to detect changes in multichannel or max_channels and trigger appropriate session/channel updates. Current limitation: - The query_interfaces worker runs even when max_channels=1 so that multichannel can be enabled later via remount without requiring an unmount. This is a temporary approach and may be refined in the future. Users can safely modify multichannel and max_channels on an existing mount. The client will correctly adjust the session's channel state to match the new configuration, preserving durability where possible and avoiding unnecessary disconnects. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Rajasi Mandal <rajasimandal@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:14:14 -06:00
David Howells	32a6086809	cifs: Do some preparation prior to organising the function declarations Make some preparatory cleanups prior to running a script to organise the function declarations within the fs/smb/client/ headers. These include: (1) Remove "inline" from the dummy cifs_proc_init/clean() functions as they are in a .c file. (2) Move should_compress()'s kdoc comment to the .c file and remove kdoc markers from the comments. (3) Rename CIFS_ALLOW_INSECURE_LEGACY in #endif comments to have CONFIG_ on the front to allow the script to recognise it. (4) Don't let comments have bare words at the left margin as that confused the simplistic function detection code in the script. (5) Adjust some argument lists so that when and if the cleanup script is run they don't end up over 100 chars. (6) Fix a few comments to have missing '' added or the "/" moved to their own lines so that checkpatch doesn't moan over the cleanup script patch. (7) Move struct cifs_calc_sig_ctx to cifsglob.h. (8) Remove some __KERNEL__ conditionals. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:11:55 -06:00
David Howells	f80ac7eda1	cifs: Add a tracepoint to log EIO errors Add a tracepoint to log EIO errors and give it the capacity to convey up to two integers of information. This is then wrapped with three functions: int smb_EIO(enum smb_eio_trace trace) int smb_EIO1(enum smb_eio_trace trace, unsigned long info) int smb_EIO2(enum smb_eio_trace trace, unsigned long info, unsigned long info2) depending on how many bits of info are desired to be logged with any particular trace. The functions all return -EIO and can be used in place of -EIO. The trace argument is an enum value that gets translated to a string when the trace is printed. This makes is easier to log EIO instances when the client is under high load than turning on a printk wrapper such as cifs_dbg(). Granted, EIO could have its own separate EIO printing since EIO shouldn't happen. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:11:43 -06:00
David Howells	3a7b6d0afe	cifs: Don't need state locking in smb2_get_mid_entry() There's no need to get ->srv_lock or ->ses_lock in smb2_get_mid_entry() as all that happens of relevance (to the lock) inside the locked sections is the reading of one status value in each. Replace the locking with READ_ONCE() and use a switch instead of a chain of if-statements. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:11:06 -06:00
David Howells	87fba18abb	cifs: Remove the server pointer from smb_message Remove the server pointer from smb_message and instead pass it down to all the things that access it. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> (RDMA, smbdirect) cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:10:48 -06:00
David Howells	6a86a4cc28	cifs: Fix specification of function pointers Change the mid_receive_t, mid_callback_t and mid_handle_t function pointers to have the pointer marker in the typedef. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:10:13 -06:00
David Howells	28405cb5b2	cifs: Replace SendReceiveBlockingLock() with SendReceive() plus flags Replace the smb1 transport's SendReceiveBlockingLock() with SendReceive() plus a couple of flags. This will then allow that to pick up the transport changes there. The first flag, CIFS_INTERRUPTIBLE_WAIT, is added to indicate that the wait should be interruptible and the second, CIFS_WINDOWS_LOCK, indicates that we need to send a Lock command with unlock type rather than a Cancel. send_lock_cancel() is then called from cifs_lock_cancel() which is called from the main transport loop in compound_send_recv(). [!] I think the error code handling is probably right. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:10:05 -06:00
David Howells	62432a3f51	cifs: Clean up some places where an extra kvec[] was required for rfc1002 Clean up some places where previously an extra element in the kvec array was being used to hold an rfc1002 header for SMB1 (a previous patch removed this and generated it on the fly as for SMB2/3). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:10:00 -06:00
David Howells	6be09580df	cifs: Make smb1's SendReceive() wrap cifs_send_recv() Make the smb1 transport's SendReceive() simply wrap cifs_send_recv() as does SendReceive2(). This will then allow that to pick up the transport changes there. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:09:47 -06:00
David Howells	83bfbd0bb9	cifs: Remove the RFC1002 header from smb_hdr Remove the RFC1002 header from struct smb_hdr as used for SMB-1.0. This simplifies the SMB-1.0 code by simplifying a lot of places that have to add or subtract 4 to work around the fact that the RFC1002 header isn't really part of the message and the base for various offsets within the message is from the base of the smb_hdr, not the RFC1002 header. Further, clean up a bunch of places that require an extra kvec struct specifically pointing to the RFC1002 header, such that kvec[0].iov_base must be exactly 4 bytes before kvec[1].iov_base. This allows the header preamble size stuff to be removed too. The size of the request and response message are then handed around either directly or by summing the size of all the iov_len members in the kvec array for which we have a count. Also, this simplifies and cleans up the common transmission and receive paths for SMB1 and SMB2/3 as there no longer needs to be special handling casing for SMB1 messages as the RFC1002 header is now generated on the fly for SMB1 as it is for SMB2/3. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Tom Talpey <tom@talpey.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:09:32 -06:00
David Howells	9d85ac939d	cifs: Fix handling of a beyond-EOF DIO/unbuffered read over SMB1 If a DIO read or an unbuffered read request extends beyond the EOF, the server will return a short read and a status code indicating that EOF was hit, which gets translated to -ENODATA. Note that the client does not cap the request at i_size, but asks for the amount requested in case there's a race on the server with a third party. Now, on the client side, the request will get split into multiple subrequests if rsize is smaller than the full request size. A subrequest that starts before or at the EOF and returns short data up to the EOF will be correctly handled, with the NETFS_SREQ_HIT_EOF flag being set, indicating to netfslib that we can't read more. If a subrequest, however, starts after the EOF and not at it, HIT_EOF will not be flagged, its error will be set to -ENODATA and it will be abandoned. This will cause the request as a whole to fail with -ENODATA. Fix this by setting NETFS_SREQ_HIT_EOF on any subrequest that lies beyond the EOF marker. This can be reproduced by mounting with "cache=none,sign,vers=1.0" and doing a read of a file that's significantly bigger than the size of the file (e.g. attempting to read 64KiB from a 16KiB file). Fixes: `a68c74865f` ("cifs: Fix SMB1 readv/writev callback in the same way as SMB2/3") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> cc: Shyam Prasad N <sprasad@microsoft.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-05 17:09:28 -06:00
Linus Torvalds	7cd122b552	Merge tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull persistent dentry infrastructure and conversion from Al Viro: "Some filesystems use a kinda-sorta controlled dentry refcount leak to pin dentries of created objects in dcache (and undo it when removing those). A reference is grabbed and not released, but it's not actually _stored_ anywhere. That works, but it's hard to follow and verify; among other things, we have no way to tell _which_ of the increments is intended to be an unpaired one. Worse, on removal we need to decide whether the reference had already been dropped, which can be non-trivial if that removal is on umount and we need to figure out if this dentry is pinned due to e.g. unlink() not done. Usually that is handled by using kill_litter_super() as ->kill_sb(), but there are open-coded special cases of the same (consider e.g. /proc/self). Things get simpler if we introduce a new dentry flag (DCACHE_PERSISTENT) marking those "leaked" dentries. Having it set claims responsibility for +1 in refcount. The end result this series is aiming for: - get these unbalanced dget() and dput() replaced with new primitives that would, in addition to adjusting refcount, set and clear persistency flag. - instead of having kill_litter_super() mess with removing the remaining "leaked" references (e.g. for all tmpfs files that hadn't been removed prior to umount), have the regular shrink_dcache_for_umount() strip DCACHE_PERSISTENT of all dentries, dropping the corresponding reference if it had been set. After that kill_litter_super() becomes an equivalent of kill_anon_super(). Doing that in a single step is not feasible - it would affect too many places in too many filesystems. It has to be split into a series. This work has really started early in 2024; quite a few preliminary pieces have already gone into mainline. This chunk is finally getting to the meat of that stuff - infrastructure and most of the conversions to it. Some pieces are still sitting in the local branches, but the bulk of that stuff is here" * tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits) d_make_discardable(): warn if given a non-persistent dentry kill securityfs_recursive_remove() convert securityfs get rid of kill_litter_super() convert rust_binderfs convert nfsctl convert rpc_pipefs convert hypfs hypfs: swich hypfs_create_u64() to returning int hypfs: switch hypfs_create_str() to returning int hypfs: don't pin dentries twice convert gadgetfs gadgetfs: switch to simple_remove_by_name() convert functionfs functionfs: switch to simple_remove_by_name() functionfs: fix the open/removal races functionfs: need to cancel ->reset_work in ->kill_sb() functionfs: don't bother with ffs->ref in ffs_data_{opened,closed}() functionfs: don't abuse ffs_data_closed() on fs shutdown convert selinuxfs ...	2025-12-05 14:36:21 -08:00
Linus Torvalds	7203ca412f	Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki) Rework the vmalloc() code to support non-blocking allocations (GFP_ATOIC, GFP_NOWAIT) "ksm: fix exec/fork inheritance" (xu xin) Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited across fork/exec "mm/zswap: misc cleanup of code and documentations" (SeongJae Park) Some light maintenance work on the zswap code "mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira) Enhance the /sys/kernel/debug/page_owner debug feature by adding unique identifiers to differentiate the various stack traces so that userspace monitoring tools can better match stack traces over time "mm/page_alloc: pcp->batch cleanups" (Joshua Hahn) Minor alterations to the page allocator's per-cpu-pages feature "Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra) Address a scalability issue in userfaultfd's UFFDIO_MOVE operation "kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov) "drivers/base/node: fold node register and unregister functions" (Donet Tom) Clean up the NUMA node handling code a little "mm: some optimizations for prot numa" (Kefeng Wang) Cleanups and small optimizations to the NUMA allocation hinting code "mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn) Address long lock hold times at boot on large machines. These were causing (harmless) softlockup warnings "optimize the logic for handling dirty file folios during reclaim" (Baolin Wang) Remove some now-unnecessary work from page reclaim "mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park) Enhance the DAMOS auto-tuning feature "mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan) Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace configuration "expand mmap_prepare functionality, port more users" (Lorenzo Stoakes) Enhance the new(ish) file_operations.mmap_prepare() method and port additional callsites from the old ->mmap() over to ->mmap_prepare() "Fix stale IOTLB entries for kernel address space" (Lu Baolu) Fix a bug (and possible security issue on non-x86) in the IOMMU code. In some situations the IOMMU could be left hanging onto a stale kernel pagetable entry "mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang) Clean up and optimize the folio splitting code "mm, swap: misc cleanup and bugfix" (Kairui Song) Some cleanups and a minor fix in the swap discard code "mm/damon: misc documentation fixups" (SeongJae Park) "mm/damon: support pin-point targets removal" (SeongJae Park) Permit userspace to remove a specific monitoring target in the middle of the current targets list "mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo) A couple of cleanups related to mm header file inclusion "mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He) improve the selection of swap devices for NUMA machines "mm: Convert memory block states (MEM_) macros to enums" (Israel Batista) Change the memory block labels from macros to enums so they will appear in kernel debug info "ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes) Address an inefficiency when KSM unmerges an address range "mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park) Fix leaks and unhandled malloc() failures in DAMON userspace unit tests "some cleanups for pageout()" (Baolin Wang) Clean up a couple of minor things in the page scanner's writeback-for-eviction code "mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu) Move hugetlb's sysfs/sysctl handling code into a new file "introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes) Make the VMA guard regions available in /proc/pid/smaps and improves the mergeability of guarded VMAs "mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes) Reduce mmap lock contention for callers performing VMA guard region operations "vma_start_write_killable" (Matthew Wilcox) Start work on permitting applications to be killed when they are waiting on a read_lock on the VMA lock "mm/damon/tests: add more tests for online parameters commit" (SeongJae Park) Add additional userspace testing of DAMON's "commit" feature "mm/damon: misc cleanups" (SeongJae Park) "make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes) Address the possible loss of a VMA's VM_SOFTDIRTY flag when that VMA is merged with another "mm: support device-private THP" (Balbir Singh) Introduce support for Transparent Huge Page (THP) migration in zone device-private memory "Optimize folio split in memory failure" (Zi Yan) "mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang) Some more cleanups in the folio splitting code "mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes) Clean up our handling of pagetable leaf entries by introducing the concept of 'software leaf entries', of type softleaf_t "reparent the THP split queue" (Muchun Song) Reparent the THP split queue to its parent memcg. This is in preparation for addressing the long-standing "dying memcg" problem, wherein dead memcg's linger for too long, consuming memory resources "unify PMD scan results and remove redundant cleanup" (Wei Yang) A little cleanup in the hugepage collapse code "zram: introduce writeback bio batching" (Sergey Senozhatsky) Improve zram writeback efficiency by introducing batched bio writeback support "memcg: cleanup the memcg stats interfaces" (Shakeel Butt) Clean up our handling of the interrupt safety of some memcg stats "make vmalloc gfp flags usage more apparent" (Vishal Moola) Clean up vmalloc's handling of incoming GFP flags "mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang) Teach soft dirty and userfaultfd write protect tracking to use RISC-V's Svrsw60t59b extension "mm: swap: small fixes and comment cleanups" (Youngjun Park) Fix a small bug and clean up some of the swap code "initial work on making VMA flags a bitmap" (Lorenzo Stoakes) Start work on converting the vma struct's flags to a bitmap, so we stop running out of them, especially on 32-bit "mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park) Address a possible bug in the swap discard code and clean things up a little [ This merge also reverts commit `ebb9aeb980` ("vfio/nvgrace-gpu: register device memory for poison handling") because it looks broken to me, I've asked for clarification - Linus ] tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm: fix vma_start_write_killable() signal handling mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate mm/swapfile: fix list iteration when next node is removed during discard fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling mm/kfence: add reboot notifier to disable KFENCE on shutdown memcg: remove inc/dec_lruvec_kmem_state helpers selftests/mm/uffd: initialize char variable to Null mm: fix DEBUG_RODATA_TEST indentation in Kconfig mm: introduce VMA flags bitmap type tools/testing/vma: eliminate dependency on vma->__vm_flags mm: simplify and rename mm flags function for clarity mm: declare VMA flags by bit zram: fix a spelling mistake mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity mm/vmscan: skip increasing kswapd_failures when reclaim was boosted pagemap: update BUDDY flag documentation mm: swap: remove scan_swap_map_slots() references from comments mm: swap: change swap_alloc_slow() to void mm, swap: remove redundant comment for read_swap_cache_async mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational ...	2025-12-05 13:52:43 -08:00
Linus Torvalds	ac20755937	Merge tag 'sysctl-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Move jiffies converters out of kernel/sysctl.c Move the jiffies converters into kernel/time/jiffies.c and replace the pipe-max-size proc_handler converter with a macro based version. This is all part of the effort to relocate non-sysctl logic out of kernel/sysctl.c into more relevant subsystems. No functional changes. - Generalize proc handler converter creation Remove duplicated sysctl converter logic by consolidating it in macros. These are used inside sysctl core as well as in pipe.c and jiffies.c. Converter kernel and user space pointer args are now automatically const qualified for the convenience of the caller. No functional changes. - Miscellaneous Fix kernel-doc format warnings, remove unnecessary __user qualifiers, and move the nmi_watchdog sysctl into .rodata. - Testing This series was run through sysctl selftests/kunit test suite in x86_64. It went into linux-next after rc2, giving it a good 4/5 weeks of testing. * tag 'sysctl-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (21 commits) sysctl: Wrap do_proc_douintvec with the public function proc_douintvec_conv sysctl: Create pipe-max-size converter using sysctl UINT macros sysctl: Move proc_doulongvec_ms_jiffies_minmax to kernel/time/jiffies.c sysctl: Move jiffies converters to kernel/time/jiffies.c sysctl: Move UINT converter macros to sysctl header sysctl: Move INT converter macros to sysctl header sysctl: Allow custom converters from outside sysctl sysctl: remove __user qualifier from stack_erasing_sysctl buffer argument sysctl: Create macro for user-to-kernel uint converter sysctl: Add optional range checking to SYSCTL_UINT_CONV_CUSTOM sysctl: Create unsigned int converter using new macro sysctl: Add optional range checking to SYSCTL_INT_CONV_CUSTOM sysctl: Create integer converters with one macro sysctl: Create converter functions with two new macros sysctl: Discriminate between kernel and user converter params sysctl: Indicate the direction of operation with macro names sysctl: Remove superfluous __do_proc_* indirection sysctl: Remove superfluous tbl_data param from "dovec" functions sysctl: Replace void pointer with const pointer to ctl_table sysctl: fix kernel-doc format warning ...	2025-12-05 11:15:37 -08:00
Linus Torvalds	3ee37abbbd	Merge tag 'pstore-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore update from Kees Cook: - pstore/ram: Update module parameters from platform data (Tzung-Bi Shih) * tag 'pstore-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ram: Update module parameters from platform data	2025-12-05 09:08:13 -08:00
Linus Torvalds	5d45c729ed	Merge tag 'configfs-for-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/linux Pull configfs updates from Andreas Hindborg: "Two commits changing constness of the configfs vtable pointers. We plan to follow up with changes at call sites down the road" * tag 'configfs-for-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/linux: configfs: Constify ct_item_ops in struct config_item_type configfs: Constify ct_group_ops in struct config_item_type	2025-12-05 08:59:41 -08:00
Eric Sandeen	3e281113f8	9p: fix new mount API cache option handling After commit `4eb3117888`, 9p needs to be able to accept numerical cache= mount options as well as the string "shortcuts" because the option is printed numerically in /proc/mounts rather than by string. This was missed in the mount API conversion, which used an enum for the shortcuts and therefore could not handle a numeric equivalent as an argument to the cache option. Fix this by removing the enum and reverting to the slightly more open-coded option handling for Opt_cache, with the reinstated get_cache_mode() helper. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Message-ID: <48cdeec9-5bb9-4c7a-a203-39bb8e0ef443@redhat.com> Tested-by: Remi Pommarel <repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2025-12-05 12:54:05 +00:00
Eric Sandeen	f044561331	9p: fix cache/debug options printing in v9fs_show_options commit `4eb3117888` changed the cache= option to accept either string shortcuts or bitfield values. It also changed /proc/mounts to emit the option as the hexadecimal numeric value rather than the shortcut string. However, by printing "cache=%x" without the leading 0x, shortcuts such as "cache=loose" will emit "cache=f" and 'f' is not a string that is parseable by kstrtoint(), so remounting may fail if a remount with "cache=f" is attempted. debug=%x has had the same problem since options have been displayed in `c4fac91004` ("9p: Implement show_options") Fix these by adding the 0x prefix to the hexadecimal value shown in /proc/mounts. Fixes: `4eb3117888` ("fs/9p: Rework cache modes and add new options to Documentation") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Message-ID: <54b93378-dcf1-4b04-922d-c8b4393da299@redhat.com> [Dominique: use %#x at Al Viro's suggestion, also handle debug] Tested-by: Remi Pommarel <repk@triplefau.lt> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2025-12-05 12:53:16 +00:00
Ian Kent	b6cb3ccef6	autofs: fix per-dentry timeout warning The check that determines if the message that warns about the per-dentry timeout being greater than the super block timeout is not correct. The initial value for this field is -1 and the type of the field is unsigned long. I could change the type to long but the message is in the wrong place too, it should come after the timeout setting. So leave everything else as it is and move the message and check the timeout is actually set as an additional condition on issuing the message. Also fix the timeout comparison. Signed-off-by: Ian Kent <raven@themaw.net> Link: https://patch.msgid.link/20251111060439.19593-2-raven@themaw.net Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-12-04 09:57:51 +01:00
Rajasi Mandal	1ef15fbe67	cifs: client: enforce consistent handling of multichannel and max_channels Previously, the behavior of the multichannel and max_channels mount options was inconsistent and order-dependent. For example, specifying "multichannel,max_channels=1" would result in 2 channels, while "max_channels=1,multichannel" would result in 1 channel. Additionally, conflicting combinations such as "nomultichannel,max_channels=3" or "multichannel,max_channels=1" did not produce errors and could lead to unexpected channel counts. This commit introduces two new fields in smb3_fs_context to explicitly track whether multichannel and max_channels were specified during mount. The option parsing and validation logic is updated to ensure: - The outcome is no longer dependent on the order of options. - Conflicting combinations (e.g., "nomultichannel,max_channels=3" or "multichannel,max_channels=1") are detected and result in an error. - The number of channels created is consistent with the specified options. This improves the reliability and predictability of mount option handling for SMB3 multichannel support. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Rajasi Mandal <rajasimandal@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-12-03 23:32:44 -06:00
Linus Torvalds	559e608c46	Merge tag 'ntfs3_for_6.19' of https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 updates from Konstantin Komarov: "New code: - support timestamps prior to epoch - do not overwrite uptodate pages - disable readahead for compressed files - setting of dummy blocksize to read boot_block when mounting - the run_lock initialization when loading $Extend - initialization of allocated memory before use - support for the NTFS3_IOC_SHUTDOWN ioctl - check for minimum alignment when performing direct I/O reads - check for shutdown in fsync Fixes: - mount failure for sparse runs in run_unpack() - use-after-free of sbi->options in cmp_fnames - KMSAN uninit bug after failed mi_read in mi_format_new - uninit error after buffer allocation by __getname() - KMSAN uninit-value in ni_create_attr_list - double free of sbi->options->nls and ownership of fc->fs_private - incorrect vcn adjustments in attr_collapse_range() - mode update when ACL can be reduced to mode - memory leaks in add sub record Changes: - refactor code, updated terminology, spelling - do not kmap pages in (de)compression code - after ntfs_look_free_mft(), code that fails must put mft_inode - default mount options for "acl" and "prealloc" Replaced: - use unsafe_memcpy() to avoid memcpy size warning - ntfs_bio_pages with page cache for compressed files" * tag 'ntfs3_for_6.19' of https://github.com/Paragon-Software-Group/linux-ntfs3: (26 commits) fs/ntfs3: check for shutdown in fsync fs/ntfs3: change the default mount options for "acl" and "prealloc" fs/ntfs3: Prevent memory leaks in add sub record fs/ntfs3: out1 also needs to put mi fs/ntfs3: Fix spelling mistake "recommened" -> "recommended" fs/ntfs3: update mode in xattr when ACL can be reduced to mode fs/ntfs3: check minimum alignment for direct I/O fs/ntfs3: implement NTFS3_IOC_SHUTDOWN ioctl fs/ntfs3: correct attr_collapse_range when file is too fragmented ntfs3: fix double free of sbi->options->nls and clarify ownership of fc->fs_private fs/ntfs3: Initialize allocated memory before use fs/ntfs3: remove ntfs_bio_pages and use page cache for compressed I/O ntfs3: avoid memcpy size warning fs/ntfs3: fix KMSAN uninit-value in ni_create_attr_list ntfs3: init run lock for extend inode ntfs: set dummy blocksize to read boot_block when mounting fs/ntfs3: disable readahead for compressed files ntfs3: Fix uninit buffer allocated by __getname() ntfs3: fix uninit memory after failed mi_read in mi_format_new ntfs3: fix use-after-free of sbi->options in cmp_fnames ...	2025-12-03 20:45:43 -08:00
Linus Torvalds	fbeea4db51	Merge tag 'ext4_for_linus-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "New features and improvements for the ext4 file system: - Optimize online defragmentation by using folios instead of individual buffer heads - Improve error codes stored in the superblock when the journal aborts - Minor cleanups and clarifications in ext4_map_blocks() - Add documentation of the casefold and encrypt flags - Add support for file systems with a blocksize greater than the pagesize - Improve performance by enabling the caching the fact that an inode does not have a Posix ACL Various Bug Fixes: - Fix false positive complaints from smatch - Fix error code which is returned by ext4fs_dirhash() when Siphash is used without the encryption key - Fix races when writing to inline data files which could trigger a BUG - Fix potential NULL dereference when there is an corrupt file system with an extended attribute value stored in a inode - Fix false positive lockdep report when syzbot uses ext4 and ocfs2 together - Fix false positive reported by DEPT by adjusting lock annotation - Avoid a potential BUG_ON in jbd2 when a file system is massively corrupted - Fix a WARN_ON when superblock is corrupted with a non-NULL terminated mount options field - Add check if the userspace passes in a non-NULL terminated mount options field to EXT4_IOC_SET_TUNE_SB_PARAM - Fix a potential journal checksum failure whena file system is copied while it is mounted read-only - Fix a potential potential orphan file tracking error which only showed on 32-bit systems - Fix assertion checks in mballoc (which have to be explicitly enbled by manually enabling AGGRESSIVE_CHECKS and recompiling) - Avoid complaining about overly large orphan files created by mke2fs with with file systems with a 64k block size" * tag 'ext4_for_linus-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits) ext4: mark inodes without acls in __ext4_iget() ext4: enable block size larger than page size ext4: add checks for large folio incompatibilities when BS > PS ext4: support verifying data from large folios with fs-verity ext4: make data=journal support large block size ext4: support large block size in __ext4_block_zero_page_range() ext4: support large block size in mpage_prepare_extent_to_map() ext4: support large block size in mpage_map_and_submit_buffers() ext4: support large block size in ext4_block_write_begin() ext4: support large block size in ext4_mpage_readpages() ext4: rename 'page' references to 'folio' in multi-block allocator ext4: prepare buddy cache inode for BS > PS with large folios ext4: support large block size in ext4_mb_init_cache() ext4: support large block size in ext4_mb_get_buddy_page_lock() ext4: support large block size in ext4_mb_load_buddy_gfp() ext4: add EXT4_LBLK_TO_PG and EXT4_PG_TO_LBLK for block/page conversion ext4: add EXT4_LBLK_TO_B macro for logical block to bytes conversion ext4: support large block size in ext4_readdir() ext4: support large block size in ext4_calculate_overhead() ext4: introduce s_min_folio_order for future BS > PS support ...	2025-12-03 20:37:15 -08:00
Linus Torvalds	afcbce74f3	Merge tag 'gfs2-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Major withdraw / error handling overhaul based on dlm's new DLM_RELEASE_RECOVER feature: this allows gfs to treat withdraws like node failures. Make withdraws asynchronous - Fix a bug in commit `e4a8b5481c` that caused 'df' to remain out of sync. ('df' is still allowed to go slightly out of sync for short periods of time) - Prevent recusive memory reclaim in gfs2_unstuff_dinode() - Clean up SDF_JOURNAL_LIVE flag handling - Fix remote evict for read-only filesystems - Fix a misuse of bio_chain() - Various other minor cleanups * tag 'gfs2-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (35 commits) gfs2: Fix use of bio_chain gfs2: Clean up SDF_JOURNAL_LIVE flag handling gfs2: No longer thaw filesystems during a withdraw gfs2: Withdraw immediately in gfs2_trans_add_meta gfs2: New gfs2_withdraw_helper gfs2: Clean up properly during a withdraw gfs2: Rename gfs2_{gl_dq_holders => withdraw_glocks} Revert "gfs2: fix infinite loop when checking ail item count before go_inval" Revert "gfs2: Allow some glocks to be used during withdraw" Revert "gfs2: Check for log write errors before telling dlm to unlock" Revert "gfs2: fix a deadlock on withdraw-during-mount" Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (6/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (5/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (4/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (3/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (2/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (1/6) Revert "gfs2: don't stop reads while withdraw in progress" gfs2: Rename LM_FLAG_{NOEXP -> RECOVER} gfs2: Kill gfs2_io_error_bh_wd ...	2025-12-03 20:28:50 -08:00
Linus Torvalds	869737543b	Merge tag 'v6.19-rc-smb-fixes' of git://git.samba.org/ksmbd Pull smb client and server updates from Steve French: - server fixes: - IPC use after free locking fix - fix locking bug in delete paths - fix use after free in disconnect - fix underflow in locking check - error mapping improvement - socket listening improvement - return code mapping fixes - crypto improvements (use default libraries) - cleanup patches: - netfs - client checkpatch cleanup - server cleanup - move server/client duplicate code to common code - fix some defines to better match protocol specification - smbdirect (RDMA) fixes - client debugging improvements for leases * tag 'v6.19-rc-smb-fixes' of git://git.samba.org/ksmbd: (44 commits) cifs: Use netfs_alloc/free_folioq_buffer() smb: client: show smb lease key in open_dirs output smb: client: show smb lease key in open_files output ksmbd: ipc: fix use-after-free in ipc_msg_send_request smb: client: relax WARN_ON_ONCE(SMBDIRECT_SOCKET_) checks in recv_done() and smbd_conn_upcall() smb: server: relax WARN_ON_ONCE(SMBDIRECT_SOCKET_) checks in recv_done() and smb_direct_cm_handler() smb: smbdirect: introduce SMBDIRECT_CHECK_STATUS_{WARN,DISCONNECT}() smb: smbdirect: introduce SMBDIRECT_DEBUG_ERR_PTR() helper ksmbd: vfs: fix race on m_flags in vfs_cache ksmbd: Replace strcpy + strcat to improve convert_to_nt_pathname smb: move FILE_SYSTEM_ATTRIBUTE_INFO to common/fscc.h ksmbd: implement error handling for STATUS_INFO_LENGTH_MISMATCH in smb server ksmbd: fix use-after-free in ksmbd_tree_connect_put under concurrency ksmbd: server: avoid busy polling in accept loop smb: move create_durable_reconn to common/smb2pdu.h smb: fix some warnings reported by scripts/checkpatch.pl smb: do some cleanups smb: move FILE_SYSTEM_SIZE_INFO to common/fscc.h smb: move some duplicate struct definitions to common/fscc.h smb: move list of FileSystemAttributes to common/fscc.h ...	2025-12-03 20:23:41 -08:00
Linus Torvalds	3ed1c68307	Merge tag 'xfs-merge-6.19' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs updates from Carlos Maiolino: "There are no major changes in xfs. This contains mostly some code cleanups, a few bug fixes and documentation update. Highlights are: - Quota locking cleanup - Getting rid of old xlog_in_core_2_t type" * tag 'xfs-merge-6.19' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (33 commits) docs: remove obsolete links in the xfs online repair documentation xfs: move some code out of xfs_iget_recycle xfs: use zi more in xfs_zone_gc_mount xfs: remove the unused bv field in struct xfs_gc_bio xfs: remove xarray mark for reclaimable zones xfs: remove the xlog_in_core_t typedef xfs: remove l_iclog_heads xfs: remove the xlog_rec_header_t typedef xfs: remove xlog_in_core_2_t xfs: remove a very outdated comment from xlog_alloc_log xfs: cleanup xlog_alloc_log a bit xfs: don't use xlog_in_core_2_t in struct xlog_in_core xfs: add a on-disk log header cycle array accessor xfs: add a XLOG_CYCLE_DATA_SIZE constant xfs: reduce ilock roundtrips in xfs_qm_vop_dqalloc xfs: move xfs_dquot_tree calls into xfs_qm_dqget_cache_{lookup,insert} xfs: move quota locking into xrep_quota_item xfs: move quota locking into xqcheck_commit_dquot xfs: move q_qlock locking into xqcheck_compare_dquot xfs: move q_qlock locking into xchk_quota_item ...	2025-12-03 20:19:38 -08:00
Linus Torvalds	477e31fd1e	Merge tag 'erofs-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: - Fix a WARNING caused by a recent FSDAX misdetection regression - Fix the filesystem stacking limit for file-backed mounts - Print more informative diagnostics on decompression errors - Switch the on-disk definition `erofs_fs.h` to the MIT license - Minor cleanups * tag 'erofs-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: switch on-disk header `erofs_fs.h` to MIT license erofs: get rid of raw bi_end_io() usage erofs: enable error reporting for z_erofs_fixup_insize() erofs: enable error reporting for z_erofs_stream_switch_bufs() erofs: improve Zstd, LZMA and DEFLATE error strings erofs: improve decompression error reporting erofs: tidy up z_erofs_lz4_handle_overlap() erofs: limit the level of fs stacking for file-backed mounts erofs: correct FSDAX detection	2025-12-03 20:14:44 -08:00
Linus Torvalds	ca010e2ef6	Merge tag 'hfs-v6.19-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs Pull hfs/hfsplus updates from Viacheslav Dubeyko: "Several fixes for syzbot reported issues, HFS/HFS+ fixes of xfstests failures, Kunit-based unit-tests introduction, and code cleanup: - Dan Carpenter fixed a potential use-after-free issue in hfs_correct_next_unused_CNID() method. Tetsuo Handa has made nice fix of syzbot reported issue related to incorrect inode->i_mode management if volume has been corrupted somehow. Yang Chenzhi has made really good fix of potential race condition in __hfs_bnode_create() method for HFS+ file system. - Several fixes to xfstests failures. Particularly, generic/070, generic/073, and generic/101 test-cases finish successfully for the case of HFS+ file system right now. - HFS and HFS+ drivers share multiple structures of on-disk layout declarations. Some structures are used without any change. However, we had two independent declarations of the same structures in HFS and HFS+ drivers. The on-disk layout declarations have been moved into include/linux/hfs_common.h with the goal to exclude the declarations duplication and to keep the HFS/HFS+ on-disk layout declarations in one place. Also, this patch prepares the basis for creating a hfslib that can aggregate common functionality without necessity to duplicate the same code in HFS and HFS+ drivers. - HFS/HFS+ really need unit-tests because of multiple xfstests failures. The first two patches introduce Kunit-based unit-tests for the case string operations in HFS/HFS+ file system drivers" * tag 'hfs-v6.19-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs: hfs/hfsplus: move on-disk layout declarations into hfs_common.h hfsplus: fix volume corruption issue for generic/101 hfsplus: introduce KUnit tests for HFS+ string operations hfs: introduce KUnit tests for HFS string operations hfsplus: fix volume corruption issue for generic/073 hfsplus: Verify inode mode when loading from disk hfsplus: fix volume corruption issue for generic/070 hfs/hfsplus: prevent getting negative values of offset/length hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create hfs: fix potential use after free in hfs_correct_next_unused_CNID()	2025-12-03 20:08:32 -08:00
Linus Torvalds	7696286034	Merge tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Features: - shutdown ioctl support (needs CONFIG_BTRFS_EXPERIMENTAL for now): - set filesystem state as being shut down (also named going down in other filesystems), where all active operations return EIO and this cannot be changed until unmount - pending operations are attempted to be finished but error messages may still show up depending on where exactly the shutdown happened - scrub (and device replace) vs suspend/hibernate: - a running scrub will prevent suspend, which can be annoying as suspend is an immediate request and scrub is not critical - filesystem freezing before suspend was not sufficient as the problem was in process freezing - behaviour change: on suspend scrub and device replace are cancelled, where scrub can record the last state and continue from there; the device replace has to be restarted from the beginning - zone stats exported in sysfs, from the perspective of the filesystem this includes active, reclaimable, relocation etc zones Performance: - improvements when processing space reservation tickets by optimizing locking and shrinking critical sections, cumulative improvements in lockstat numbers show +15% Notable fixes: - use vmalloc fallback when allocating bios as high order allocations can happen with wide checksums (like sha256) - scrub will always track the last position of progress so it's not starting from zero after an error Core: - under experimental config, checksum calculations are offloaded to process context, simplifies locking and allows to remove compression write worker kthread(s): - speed improvement in direct IO throughput with buffered IO fallback is +15% when not offloaded but this is more related to internal crypto subsystem improvements - this will be probably default in the future removing the sysfs tunable - (experimental) block size > page size updates: - support more operations when not using large folios (encoded read/write and send) - raid56 - more preparations for fscrypt support Other: - more conversions to auto-cleaned variables - parameter cleanups and removals - extended warning fixes - improved printing of structured values like keys - lots of other cleanups and refactoring" * tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits) btrfs: remove unnecessary inode key in btrfs_log_all_parents() btrfs: remove redundant zero/NULL initializations in btrfs_alloc_root() btrfs: remaining BTRFS_PATH_AUTO_FREE conversions btrfs: send: do not allocate memory for xattr data when checking it exists btrfs: send: add unlikely to all unexpected overflow checks btrfs: reduce arguments to btrfs_del_inode_ref_in_log() btrfs: remove root argument from btrfs_del_dir_entries_in_log() btrfs: use test_and_set_bit() in btrfs_delayed_delete_inode_ref() btrfs: don't search back for dir inode item in INO_LOOKUP_USER btrfs: don't rewrite ret from inode_permission btrfs: add orig_logical to btrfs_bio for encryption btrfs: disable verity on encrypted inodes btrfs: disable various operations on encrypted inodes btrfs: remove redundant level reset in btrfs_del_items() btrfs: simplify leaf traversal after path release in btrfs_next_old_leaf() btrfs: optimize balance_level() path reference handling btrfs: factor out root promotion logic into promote_child_to_root() btrfs: raid56: remove the "_step" infix btrfs: raid56: enable bs > ps support btrfs: raid56: prepare finish_parity_scrub() to support bs > ps cases ...	2025-12-03 20:03:46 -08:00
Linus Torvalds	cc25df3e2e	Merge tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Fix head insertion for mq-deadline, a regression from when priority support was added - Series simplifying and improving the ublk user copy code - Various ublk related cleanups - Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the request is punted to a thread for handling - Merge and then later revert loop dio nowait support, as it ended up causing excessive stack usage for when the inline issue code needs to dip back into the full file system code - Improve auto integrity code, making it less deadlock prone - Speedup polled IO handling, but manually managing the hctx lookups - Fixes for blk-throttle for SSD devices - Small series with fixes for the S390 dasd driver - Add support for caching zones, avoiding unnecessary report zone queries - MD pull requests via Yu: - fix null-ptr-dereference regression for dm-raid0 - fix IO hang for raid5 when array is broken with IO inflight - remove legacy 1s delay to speed up system shutdown - change maintainer's email address - data can be lost if array is created with different lbs devices, fix this problem and record lbs of the array in metadata - fix rcu protection for md_thread - fix mddev kobject lifetime regression - enable atomic writes for md-linear - some cleanups - bcache updates via Coly - remove useless discard and cache device code - improve usage of per-cpu workqueues - Reorganize the IO scheduler switching code, fixing some lockdep reports as well - Improve the block layer P2P DMA support - Add support to the block tracing code for zoned devices - Segment calculation improves, and memory alignment flexibility improvements - Set of prep and cleanups patches for ublk batching support. The actual batching hasn't been added yet, but helps shrink down the workload of getting that patchset ready for 6.20 - Fix for how the ps3 block driver handles segments offsets - Improve how block plugging handles batch tag allocations - nbd fixes for use-after-free of the configuration on device clear/put - Set of improvements and fixes for zloop - Add Damien as maintainer of the block zoned device code handling - Various other fixes and cleanups * tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) block/rnbd: correct all kernel-doc complaints blk-mq: use queue_hctx in blk_mq_map_queue_type md: remove legacy 1s delay in md_notify_reboot md/raid5: fix IO hang when array is broken with IO inflight md: warn about updating super block failure md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid sbitmap: fix all kernel-doc warnings ublk: add helper of __ublk_fetch() ublk: pass const pointer to ublk_queue_is_zoned() ublk: refactor auto buffer register in ublk_dispatch_req() ublk: add `union ublk_io_buf` with improved naming ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() kfifo: add kfifo_alloc_node() helper for NUMA awareness blk-mq: fix potential uaf for 'queue_hw_ctx' blk-mq: use array manage hctx map instead of xarray ublk: prevent invalid access with DEBUG s390/dasd: Use scnprintf() instead of sprintf() s390/dasd: Move device name formatting into separate function s390/dasd: Remove unnecessary debugfs_create() return checks s390/dasd: Fix gendisk parent after copy pair swap ...	2025-12-03 19:26:18 -08:00
Linus Torvalds	0abcfd8983	Merge tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring updates from Jens Axboe: - Unify how task_work cancelations are detected, placing it in the task_work running state rather than needing to check the task state - Series cleaning up and moving the cancelation code to where it belongs, in cancel.c - Cleanup of waitid and futex argument handling - Add support for mixed sized SQEs. 6.18 added support for mixed sized CQEs, improving flexibility and efficiency of workloads that need big CQEs. This adds similar support for SQEs, where the occasional need for a 128b SQE doesn't necessitate having all SQEs be 128b in size - Introduce zcrx and SQ/CQ layout queries. The former returns what zcrx features are available. And both return the ring size information to help with allocation size calculation for user provided rings like IORING_SETUP_NO_MMAP and IORING_MEM_REGION_TYPE_USER - Zcrx updates for 6.19. It includes a bunch of small patches, IORING_REGISTER_ZCRX_CTRL and RQ flushing and David's work on sharing zcrx b/w multiple io_uring instances - Series cleaning up ring initializations, notable deduplicating ring size and offset calculations. It also moves most of the checking before doing any allocations, making the code simpler - Add support for getsockname and getpeername, which is mostly a trivial hookup after a bit of refactoring on the networking side - Various fixes and cleanups * tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (68 commits) io_uring: Introduce getsockname io_uring cmd socket: Split out a getsockname helper for io_uring socket: Unify getsockname and getpeername implementation io_uring/query: drop unused io_handle_query_entry() ctx arg io_uring/kbuf: remove obsolete buf_nr_pages and update comments io_uring/register: use correct location for io_rings_layout io_uring/zcrx: share an ifq between rings io_uring/zcrx: add io_fill_zcrx_offsets() io_uring/zcrx: export zcrx via a file io_uring/zcrx: move io_zcrx_scrub() and dependencies up io_uring/zcrx: count zcrx users io_uring/zcrx: add sync refill queue flushing io_uring/zcrx: introduce IORING_REGISTER_ZCRX_CTRL io_uring/zcrx: elide passing msg flags io_uring/zcrx: use folio_nr_pages() instead of shift operation io_uring/zcrx: convert to use netmem_desc io_uring/query: introduce rings info query io_uring/query: introduce zcrx query io_uring: move cq/sq user offset init around io_uring: pre-calculate scq layout ...	2025-12-03 18:58:57 -08:00
Chaitanya Kulkarni	76ee7fd6af	f2fs: ignore discard return value __blkdev_issue_discard() always returns 0, making the error assignment in __submit_discard_cmd() dead code. Initialize err to 0 and remove the error assignment from the __blkdev_issue_discard() call to err. Move fault injection code into already present if branch where err is set to -EIO. This preserves the fault injection behavior while removing dead error handling. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-12-04 02:00:06 +00:00
YH Lin	8d1cb17aca	f2fs: optimize trace_f2fs_write_checkpoint with enums This patch optimizes the tracepoint by replacing these hardcoded strings with a new enumeration f2fs_cp_phase. 1.Defines enum f2fs_cp_phase with values for each checkpoint phase. 2.Updates trace_f2fs_write_checkpoint to accept a u16 phase argument instead of a string pointer. 3.Uses __print_symbolic in TP_printk to convert the enum values back to their corresponding strings for human-readable trace output. This change reduces the storage overhead for each trace event by replacing a variable-length string with a 2-byte integer, while maintaining the same readable output in ftrace. Signed-off-by: YH Lin <yhli@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-12-04 02:00:06 +00:00
Chao Yu	37345eae9d	f2fs: fix to not account invalid blocks in get_left_section_blocks() w/ LFS mode, in get_left_section_blocks(), we should not account the blocks which were used before and now are invalided, otherwise those blocks will be counted as freed one in has_curseg_enough_space(), result in missing to trigger GC in time. Cc: stable@kernel.org Fixes: `249ad438e1` ("f2fs: add a method for calculating the remaining blocks in the current segment in LFS mode.") Fixes: `bf34c93d26` ("f2fs: check curseg space before foreground GC") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-12-04 02:00:05 +00:00
Chao Yu	8f11fe52fc	f2fs: support to show curseg.next_blkoff in debugfs cat /sys/kernel/debug/f2fs/status Main area: 17 segs, 17 secs 17 zones TYPE blkoff segno secno zoneno dirty_seg full_seg valid_blk - COLD data: 0 4 4 4 0 0 0 - WARM data: 0 7 7 7 0 0 0 - HOT data: 1 5 5 5 2 0 512 - Dir dnode: 3 0 0 0 1 0 2 - File dnode: 0 1 1 1 0 0 0 - Indir nodes: 0 2 2 2 0 0 0 - Pinned file: 0 -1 -1 -1 - ATGC data: 0 -1 -1 -1 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-12-04 02:00:05 +00:00

1 2 3 4 5 ...

103070 Commits