linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 18:25:26 -04:00

Author	SHA1	Message	Date
Mateusz Guzik	1ee889fdf4	f2fs: don't call iput() from f2fs_drop_inode() iput() calls the problematic routine, which does a ->i_count inc/dec cycle. Undoing it with iput() recurses into the problem. Note f2fs should not be playing games with the refcount to begin with, but that will be handled later. Right now solve the immediate regression. Fixes: `bc986b1d75` ("fs: stop accessing ->i_count directly in f2fs and gfs2") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202509301450.138b448f-lkp@intel.com Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-10-13 23:55:44 +00:00
Linus Torvalds	86d563ac5f	Merge tag 'f2fs-for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This focuses on two primary updates for Android devices. First, it sets hash-based file name lookup as the default method to improve performance, while retaining an option to fall back to a linear lookup. Second, it resolves a persistent issue with the 'checkpoint=enable' feature. The update further boosts performance by prefetching node blocks, merging FUA writes more efficiently, and optimizing block allocation policies. The release is rounded out by a comprehensive set of bug fixes that address memory safety, data integrity, and potential system hangs, along with minor documentation and code clean-ups. Enhancements: - add mount option and sysfs entry to tune the lookup mode - dump more information and add a timeout when enabling/disabling checkpoints - readahead node blocks in F2FS_GET_BLOCK_PRECACHE mode - merge FUA command with the existing writes - allocate HOT_DATA for IPU writes - Use allocate_section_policy to control write priority in multi-devices setups - add reserved nodes for privileged users - Add bggc_io_aware to adjust the priority of BG_GC when issuing IO - show the list of donation files Bug fixes: - add missing dput() when printing the donation list - fix UAF issue in f2fs_merge_page_bio() - add sanity check on ei.len in __update_extent_tree_range() - fix infinite loop in __insert_extent_tree() - fix zero-sized extent for precache extents - fix to mitigate overhead of f2fs_zero_post_eof_page() - fix to avoid migrating empty section - fix to truncate first page in error path of f2fs_truncate() - fix to update map->m_next_extent correctly in f2fs_map_blocks() - fix wrong layout information on 16KB page - fix to do sanity check on node footer for non inode dnode - fix to avoid NULL pointer dereference in f2fs_check_quota_consistency() - fix to detect potential corrupted nid in free_nid_list - fix 
to clear unusable_cap for checkpoint=enable - fix to zero data after EOF for compressed file correctly - fix to avoid overflow while left shift operation - fix condition in __allow_reserved_blocks()" * tag 'f2fs-for-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (43 commits) f2fs: add missing dput() when printing the donation list f2fs: fix UAF issue in f2fs_merge_page_bio() f2fs: readahead node blocks in F2FS_GET_BLOCK_PRECACHE mode f2fs: add sanity check on ei.len in __update_extent_tree_range() f2fs: fix infinite loop in __insert_extent_tree() f2fs: fix zero-sized extent for precache extents f2fs: fix to mitigate overhead of f2fs_zero_post_eof_page() f2fs: fix to avoid migrating empty section f2fs: fix to truncate first page in error path of f2fs_truncate() f2fs: fix to update map->m_next_extent correctly in f2fs_map_blocks() f2fs: fix wrong layout information on 16KB page f2fs: clean up error handing of f2fs_submit_page_read() f2fs: avoid unnecessary folio_clear_uptodate() for cleanup f2fs: merge FUA command with the existing writes f2fs: allocate HOT_DATA for IPU writes f2fs: Use allocate_section_policy to control write priority in multi-devices setups Documentation: f2fs: Reword title Documentation: f2fs: Indent compression_mode option list Documentation: f2fs: Wrap snippets in literal code blocks Documentation: f2fs: Span write hint table section rows ...	2025-10-03 14:05:12 -07:00
Linus Torvalds	56e7b31071	Merge tag 'vfs-6.18-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode updates from Christian Brauner: "This contains a series I originally wrote and that Eric brought over the finish line. It moves the i_crypt_info and i_verity_info pointers out of 'struct inode' and into the fs-specific part of the inode. So now the few filesystems that actually make use of this pay the price in their own private inode storage instead of forcing it upon every user of struct inode. The pointer for the crypt and verity info is simply found by storing an offset to its address in struct fsverity_operations and struct fscrypt_operations. This shrinks struct inode by 16 bytes. I hope to move a lot more out of it in the future so that struct inode becomes really just about very core stuff that we need, much like struct dentry and struct file, instead of the dumping ground it has become over the years. On top of this are various changes associated with the ongoing inode lifetime handling rework that multiple people are pushing forward: - Stop accessing inode->i_count directly in f2fs and gfs2. They simply should use the __iget() and iput() helpers - Make the i_state flags an enum - Rework the iput() logic Currently, if we are the last iput, and we have the I_DIRTY_TIME bit set, we will grab a reference on the inode again and then mark it dirty and then redo the put. 
This is to make sure we delay the time update for as long as possible We can rework this logic to simply dec i_count if it is not 1, and if it is do the time update while still holding the i_count reference Then we can replace the atomic_dec_and_lock with locking the ->i_lock and doing atomic_dec_and_test, since we did the atomic_add_unless above - Add an icount_read() helper and convert everyone that accesses inode->i_count directly for this purpose to use the helper - Expand dump_inode() to dump more information about an inode helping in debugging - Add some might_sleep() annotations to iput() and associated helpers" * tag 'vfs-6.18-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: add might_sleep() annotation to iput() and more fs: expand dump_inode() inode: fix whitespace issues fs: add an icount_read helper fs: rework iput logic fs: make the i_state flags an enum fs: stop accessing ->i_count directly in f2fs and gfs2 fsverity: check IS_VERITY() in fsverity_cleanup_inode() fs: remove inode::i_verity_info btrfs: move verity info pointer to fs-specific part of inode f2fs: move verity info pointer to fs-specific part of inode ext4: move verity info pointer to fs-specific part of inode fsverity: add support for info in fs-specific part of inode fs: remove inode::i_crypt_info ceph: move crypt info pointer to fs-specific part of inode ubifs: move crypt info pointer to fs-specific part of inode f2fs: move crypt info pointer to fs-specific part of inode ext4: move crypt info pointer to fs-specific part of inode fscrypt: add support for info in fs-specific part of inode fscrypt: replace raw loads of info pointer with helper function	2025-09-29 09:42:30 -07:00
Mateusz Guzik	f99b391778	fs: rename generic_delete_inode() and generic_drop_inode() generic_delete_inode() is rather misleading for what the routine is doing. inode_just_drop() should be much clearer. The new naming is inconsistent with generic_drop_inode(), so rename that one as well with inode_ as the suffix. No functional changes. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-15 16:09:42 +02:00
Liao Yuanhong	b639c20e74	f2fs: Use allocate_section_policy to control write priority in multi-devices setups Introduces two new sys nodes: allocate_section_hint and allocate_section_policy. The allocate_section_hint identifies the boundary between devices, measured in sections; it defaults to the end of the device for single storage setups, and the end of the first device for multiple storage setups. The allocate_section_policy determines the write strategy, with a default value of 0 for normal sequential write strategy. A value of 1 prioritizes writes before the allocate_section_hint, while a value of 2 prioritizes writes after it. This strategy addresses the issue where, despite F2FS supporting multiple devices, SOC vendors lack multi-devices support (currently only supporting zoned devices). As a workaround, multiple storage devices are mapped to a single dm device. Both this workaround and the F2FS multi-devices solution may require prioritizing writing to certain devices, such as a device with better performance or when switching is needed due to performance degradation near a device's end. For scenarios with more than two devices, sort them at mount time to utilize this feature. When using this feature with a single storage device, it has almost no impact. However, for configurations where multiple storage devices are mapped to the same dm device using F2FS, utilizing this feature can provide some optimization benefits. Therefore, I believe it should not be limited to just multi-devices usage. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-29 20:48:47 +00:00
Josef Bacik	bc986b1d75	fs: stop accessing ->i_count directly in f2fs and gfs2 Instead of accessing ->i_count directly in these file systems, use the appropriate __iget and iput helpers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/b8e6eb8a3e690ce082828d3580415bf70dfa93aa.1755806649.git.josef@toxicpanda.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-08-27 13:12:48 +02:00
Eric Biggers	1f66cef4a9	f2fs: move verity info pointer to fs-specific part of inode Move the fsverity_info pointer into the filesystem-specific part of the inode by adding the field f2fs_inode_info::i_verity_info and configuring fsverity_operations::inode_info_offs accordingly. This is a prerequisite for a later commit that removes inode::i_verity_info, saving memory and improving cache efficiency on filesystems that don't support fsverity. Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/20250810075706.172910-11-ebiggers@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-08-21 13:58:08 +02:00
Eric Biggers	7afb71ee92	f2fs: move crypt info pointer to fs-specific part of inode Move the fscrypt_inode_info pointer into the filesystem-specific part of the inode by adding the field f2fs_inode_info::i_crypt_info and configuring fscrypt_operations::inode_info_offs accordingly. This is a prerequisite for a later commit that removes inode::i_crypt_info, saving memory and improving cache efficiency with filesystems that don't support fscrypt. Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/20250810075706.172910-5-ebiggers@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-08-21 13:58:07 +02:00
Chao Yu	ff11d8701b	f2fs: fix to allow removing qf_name The mount behavior changed after commit `d185351325` ("f2fs: separate the options parsing and options checking"), let's fix it. [Scripts] mkfs.f2fs -f /dev/vdb mount -t f2fs -o usrquota /dev/vdb /mnt/f2fs quotacheck -uc /mnt/f2fs umount /mnt/f2fs mount -t f2fs -o usrjquota=aquota.user,jqfmt=vfsold /dev/vdb /mnt/f2fs mount|grep f2fs mount -t f2fs -o remount,usrjquota=,jqfmt=vfsold /dev/vdb /mnt/f2fs mount|grep f2fs dmesg [Before commit] mount#1: ...,quota,jqfmt=vfsold,usrjquota=aquota.user,... mount#2: ...,quota,jqfmt=vfsold,... kmsg: no output [After commit] mount#1: ...,quota,jqfmt=vfsold,usrjquota=aquota.user,... mount#2: ...,quota,jqfmt=vfsold,usrjquota=aquota.user,... kmsg: "user quota file already specified" [After patch] mount#1: ...,quota,jqfmt=vfsold,usrjquota=aquota.user,... mount#2: ...,quota,jqfmt=vfsold,... kmsg: "remove qf_name aquota.user" Fixes: `d185351325` ("f2fs: separate the options parsing and options checking") Cc: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-20 17:45:00 +00:00
Chao Yu	930a9a6ee8	f2fs: fix to avoid NULL pointer dereference in f2fs_check_quota_consistency() syzbot reported a f2fs bug as below: Oops: gen[ 107.736417][ T5848] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 UID: 0 PID: 5848 Comm: syz-executor263 Tainted: G W 6.17.0-rc1-syzkaller-00014-g0e39a731820a #0 PREEMPT_{RT,(full)} RIP: 0010:strcmp+0x3c/0xc0 lib/string.c:284 Call Trace: <TASK> f2fs_check_quota_consistency fs/f2fs/super.c:1188 [inline] f2fs_check_opt_consistency+0x1378/0x2c10 fs/f2fs/super.c:1436 __f2fs_remount fs/f2fs/super.c:2653 [inline] f2fs_reconfigure+0x482/0x1770 fs/f2fs/super.c:5297 reconfigure_super+0x224/0x890 fs/super.c:1077 do_remount fs/namespace.c:3314 [inline] path_mount+0xd18/0xfe0 fs/namespace.c:4112 do_mount fs/namespace.c:4133 [inline] __do_sys_mount fs/namespace.c:4344 [inline] __se_sys_mount+0x317/0x410 fs/namespace.c:4321 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f The direct reason is f2fs_check_quota_consistency() may suffer null-ptr-deref issue in strcmp(). The bug can be reproduced w/ below scripts: mkfs.f2fs -f /dev/vdb mount -t f2fs -o usrquota /dev/vdb /mnt/f2fs quotacheck -uc /mnt/f2fs/ umount /mnt/f2fs mount -t f2fs -o usrjquota=aquota.user,jqfmt=vfsold /dev/vdb /mnt/f2fs mount -t f2fs -o remount,usrjquota=,jqfmt=vfsold /dev/vdb /mnt/f2fs umount /mnt/f2fs So, before old_qname and new_qname comparison, we need to check whether they are all valid pointers, fix it. 
Reported-by: syzbot+d371efea57d5aeab877b@syzkaller.appspotmail.com Fixes: `d185351325` ("f2fs: separate the options parsing and options checking") Closes: https://lore.kernel.org/linux-f2fs-devel/689ff889.050a0220.e29e5.0037.GAE@google.com Cc: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-20 17:45:00 +00:00
Chunhai Guo	2141879369	f2fs: add reserved nodes for privileged users This patch allows privileged users to reserve nodes via the 'reserve_node' mount option, which is similar to the existing 'reserve_root' option. "-o reserve_node=<N>" means <N> nodes are reserved for privileged users only. Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-20 17:44:10 +00:00
Liao Yuanhong	00798cd24f	f2fs: Add bggc_io_aware to adjust the priority of BG_GC when issuing IO Currently, we have encountered some issues while testing ZUFS. In situations near the storage limit (e.g., 50GB remaining), and after simulating fragmentation by repeatedly writing and deleting data, we found that application installation and startup tests conducted after idling for a few minutes take significantly longer several times that of traditional UFS. Tracing the operations revealed that the majority of I/Os were issued by background GC, which blocks normal I/O operations. Under normal circumstances, ZUFS indeed requires more background GC and employs a more aggressive GC strategy. However, I aim to find a way to minimize the impact on regular I/O operations under these near-limit conditions. To address this, I have introduced a bggc_io_aware feature, which controls the prioritization of background GC in the presence of I/Os. This switch can be adjusted at the framework level to implement different strategies. If set to AWARE_ALL_IO, all background GC operations will be skipped during active I/O issuance. The default option remains consistent with the current strategy, ensuring no change in behavior. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-20 17:44:10 +00:00
Chao Yu	80b6d1d253	f2fs: dump more information for f2fs_{enable,disable}_checkpoint() Changes as below: - print more logs for f2fs_{enable,disable}_checkpoint() - account and dump time stats for f2fs_enable_checkpoint() Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-20 17:44:10 +00:00
Chao Yu	4bc3477796	f2fs: add timeout in f2fs_enable_checkpoint() During f2fs_enable_checkpoint() in remount(), if we flush a large amount of dirty pages to a slow device, it may take a long time, which will block write IO. Let's add a timeout mechanism during the dirty page flush to avoid blocking for a long time in f2fs_enable_checkpoint(). Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-20 17:44:09 +00:00
Chao Yu	2e8f4c2b2b	f2fs: fix to clear unusable_cap for checkpoint=enable mount -t f2fs -o checkpoint=disable:10% /dev/vdb /mnt/f2fs/ mount -t f2fs -o remount,checkpoint=enable /dev/vdb /mnt/f2fs/ kernel log: F2FS-fs (vdb): Adjust unusable cap for checkpoint=disable = 204440 / 10% If we have assigned the checkpoint=enable mount option, the unusable_cap{,_perc} parameters of checkpoint=disable should be reset, so that the calculation and log print can be avoided in adjust_unusable_cap_perc(). Fixes: `1ae18f71cb` ("f2fs: fix checkpoint=disable:%u%%") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-20 17:44:09 +00:00
Daniel Lee	632f0b6c3e	f2fs: add lookup_mode mount option For casefolded directories, f2fs may fall back to a linear search if a hash-based lookup fails. This can cause severe performance regressions. While this behavior can be controlled by userspace tools (e.g. mkfs, fsck) by setting an on-disk flag, a kernel-level solution is needed to guarantee the lookup behavior regardless of the on-disk state. This commit introduces the 'lookup_mode' mount option to provide this kernel-side control. The option accepts three values: - perf: (Default) Enforces a hash-only lookup. The linear fallback is always disabled. - compat: Enables the linear search fallback for compatibility with directory entries from older kernels. - auto: Determines the mode based on the on-disk flag, preserving the userspace-based behavior. Signed-off-by: Daniel Lee <chullee@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-08-11 17:03:27 +00:00
Jaegeuk Kim	078cad8212	f2fs: drop inode from the donation list when the last file is closed Let's drop the inode from the donation list when there is no other open file. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-30 17:13:12 +00:00
Hongbo Li	94b3ce7f15	f2fs: switch to the new mount api The new mount api will execute .parse_param, .init_fs_context, .get_tree and will call .remount if remount happened. So we add the necessary functions for the fs_context_operations. If .init_fs_context is added, the old .mount should be removed. See Documentation/filesystems/mount_api.rst for more information. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> [sandeen: forward port] Signed-off-by: Eric Sandeen <sandeen@redhat.com> [hongbo: context modified] Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-22 15:58:15 +00:00
Hongbo Li	bb463a75ab	f2fs: introduce fs_context_operation structure The handle_mount_opt() helper is used to parse mount parameters, and so we can rename this function to f2fs_parse_param() and set it as .parse_param in fs_context_operations. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> [sandeen: forward port] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-22 15:58:14 +00:00
Hongbo Li	d185351325	f2fs: separate the options parsing and options checking The new mount api separates option parsing and super block setup into two distinct steps and so we need to separate the options parsing out of the parse_options(). In order to achieve this, here we handle the mount options with three steps: - Firstly, we move sb/sbi out of handle_mount_opt. As the former patch introduced f2fs_fs_context, we record the changed mount options in this context. In handle_mount_opt, sb/sbi is null, so we should move all related code out of handle_mount_opt (thus, some checks which use sb/sbi should move out). - Secondly, we introduce some check helpers to keep the options consistent. While filling the superblock, sb/sbi are ready. So we check the f2fs_fs_context, which holds the mount options, based on sb/sbi. - Thirdly, we apply the new mount options to sb/sbi. After checking the f2fs_fs_context, all changes to mount options are valid. So we can apply them to sb/sbi directly. After doing these, option parsing and super block setting have been decoupled. Also it should have retained the original execution flow. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> [sandeen: forward port, minor fixes and updates] Signed-off-by: Eric Sandeen <sandeen@redhat.com> [hongbo: minor fixes] Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-22 15:58:14 +00:00
Hongbo Li	1a9094b10c	f2fs: Add f2fs_fs_context to record the mount options At the parsing phase of mount in the new mount api, option values will be recorded in the context, and then they will be used in fill_super and other helpers. Note that this is a temporary status; we want to remove the sb and sbi usages in handle_mount_opt. So here the f2fs_fs_context only records the mount options, which will be copied into sb/sbi in a later process. (At this point in the series, mount options are temporarily not set during mount.) Signed-off-by: Hongbo Li <lihongbo22@huawei.com> [sandeen: forward port, minor fixes and updates] Signed-off-by: Eric Sandeen <sandeen@redhat.com> [hongbo: minor cleanup] Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-22 15:58:14 +00:00
Hongbo Li	19c4b380f2	f2fs: Allow sbi to be NULL in f2fs_printk At the parsing phase of the new mount api, sbi will not be available. So here allows sbi to be NULL in f2fs log helpers and use that in handle_mount_opt(). Signed-off-by: Hongbo Li <lihongbo22@huawei.com> [sandeen: forward port] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-22 15:58:14 +00:00
Hongbo Li	02eb5fe42a	f2fs: move the option parser into handle_mount_opt In handle_mount_opt, we use fs_parameter to parse each option. However we're still using the old API to get the options string. Using fsparams parse_options allows us to remove many of the Opt_ enums, so remove them. The checkpoint disable cap (or percent) involves rather complex parsing; we retain the old match_table mechanism for this, which handles it well. There are some changes about parsing options: 1. For `active_logs`, `inline_xattr_size` and `fault_injection`, we use s32 type according the internal structure to record the option's value. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> [sandeen: forward port, minor fixes and updates] Signed-off-by: Eric Sandeen <sandeen@redhat.com> [hongbo: minor cleanup] Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-22 15:58:14 +00:00
Hongbo Li	f2091cc188	f2fs: Add fs parameter specifications for mount options Use an array of `fs_parameter_spec` called f2fs_param_specs to hold the mount option specifications for the new mount api. Add constant_table structures for several options to facilitate parsing. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> [sandeen: forward port, minor fixes and updates, more fsparam_enum] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-22 15:58:13 +00:00
Jiazi Li	e9705c61b1	f2fs: use kfree() instead of kvfree() to free some memory options in f2fs_fill_super is alloc by kstrdup: options = kstrdup((const char *)data, GFP_KERNEL) sit_bitmap[_mir], nat_bitmap[_mir] are alloc by kmemdup: sit_i->sit_bitmap = kmemdup(src_bitmap, sit_bitmap_size, GFP_KERNEL); sit_i->sit_bitmap_mir = kmemdup(src_bitmap, sit_bitmap_size, GFP_KERNEL); nm_i->nat_bitmap = kmemdup(version_bitmap, nm_i->bitmap_size, GFP_KERNEL); nm_i->nat_bitmap_mir = kmemdup(version_bitmap, nm_i->bitmap_size, GFP_KERNEL); write_io is alloc by f2fs_kmalloc: sbi->write_io[i] = f2fs_kmalloc(sbi, array_size(n, sizeof(struct f2fs_bio_info)) Use kfree is more efficient. Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com> Signed-off-by: peixuan.qiu <peixuan.qiu@transsion.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-07-09 17:59:39 +00:00
Swarna Prabhu	1f13689026	f2fs: Fix the typos in comments This patch fixes minor typos in comments in f2fs. Signed-off-by: Swarna Prabhu <s.prabhu@samsung.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-06-24 21:34:37 +00:00
Chao Yu	59c1c89e9b	f2fs: introduce reserved_pin_section sysfs entry This patch introduces /sys/fs/f2fs/<dev>/reserved_pin_section for tuning @needed parameter of has_not_enough_free_secs(), if we configure it w/ zero, it can avoid f2fs_gc() as much as possible while fallocating on pinned file. Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: wangzijie <wangzijie1@honor.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-06-23 22:13:02 +00:00
Sheng Yong	554d9b7242	f2fs: fix bio memleak when committing super block When committing new super block, bio is allocated but not freed, and kmemleak complains: unreferenced object 0xffff88801d185600 (size 192): comm "kworker/3:2", pid 128, jiffies 4298624992 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 80 67 c3 00 81 88 ff ff .........g...... 01 08 06 00 00 00 00 00 00 00 00 00 01 00 00 00 ................ backtrace (crc 650ecdb1): kmem_cache_alloc_noprof+0x3a9/0x460 mempool_alloc_noprof+0x12f/0x310 bio_alloc_bioset+0x1e2/0x7e0 __f2fs_commit_super+0xe0/0x370 f2fs_commit_super+0x4ed/0x8c0 f2fs_record_error_work+0xc7/0x190 process_one_work+0x7db/0x1970 worker_thread+0x518/0xea0 kthread+0x359/0x690 ret_from_fork+0x34/0x70 ret_from_fork_asm+0x1a/0x30 The issue can be reproduced by: mount /dev/vda /mnt i=0 while :; do echo '[h]abc' > /sys/fs/f2fs/vda/extension_list echo '[h]!abc' > /sys/fs/f2fs/vda/extension_list echo scan > /sys/kernel/debug/kmemleak dmesg | grep "new suspected memory leaks" [ $? -eq 0 ] && break i=$((i + 1)) echo "$i" done umount /mnt Fixes: `5bcde45578` ("f2fs: get rid of buffer_head use") Signed-off-by: Sheng Yong <shengyong1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-06-23 22:13:01 +00:00
Zhiguo Niu	a6c397a31f	f2fs: use d_inode(dentry) cleanup dentry->d_inode no logic changes. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-05-28 16:05:22 +00:00
Chao Yu	54ca9be0bc	f2fs: introduce FAULT_VMALLOC Introduce a new fault type FAULT_VMALLOC to simulate no memory error in f2fs_vmalloc(). Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-05-27 23:52:36 +00:00
Chao Yu	5827e3c720	f2fs: add f2fs_bug_on() in f2fs_quota_read() mapping_read_folio_gfp() will return a folio, which should always be uptodate; let's check the folio uptodate status to detect any potential bug. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-05-27 23:52:35 +00:00
Eric Biggers	d005af3b67	f2fs: remove unused sbi argument from checksum functions Since __f2fs_crc32() now calls crc32() directly, it no longer uses its sbi argument. Remove that, and simplify its callers accordingly. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-05-27 23:52:35 +00:00
Chao Yu	a920196062	f2fs: don't over-report free space or inodes in statvfs This fixes an analogous bug that was fixed in modern filesystems: a) xfs in commit `4b8d867ca6` ("xfs: don't over-report free space or inodes in statvfs") b) ext4 in commit `f87d3af741` ("ext4: don't over-report free space or inodes in statvfs") where statfs can report misleading / incorrect information where project quota is enabled, and the free space is less than the remaining quota. This commit will resolve a test failure in generic/762 which tests for this bug. generic/762 - output mismatch (see /share/git/fstests/results//generic/762.out.bad) --- tests/generic/762.out 2025-04-15 10:21:53.371067071 +0800 +++ /share/git/fstests/results//generic/762.out.bad 2025-05-13 16:13:37.000000000 +0800 @@ -6,8 +6,10 @@ root blocks2 is in range dir blocks2 is in range root bavail2 is in range -dir bavail2 is in range +dir bavail2 has value of 1539066 +dir bavail2 is NOT in range 304734.87 .. 310891.13 root blocks3 is in range ... (Run 'diff -u /share/git/fstests/tests/generic/762.out /share/git/fstests/results//generic/762.out.bad' to see the entire diff) HINT: You _MAY_ be missing kernel fix: XXXXXXXXXXXXXX xfs: don't over-report free space or inodes in statvfs Cc: stable@kernel.org Fixes: `ddc34e328d` ("f2fs: introduce f2fs_statfs_project") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-05-13 15:32:41 +00:00
Kairui Song	0427e811c9	f2fs: drop usage of folio_index folio_index is only needed for mixed usage of page cache and swap cache, for pure page cache usage, the caller can just use folio->index instead. It can't be a swap cache folio here. Swap mapping may only call into fs through `swap_rw` but f2fs does not use that method for swap. Signed-off-by: Kairui Song <kasong@tencent.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> (maintainer:F2FS FILE SYSTEM) Cc: Chao Yu <chao@kernel.org> (maintainer:F2FS FILE SYSTEM) Cc: linux-f2fs-devel@lists.sourceforge.net (open list:F2FS FILE SYSTEM) Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-05-06 15:46:55 +00:00
Chao Yu	0244c77fed	f2fs: support FAULT_TIMEOUT Support to inject a timeout fault into function, currently it only support to inject timeout to commit_atomic_write flow to reproduce inconsistent bug, like the bug fixed by commit `f098aeba04` ("f2fs: fix to avoid atomicity corruption of atomic file"). By default, the new type fault will inject 1000ms timeout, and the timeout process can be interrupted by SIGKILL. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-05-06 15:46:55 +00:00
Chao Yu	dc6d9ef57f	f2fs: zone: fix to calculate first_zoned_segno correctly A zoned device can have both conventional zones and sequential zones, so we should not treat first segment of zoned device as first_zoned_segno, instead, we need to check zone type for each zone during traversing zoned device to find first_zoned_segno. Otherwise, for below case, first_zoned_segno will be 0, which could be wrong. create_null_blk 512 2 1024 1024 mkfs.f2fs -m /dev/nullb0 Testcase: export SCRIPTS_PATH=/share/git/scripts test multiple devices w/ zoned device for ((i=0;i<8;i++)) do { zonesize=$((2<<$i)) conzone=$((4096/$zonesize)) seqzone=$((4096/$zonesize)) $SCRIPTS_PATH/nullblk_create.sh 512 $zonesize $conzone $seqzone mkfs.f2fs -f -m /dev/vdb -c /dev/nullb0 mount /dev/vdb /mnt/f2fs touch /mnt/f2fs/file f2fs_io pinfile set /mnt/f2fs/file $((8589934592*2)) stat /mnt/f2fs/file df cat /proc/fs/f2fs/vdb/segment_info umount /mnt/f2fs $SCRIPTS_PATH/nullblk_remove.sh 0 } done test single zoned device for ((i=0;i<8;i++)) do { zonesize=$((2<<$i)) conzone=$((4096/$zonesize)) seqzone=$((4096/$zonesize)) $SCRIPTS_PATH/nullblk_create.sh 512 $zonesize $conzone $seqzone mkfs.f2fs -f -m /dev/nullb0 mount /dev/nullb0 /mnt/f2fs touch /mnt/f2fs/file f2fs_io pinfile set /mnt/f2fs/file $((8589934592*2)) stat /mnt/f2fs/file df cat /proc/fs/f2fs/nullb0/segment_info umount /mnt/f2fs $SCRIPTS_PATH/nullblk_remove.sh 0 } done Fixes: `9703d69d9d` ("f2fs: support file pinning for zoned devices") Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-04-28 15:26:48 +00:00
Chao Yu	5db0d252c6	f2fs: fix to do sanity check on sit_bitmap_size w/ below testcase, resize will generate a corrupted image which contains inconsistent metadata, so when mounting such image, it will trigger kernel panic: touch img truncate -s $((512*1024*1024*1024)) img mkfs.f2fs -f img $((256*1024*1024)) resize.f2fs -s -i img -t $((1024*1024*1024)) mount img /mnt/f2fs ------------[ cut here ]------------ kernel BUG at fs/f2fs/segment.h:863! Oops: invalid opcode: 0000 [#1] SMP PTI CPU: 11 UID: 0 PID: 3922 Comm: mount Not tainted 6.15.0-rc1+ #191 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:f2fs_ra_meta_pages+0x47c/0x490 Call Trace: f2fs_build_segment_manager+0x11c3/0x2600 f2fs_fill_super+0xe97/0x2840 mount_bdev+0xf4/0x140 legacy_get_tree+0x2b/0x50 vfs_get_tree+0x29/0xd0 path_mount+0x487/0xaf0 __x64_sys_mount+0x116/0x150 do_syscall_64+0x82/0x190 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fdbfde1bcfe The reaseon is: sit_i->bitmap_size is 192, so size of sit bitmap is 192*8=1536, at maximum there are 1536 sit blocks, however MAIN_SEGS is 261893, so that sit_blk_cnt is 4762, build_sit_entries() -> current_sit_addr() tries to access out-of-boundary in sit_bitmap at offset from [1536, 4762), once sit_bitmap and sit_bitmap_mirror is not the same, it will trigger f2fs_bug_on(). Let's add sanity check in f2fs_sanity_check_ckpt() to avoid panic. Cc: stable@vger.kernel.org Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-04-28 15:26:48 +00:00
Matthew Wilcox (Oracle)	0d1e687e43	f2fs: Use a folio in f2fs_quota_read() Support arbitrary size folios and remove a few hidden calls to compound_head(). Also remove an unnecessary test of the uptodaate flag; if mapping_read_folio_gfp() cannot bring the folio uptodate, it will return an error. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-04-28 15:21:35 +00:00
Chao Yu	db03c20c08	f2fs: fix to set atomic write status more clear 1. After we start atomic write in a database file, before committing all data, we'd better not set inode w/ vfs dirty status to avoid redundant updates, instead, we only set inode w/ atomic dirty status. 2. After we commit all data, before committing metadata, we need to clear atomic dirty status, and set vfs dirty status to allow vfs flush dirty inode. Cc: Daeho Jeong <daehojeong@google.com> Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-04-10 03:59:58 +00:00
Chao Yu	2be96c2147	f2fs: fix to update injection attrs according to fault_option When we update inject type via sysfs, it shows wrong rate value as below, there is a same problem when we update inject rate, fix it. Before: F2FS-fs (vdd): build fault injection attr: rate: 0, type: 0xffff F2FS-fs (vdd): build fault injection attr: rate: 1, type: 0x0 After: F2FS-fs (vdd): build fault injection type: 0x1 F2FS-fs (vdd): build fault injection rate: 1 Meanwhile, let's avoid turning on all fault types when we enable fault injection via fault_injection mount option, it will lead to shutdown filesystem or fail the mount() easily. mount -o fault_injection=4 /dev/vdd /mnt/f2fs F2FS-fs (vdd): build fault injection attr: rate: 4, type: 0x7fffff F2FS-fs (vdd): inject kmalloc in f2fs_kmalloc of f2fs_fill_super+0xbdf/0x27c0 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-04-10 03:59:58 +00:00
Chao Yu	e073e92789	f2fs: add a proc entry show inject stats This patch adds a proc entry named inject_stats to show total injected count for each fault type. cat /proc/fs/f2fs/<dev>/inject_stats fault_type injected_count kmalloc 0 kvmalloc 0 page alloc 0 page get 0 alloc bio(obsolete) 0 alloc nid 0 orphan 0 no more block 0 too big dir depth 0 evict_inode fail 0 truncate fail 0 read IO error 0 checkpoint error 0 discard error 0 write IO error 0 slab alloc 0 dquot initialize 0 lock_op 0 invalid blkaddr 0 inconsistent blkaddr 0 no free segment 0 inconsistent footer 0 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-04-10 03:59:57 +00:00
Yeongjin Gil	f098aeba04	f2fs: fix to avoid atomicity corruption of atomic file In the case of the following call stack for an atomic file, FI_DIRTY_INODE is set, but FI_ATOMIC_DIRTIED is not subsequently set. f2fs_file_write_iter f2fs_map_blocks f2fs_reserve_new_blocks inc_valid_block_count __mark_inode_dirty(dquot) f2fs_dirty_inode If FI_ATOMIC_DIRTIED is not set, atomic file can encounter corruption due to a mismatch between old file size and new data. To resolve this issue, I changed to set FI_ATOMIC_DIRTIED when FI_DIRTY_INODE is set. This ensures that FI_DIRTY_INODE, which was previously cleared by the Writeback thread during the commit atomic, is set and i_size is updated. Cc: <stable@vger.kernel.org> Fixes: `fccaa81de8` ("f2fs: prevent atomic file from being dirtied before commit") Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com> Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com> Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-17 17:38:33 +00:00
Eric Sandeen	71e9bd3d5c	f2fs: pass sbi rather than sb to parse_options() With the new mount API the sb will not be available during initial option parsing, which will happen before fill_super reads sb from disk. Now that the sb is no longer directly referenced in parse_options, switch it to use sbi. (Note that all calls to f2fs_sb_has_* originating from parse_options will need to be deferred to later before we can use the new mount API.) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:07 +00:00
Eric Sandeen	b7de231b9d	f2fs: pass sbi rather than sb to quota qf_name helpers With the new mount api we will not have the superblock available during option parsing. Prepare for this by passing sbi rather than sb. For now, we are parsing after fill_super has been done, so sbi->sb will exist. Under the new mount API this will require more care, but do the simple change for now. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:06 +00:00
Eric Sandeen	9cca498759	f2fs: defer readonly check vs norecovery Defer the readonly-vs-norecovery check until after option parsing is done so that option parsing does not require an active superblock for the test. Add a helpful message, while we're at it. (I think could be moved back into parsing after we switch to the new mount API if desired, as the fs context will have RO state available.) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:06 +00:00
Eric Sandeen	0edcb2197e	f2fs: Pass sbi rather than sb to f2fs_set_test_dummy_encryption This removes another sb instance from parse_options() Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:06 +00:00
Eric Sandeen	9100adf326	f2fs: make LAZYTIME a mount option flag Set LAZYTIME into sbi during parsing, and transfer it to the sb in fill_super, so that an sb is not required during option parsing. (Note: While lazytime is normally handled via mount flag in the vfs, some f2fs users do expect to be able to use it as an explicit mount option string via the mount syscall, so this option must remain.) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:06 +00:00
Eric Sandeen	7d6ee50330	f2fs: make INLINECRYPT a mount option flag Set INLINECRYPT into sbi during parsing, and transfer it to the sb in fill_super, so that an sb is not required during option parsing. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:06 +00:00
Eric Sandeen	abd0e040e9	f2fs: factor out an f2fs_default_check function The current options parsing function both parses options and validates them - factor the validation out to reduce the size of the function and make transition to the new mount API possible, because under the new mount API, options are parsed one at a time, and cannot all be tested at the end of the parsing function. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:06 +00:00
Eric Sandeen	277352b6cb	f2fs: consolidate unsupported option handling errors When certain build-time options are disabled, some mount options are not accepted. For quota and compression, all related options are dismissed with a single error message. For xattr, acl, and fault injection, each option is handled individually. In addition, inline_xattr_size was missed when CONFIG_F2FS_FS_XATTR was disabled. Collapse xattr, acl, and fault injection errors into a single string, for simplicity, and handle the missing inline_xattr_size case. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2025-03-13 18:16:06 +00:00

891 Commits