Commit Graph

1280696 Commits

Author SHA1 Message Date
Jan Kara
e3a00a2378 jbd2: precompute number of transaction descriptor blocks
Instead of computing the number of descriptor blocks a transaction can
have each time we need it (which is currently when starting each
transaction but will become more frequent later) precompute the number
once during journal initialization together with maximum transaction
size. We perform the precomputation whenever journal feature set is
updated similarly as for computation of
journal->j_revoke_records_per_block.

CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08 23:59:37 -04:00
Jan Kara
4aa99c71e4 jbd2: make jbd2_journal_get_max_txn_bufs() internal
There's no reason to have jbd2_journal_get_max_txn_bufs() public
function. Currently all users are internal and can use
journal->j_max_transaction_buffers instead. This saves some unnecessary
recomputations of the limit as a bonus which becomes important as this
function gets more complex in the following patch.

CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08 23:59:37 -04:00
Ye Bin
0bab8db415 jbd2: avoid mount failed when commit block is partial submitted
We encountered a problem that the file system could not be mounted in
the power-off scenario. The analysis of the file system mirror shows that
only part of the data is written to the last commit block.
The valid data of the commit block is concentrated in the first sector.
However, the data of the entire block is involved in the checksum calculation.
For different hardware, the minimum atomic unit may be different.
If the checksum of a committed block is incorrect, clear the data except the
'commit_header' and then calculate the checksum. If the checkusm is correct,
it is considered that the block is partially committed, Then continue to replay
journal.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240620072405.3533701-1-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08 23:59:37 -04:00
Jan Kara
65121eff3e ext4: avoid writing unitialized memory to disk in EA inodes
If the extended attribute size is not a multiple of block size, the last
block in the EA inode will have uninitialized tail which will get
written to disk. We will never expose the data to userspace but still
this is not a good practice so just zero out the tail of the block as it
isn't going to cause a noticeable performance overhead.

Fixes: e50e5129f3 ("ext4: xattr-in-inode support")
Reported-by: syzbot+9c1fe13fcb51574b249b@syzkaller.appspotmail.com
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240613150234.25176-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08 23:59:37 -04:00
Luis Henriques (SUSE)
7882b0187b ext4: don't track ranges in fast_commit if inode has inlined data
When fast-commit needs to track ranges, it has to handle inodes that have
inlined data in a different way because ext4_fc_write_inode_data(), in the
actual commit path, will attempt to map the required blocks for the range.
However, inodes that have inlined data will have it's data stored in
inode->i_block and, eventually, in the extended attribute space.

Unfortunately, because fast commit doesn't currently support extended
attributes, the solution is to mark this commit as ineligible.

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039883
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Tested-by: Ben Hutchings <benh@debian.org>
Fixes: 9725958bb7 ("ext4: fast commit may miss tracking unwritten range during ftruncate")
Link: https://patch.msgid.link/20240618144312.17786-1-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08 23:59:37 -04:00
Luis Henriques (SUSE)
63469662cc ext4: fix possible tid_t sequence overflows
In the fast commit code there are a few places where tid_t variables are
being compared without taking into account the fact that these sequence
numbers may wrap.  Fix this issue by using the helper functions tid_gt()
and tid_geq().

Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://patch.msgid.link/20240529092030.9557-3-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08 23:59:35 -04:00
Luis Henriques (SUSE)
2d4d6bda0f ext4: use ext4_update_inode_fsync_trans() helper in inode creation
Call helper function ext4_update_inode_fsync_trans() instead of open
coding it in __ext4_new_inode().  This helper checks both that the handle
is valid *and* that it hasn't been aborted due to some fatal error in the
journalling layer, using is_handle_aborted().

Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Link: https://patch.msgid.link/20240527161447.21434-1-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05 16:48:54 -04:00
Jeff Johnson
7378e8991a ext4: add missing MODULE_DESCRIPTION()
Fix the 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/ext4/ext4-inode-test.o

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240527-md-fs-ext4-v1-1-07aad5936bb1@quicinc.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05 16:07:24 -04:00
Jeff Johnson
89a8718cef jbd2: add missing MODULE_DESCRIPTION()
Fix the 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/jbd2/jbd2.o

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://patch.msgid.link/20240526-md-fs-jbd2-v1-1-7bba6665327d@quicinc.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05 13:26:19 -04:00
Kees Cook
be27cd6446 ext4: use memtostr_pad() for s_volume_name
As with the other strings in struct ext4_super_block, s_volume_name is
not NUL terminated. The other strings were marked in commit 072ebb3bff
("ext4: add nonstring annotations to ext4.h"). Using strscpy() isn't
the right replacement for strncpy(); it should use memtostr_pad()
instead.

Reported-by: syzbot+50835f73143cc2905b9e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/00000000000019f4c00619192c05@google.com/
Fixes: 744a56389f ("ext4: replace deprecated strncpy with alternatives")
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://patch.msgid.link/20240523225408.work.904-kees@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05 13:14:51 -04:00
Zhang Yi
7c73ddb758 jbd2: speed up jbd2_transaction_committed()
jbd2_transaction_committed() is used to check whether a transaction with
the given tid has already committed, it holds j_state_lock in read mode
and check the tid of current running transaction and committing
transaction, but holding the j_state_lock is expensive.

We have already stored the sequence number of the most recently
committed transaction in journal t->j_commit_sequence, we could do this
check by comparing it with the given tid instead. If the given tid isn't
smaller than j_commit_sequence, we can ensure that the given transaction
has been committed. That way we could drop the expensive lock and
achieve about 10% ~ 20% performance gains in concurrent DIOs on may
virtual machine with 100G ramdisk.

fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
    -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
    -group_reporting

Before:
  overwrite       IOPS=88.2k, BW=344MiB/s
  read            IOPS=95.7k, BW=374MiB/s
  rand overwrite  IOPS=98.7k, BW=386MiB/s
  randread        IOPS=102k, BW=397MiB/s

After:
  overwrite       IOPS=105k, BW=410MiB/s
  read            IOPS=112k, BW=436MiB/s
  rand overwrite  IOPS=104k, BW=404MiB/s
  randread        IOPS=111k, BW=432MiB/s

CC: Dave Chinner <david@fromorbit.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Link: https://lore.kernel.org/linux-ext4/ZjILCPNZRHeazSqV@dread.disaster.area/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/20240520131831.2910790-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05 12:45:36 -04:00
Zhang Yi
8262fe9a90 ext4: make ext4_da_map_blocks() buffer_head unaware
After calling the ext4_da_map_blocks(), a delalloc extent state could
be identified through the EXT4_MAP_DELAYED flag in map. So factor out
buffer_head related handles in ext4_da_map_blocks(), make this function
buffer_head unaware and becomes a common helper, and also update the
stale function commtents, preparing for the iomap da write path in the
future.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-11-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:50 -04:00
Zhang Yi
1850d76c1b ext4: make ext4_insert_delayed_block() insert multi-blocks
Rename ext4_insert_delayed_block() to ext4_insert_delayed_blocks(),
pass length parameter to make it insert multiple delalloc blocks at a
time. For non-bigalloc case, just reserve len blocks and insert delalloc
extent. For bigalloc case, we can ensure that the clusters in the middle
of a extent must be unallocated, we only need to check whether the start
and end clusters are delayed/allocated. We should subtract the space for
the start and/or end block(s) if they are allocated.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-10-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:50 -04:00
Zhang Yi
49bf6ab4d3 ext4: factor out a helper to check the cluster allocation state
Factor out a common helper ext4_clu_alloc_state(), check whether the
cluster containing a delalloc block to be added has been allocated or
has delalloc reservation, no logic changes.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-9-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:50 -04:00
Zhang Yi
0d66b23d79 ext4: make ext4_da_reserve_space() reserve multi-clusters
Add 'nr_resv' parameter to ext4_da_reserve_space(), which indicates the
number of clusters wants to reserve, make it reserve multiple clusters
at a time.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-8-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:50 -04:00
Zhang Yi
12eba993b9 ext4: make ext4_es_insert_delayed_block() insert multi-blocks
Rename ext4_es_insert_delayed_block() to ext4_es_insert_delayed_extent()
and pass length parameter to make it insert multiple delalloc blocks at
a time. For the case of bigalloc, split the allocated parameter to
lclu_allocated and end_allocated. lclu_allocated indicates the
allocation state of the cluster which is containing the lblk,
end_allocated indicates the allocation state of the extent end, clusters
in the middle of delay allocated extent must be unallocated.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-7-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:49 -04:00
Zhang Yi
bb6b18057f ext4: drop iblock parameter
The start block of the delalloc extent to be inserted is equal to
map->m_lblk, just drop the duplicate iblock input parameter.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/20240517124005.347221-6-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:49 -04:00
Zhang Yi
14a210c110 ext4: trim delalloc extent
In ext4_da_map_blocks(), we could find four kind of extents in the
extent status tree: hole, unwritten, written and delayed extent. Now we
only trim the map len if we found an unwritten extent or a written
extent. This is okay now since map->m_len is always set to one and we
always insert one delayed block at a time. But this will become isn't
okay for other two cases if ext4_insert_delayed_block() and
ext4_da_map_blocks() support inserting multiple map->len blocks later.

1. If we found a hole in the extent status tree which es->es_len is
   shorter than the length we want to write, we should trim the
   map->m_len to prevent adding extra delay more blocks than we
   expected. For example, assume we write data [A, C) to a file that
   contains a hole extent [A, B) and a written extent [B, D) in the
   cache.

                         A     B  C  D
   before da write:   ...hhhhhh|wwwwww....

   Then we will get extent [A, B), we should trim map->m_len to B-A
   before inserting new delalloc blocks, if not, the range [B, C) will
   be duplicated.

2. If we found a delayed extent in the extent status tree which
   es->es_len is shorter than the length we want to write, we should
   trim the map->m_len to es->es_len and return directly since the front
   part of this map has been delayed, we can't insert the delalloc
   extent that contains the latter part in this round, we should return
   the delayed length and the caller should increase the position and
   call ext4_da_map_blocks() again. For example, assume we write data
   [A, C) to a file that contains a delayed extent [A, B) in the cache.

                         A     B  C
   before da write:   ...dddddd|hhh....

   Then we will get delayed extent [A, B), we should also trim map->m_len
   to B-A and return, if not, we will incorrectly assume that the write
   is complete and won't insert [B, C).

So we need to always trim the map->m_len if the found es->es_len in the
extent status tree is shorter than the map->m_len, prearing for
inserting a extent with multiple delalloc blocks. This patch only does a
pre-fix, the handle is crude and ext4_da_map_blocks() deserve a cleanup,
we will do that later.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-5-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:49 -04:00
Zhang Yi
b37c907073 ext4: warn if delalloc counters are not zero on inactive
The per-inode i_reserved_data_blocks count the reserved delalloc blocks
in a regular file, it should be zero when destroying the file. The
per-fs s_dirtyclusters_counter count all reserved delalloc blocks in a
filesystem, it also should be zero when umounting the filesystem. Now we
have only an error message if the i_reserved_data_blocks is not zero,
which is unable to be simply captured, so add WARN_ON_ONCE to make it
more visable.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-4-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:49 -04:00
Zhang Yi
0ea6560abb ext4: check the extent status again before inserting delalloc block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).

If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.

Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
   folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
   lock.

But this can race between ext4_page_mkwrite() & ext4 fallocate path

ext4_page_mkwrite()             ext4_fallocate()
 block_page_mkwrite()
  ext4_da_map_blocks()
   //find hole in extent status tree
                                 ext4_alloc_file_blocks()
                                  ext4_map_blocks()
                                   //allocate block and unwritten extent
   ext4_insert_delayed_block()
    ext4_da_reserve_space()
     //reserve one more block
    ext4_es_insert_delayed_block()
     //drop unwritten extent and add delayed extent by mistake

Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:

 EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!

Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.

Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:49 -04:00
Zhang Yi
8e4e5cdf2f ext4: factor out a common helper to query extent map
Factor out a new common helper ext4_map_query_blocks() from the
ext4_da_map_blocks(), it query and return the extent map status on the
inode's extent path, no logic changes.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/20240517124005.347221-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 18:04:49 -04:00
Luis Henriques (SUSE)
907c3fe532 ext4: fix infinite loop when replaying fast_commit
When doing fast_commit replay an infinite loop may occur due to an
uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
not detect the replay and calls ext4_es_find_extent_range(), which will
return immediately without initializing the 'es' variable.

Because 'es' contains garbage, an integer overflow may happen causing an
infinite loop in this function, easily reproducible using fstest generic/039.

This commit fixes this issue by unconditionally initializing the structure
in function ext4_es_find_extent_range().

Thanks to Zhang Yi, for figuring out the real problem!

Fixes: 8016e29f43 ("ext4: fast commit recovery path")
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240515082857.32730-1-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:26:28 -04:00
Kemeng Shi
b07855348b jbd2: remove unnecessary "should_sleep" in kjournald2
We only need to sleep if no running transaction is expired. Simply remove
unnecessary "should_sleep".

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-10-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
136d3e0703 jbd2: remove dead check of JBD2_UNMOUNT in kjournald2
We always set JBD2_UNMOUNT with j_state_lock held in journal_kill_thread.
In kjournald2, we check JBD2_UNMOUNT flag two times under the same
j_state_lock. Then the second check is unnecessary.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-9-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
d5c545735a jbd2: remove dead equality check of j_commit_[sequence/request] in kjournald2
The j_commit_[sequence/request] are updated with j_state_lock held during
runtime. In kjournald2, two equality checks of j_commit_[sequence/request]
are under the same j_state_lock, then the second check is unnecessary.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-8-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
4eb9bd13eb jbd2: use bh_in instead of jh2bh(jh_in) to simplify code
We save jh2bh(jh_in) to bh_in, so use bh_in directly instead of
jh2bh(jh_in) to simplify the code.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-7-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
daabedd664 jbd2: remove unneeded kmap to do escape in jbd2_journal_write_metadata_buffer
The data to do escape could be accessed directly from b_frozen_data,
just remove unneeded kmap.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-6-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
4c15129aaa jbd2: jump to new copy_done tag when b_frozen_data is created concurrently
If b_frozen_data is created concurrently, we can update new_folio and
new_offset with b_frozen_data and then move forward

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-5-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
5dd3e8c075 jbd2: remove unnedded "need_copy_out" in jbd2_journal_write_metadata_buffer
As we only need to copy out when we should do escape, need_copy_out
could be simply replaced by "do_escape".

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-4-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
abe48a5225 jbd2: remove unused return info from jbd2_journal_write_metadata_buffer
The done_copy_out info from jbd2_journal_write_metadata_buffer is not
used. Simply remove it.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-3-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Kemeng Shi
cc102aa246 jbd2: avoid memleak in jbd2_journal_write_metadata_buffer
The new_bh is from alloc_buffer_head, we should call free_buffer_head to
free it in error case.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240514112438.1269037-2-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:20:26 -04:00
Xiaxi Shen
8dc9c3da79 ext4: fix uninitialized variable in ext4_inlinedir_to_tree
Syzbot has found an uninit-value bug in ext4_inlinedir_to_tree

This error happens because ext4_inlinedir_to_tree does not
handle the case when ext4fs_dirhash returns an error

This can be avoided by checking the return value of ext4fs_dirhash
and propagating the error,
similar to how it's done with ext4_htree_store_dirent

Signed-off-by: Xiaxi Shen <shenxiaxi26@gmail.com>
Reported-and-tested-by: syzbot+eaba5abe296837a640c0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=eaba5abe296837a640c0
Link: https://patch.msgid.link/20240501033017.220000-1-shenxiaxi26@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 10:08:36 -04:00
Thorsten Blum
be210737fe jbd2: use str_plural() to fix Coccinelle warning
Fixes the following Coccinelle/coccicheck warning reported by
string_choices.cocci:

	opportunity for str_plural(dropped)

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Link: https://patch.msgid.link/20240402105157.254389-2-thorsten.blum@toblux.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 09:34:00 -04:00
Li zeming
57802c73bf ext4: block_validity: Remove unnecessary ‘NULL’ values from new_node
new_node is assigned first, so it does not need to initialize the
assignment.

Signed-off-by: Li zeming <zeming@nfschina.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://patch.msgid.link/20240402022300.25858-1-zeming@nfschina.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27 09:34:00 -04:00
Linus Torvalds
f2661062f1 Linux 6.10-rc5 v6.10-rc5 2024-06-23 17:08:54 -04:00
Linus Torvalds
7c16f0a4ed Merge tag 'i2c-for-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "The core gains placeholders for recently added functions when
  CONFIG_I2C is not defined as well documentation fixes to start using
  inclusive terminology.

  The drivers get paths in DT bindings fixed as well as proper interrupt
  handling for the ocores driver"

* tag 'i2c-for-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  docs: i2c: summary: be clearer with 'controller/target' and 'adapter/client' pairs
  docs: i2c: summary: document 'local' and 'remote' targets
  docs: i2c: summary: document use of inclusive language
  docs: i2c: summary: update speed mode description
  docs: i2c: summary: update I2C specification link
  docs: i2c: summary: start sentences consistently.
  i2c: Add nop fwnode operations
  i2c: ocores: set IACK bit after core is enabled
  dt-bindings: i2c: google,cros-ec-i2c-tunnel: correct path to i2c-controller schema
  dt-bindings: i2c: atmel,at91sam: correct path to i2c-controller schema
2024-06-23 11:06:01 -04:00
Linus Torvalds
d14f2780f0 Merge tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
 "Five smb3 client fixes

   - three nets/fiolios cifs fixes

   - fix typo in module parameters description

   - fix incorrect swap warning"

* tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Move the 'pid' from the subreq to the req
  cifs: Only pick a channel once per read request
  cifs: Defer read completion
  cifs: fix typo in module parameter enable_gcm_256
  cifs: drop the incorrect assertion in cifs_swap_rw()
2024-06-23 11:01:57 -04:00
Linus Torvalds
0971e82ea3 Merge tag 'fixes-2024-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
 "Fix fragility in checks for unset node ID.

  Use numa_valid_node() function to verify that nid is a valid node
  ID instead of inconsistent comparisons with either NUMA_NO_NODE or
  MAX_NUMNODES"

* tag 'fixes-2024-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: use numa_valid_node() helper to check for invalid node ID
2024-06-23 10:32:24 -04:00
Linus Torvalds
b67eeff799 Merge tag 'mips-fixes_6.10_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:

 - fix lseek in o32 compat mode

 - fix for microMIPS MT ASE helpers

* tag 'mips-fixes_6.10_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: fix compat_sys_lseek syscall
  MIPS: mipsmtregs: Fix target register for MFTC0
2024-06-23 07:20:02 -07:00
Linus Torvalds
b9e6612d61 Merge tag 'x86_urgent_for_v6.10_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - An ARM-relevant fix to not free default RMIDs of a resource control
   group

 - A randconfig build fix for the VMware virtual GPU driver

* tag 'x86_urgent_for_v6.10_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Don't try to free nonexistent RMIDs
  drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependency
2024-06-23 07:16:13 -07:00
Linus Torvalds
d1505b5cd0 Merge tag 'powerpc-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Prevent use-after-free in 64-bit KVM VFIO

 - Add generated Power8 crypto asm to .gitignore

Thanks to Al Viro and Nathan Lynch.

* tag 'powerpc-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  KVM: PPC: Book3S HV: Prevent UAF in kvm_spapr_tce_attach_iommu_group()
  powerpc/crypto: Add generated P8 asm to .gitignore
2024-06-23 07:13:23 -07:00
Wolfram Sang
2c50f892ca Merge tag 'i2c-host-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
This pull request fixes the paths of the dt-schema to their
complete locations for the ChromeOS EC tunnel driver and the
Atmel at91sam drivers.

Additionally, the OpenCores driver receives a fix for an issue
that dates back to version 2.6.18. Specifically, the interrupts
need to be acknowledged (clearing all pending interrupts) after
enabling the core.
2024-06-23 02:13:27 +02:00
Linus Torvalds
5f583a3162 Merge tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linux
Pull rust fix from Miguel Ojeda:

 - Avoid unused import warning in 'rusttest'.

* tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linux:
  rust: avoid unused import warning in `rusttest`
2024-06-22 15:29:49 -07:00
Linus Torvalds
2765de94dd Merge tag 'regulator-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "A few driver specific fixes for incorrect device descriptions, plus a
  fix for a missing symbol export which causes build failures for some
  newly added drivers in other trees"

* tag 'regulator-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: axp20x: AXP717: fix LDO supply rails and off-by-ones
  regulator: bd71815: fix ramp values
  regulator: core: Fix modpost error "regulator_get_regmap" undefined
  regulator: tps6594-regulator: Fix the number of irqs for TPS65224 and TPS6594
2024-06-22 14:02:16 -07:00
Linus Torvalds
e24638afe2 Merge tag 'spi-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A number of fixes that have built up for SPI, a bunch of driver
  specific ones including an unfortunate revert of an optimisation for
  the i.MX driver which was causing issues with some configurations,
  plus a couple of core fixes for the rarely used octal mode and for a
  bad interaction between multi-CS support and target mode"

* tag 'spi-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-imx: imx51: revert burst length calculation back to bits_per_word
  spi: Fix SPI slave probe failure
  spi: Fix OCTAL mode support
  spi: stm32: qspi: Clamp stm32_qspi_get_mode() output to CCR_BUSWIDTH_4
  spi: stm32: qspi: Fix dual flash mode sanity test in stm32_qspi_setup()
  spi: cs42l43: Drop cs35l56 SPI speed down to 11MHz
  spi: cs42l43: Correct SPI root clock speed
2024-06-22 13:58:47 -07:00
Linus Torvalds
c2fc946223 Merge tag 'nfsd-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:

 - Fix crashes triggered by administrative operations on the server

* tag 'nfsd-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: grab nfsd_mutex in nfsd_nl_rpc_status_get_dumpit()
  nfsd: fix oops when reading pool_stats before server is started
2024-06-22 13:55:56 -07:00
Linus Torvalds
563a50672d Merge tag 'xfs-6.10-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fix from Chandan Babu:

 - Fix assertion failure due to a race between unlink and cluster buffer
   instantiation.

* tag 'xfs-6.10-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix unlink vs cluster buffer instantiation race
2024-06-22 09:07:56 -07:00
Linus Torvalds
c3de9b572f Merge tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
 "Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs.

  The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits
  for the LRU position (we would prefer 64), so wraparound is possible
  for the cached data LRUs on a filesystem that has done sufficient
  (petabytes) reads; this is now handled.

  One notable user reported bugfix, where we were forgetting to
  correctly set the bucket data type, which should have been
  BCH_DATA_need_gc_gens instead of BCH_DATA_free; this was causing us to
  go emergency read-only on a filesystem that had seen heavy enough use
  to see bucket gen wraparoud.

  We're now starting to fix simple (safe) errors without requiring user
  intervention - i.e. a small incremental step towards full self
  healing.

  This is currently limited to just certain allocation information
  counters, and the error is still logged in the superblock; see that
  patch for more information. ("bcachefs: Fix safe errors by default")"

* tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs: (22 commits)
  bcachefs: Move the ei_flags setting to after initialization
  bcachefs: Fix a UAF after write_super()
  bcachefs: Use bch2_print_string_as_lines for long err
  bcachefs: Fix I_NEW warning in race path in bch2_inode_insert()
  bcachefs: Replace bare EEXIST with private error codes
  bcachefs: Fix missing alloc_data_type_set()
  closures: Change BUG_ON() to WARN_ON()
  bcachefs: fix alignment of VMA for memory mapped files on THP
  bcachefs: Fix safe errors by default
  bcachefs: Fix bch2_trans_put()
  bcachefs: set_worker_desc() for delete_dead_snapshots
  bcachefs: Fix bch2_sb_downgrade_update()
  bcachefs: Handle cached data LRU wraparound
  bcachefs: Guard against overflowing LRU_TIME_BITS
  bcachefs: delete_dead_snapshots() doesn't need to go RW
  bcachefs: Fix early init error path in journal code
  bcachefs: Check for invalid btree IDs
  bcachefs: Fix btree ID bitmasks
  bcachefs: Fix shift overflow in read_one_super()
  bcachefs: Fix a locking bug in the do_discard_fast() path
  ...
2024-06-22 09:02:39 -07:00
Linus Torvalds
da3b6ef176 Merge tag 'ata-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:

 - We currently enable DIPM (device initiated power management) in the
   device (using a SET FEATURES call to the device), regardless if the
   HBA supports any LPM states or not. It seems counter intuitive, and
   potentially dangerous to enable a device side feature, when the HBA
   does not have the corresponding support. Thus, make sure that we do
   not enable DIPM if the HBA does not support any LPM states.

* tag 'ata-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: ahci: Do not enable LPM if no LPM states are supported by the HBA
2024-06-22 08:16:17 -07:00
Linus Torvalds
1f5c537182 Merge tag 'pwm/for-6.10-rc5-fixes-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm fixes from Uwe Kleine-König:
 "Three fixes for the pwm-stm32 driver.

  The first patch prevents an integer wrap-around for small periods. In
  the second patch the calculation of the prescaler is fixed which
  resulted in values for the ARR register that don't fit into the
  corresponding register bit field. The last commit improves an error
  message that was wrongly copied from another error path"

* tag 'pwm/for-6.10-rc5-fixes-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
  pwm: stm32: Fix error message to not describe the previous error path
  pwm: stm32: Fix calculation of prescaler
  pwm: stm32: Refuse too small period requests
2024-06-22 08:03:47 -07:00