Commit Graph

1308802 Commits

Author SHA1 Message Date
Kent Overstreet
672f75238e bcachefs: Fix accounting replay flags
BCH_TRANS_COMMIT_journal_reclaim without BCH_WATERMARK_reclaim means
"return an error if low on journal space" - but accounting replay must
succeed.

Fixes https://github.com/koverstreet/bcachefs/issues/656

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-12 03:02:16 -04:00
Kent Overstreet
c1bd21bb65 bcachefs: Fix invalid shift in member_to_text()
Reported-by: syzbot+064ce437a1ad63d3f6ef@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-12 03:02:16 -04:00
Kent Overstreet
7d84d9f449 bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID
This fixes a kasan splat in the ec device removal tests.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-11 22:20:51 -04:00
Kent Overstreet
3b80552e70 bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry
inode_bit_waitqueue() is changing - this update clears the way for
sched changes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:58:18 -04:00
Kent Overstreet
a7e2dd58fb bcachefs: Check if stuck in journal_res_get()
Like how we already do when the allocator seems to be stuck, check if
we're waiting too long for a journal reservation and print some debug
info.

This is specifically to track down
https://github.com/koverstreet/bcachefs/issues/656

which is showing up in userspace where we don't have sysfs/debugfs to
get the journal debug info.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:57:59 -04:00
Kent Overstreet
04b670de28 closures: Add closure_wait_event_timeout()
Add a closure version of wait_event_timeout(), with the same semantics.

The closure version is useful because unlike wait_event(), it allows
blocking code to run in the conditional expression.

Cc: Coly Li <colyli@suse.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:57:57 -04:00
Alan Huang
9205d24cf7 bcachefs: Fix state lock involved deadlock
We increased write ref, if the fs went to RO, that would lead to
a deadlock, it actually happens:

00171 ========= TEST   generic/279
00171
00172 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=nocow
00172 bcachefs (vdb): recovering from clean shutdown, journal seq 35
00172 bcachefs (vdb): accounting_read... done
00172 bcachefs (vdb): alloc_read... done
00172 bcachefs (vdb): stripes_read... done
00172 bcachefs (vdb): snapshots_read... done
00172 bcachefs (vdb): journal_replay... done
00172 bcachefs (vdb): resume_logged_ops... done
00172 bcachefs (vdb): going read-write
00172 bcachefs (vdb): done starting filesystem
00172 FSTYP         -- bcachefs
00172 PLATFORM      -- Linux/aarch64 farm3-kvm 6.11.0-rc1-ktest-g3e290a0b8e34 #7030 SMP Tue Oct  8 14:15:12 UTC 2024
00172 MKFS_OPTIONS  -- --nocow /dev/vdc
00172 MOUNT_OPTIONS -- /dev/vdc /mnt/scratch
00172
00172 bcachefs (vdc): starting version 1.12: rebalance_work_acct_fix opts=nocow
00172 bcachefs (vdc): initializing new filesystem
00172 bcachefs (vdc): going read-write
00172 bcachefs (vdc): marking superblocks
00172 bcachefs (vdc): initializing freespace
00172 bcachefs (vdc): done initializing freespace
00172 bcachefs (vdc): reading snapshots table
00172 bcachefs (vdc): reading snapshots done
00172 bcachefs (vdc): done starting filesystem
00173 bcachefs (vdc): shutting down
00173 bcachefs (vdc): going read-only
00173 bcachefs (vdc): finished waiting for writes to stop
00173 bcachefs (vdc): flushing journal and stopping allocators, journal seq 4
00173 bcachefs (vdc): flushing journal and stopping allocators complete, journal seq 6
00173 bcachefs (vdc): shutdown complete, journal seq 7
00173 bcachefs (vdc): marking filesystem clean
00173 bcachefs (vdc): shutdown complete
00173 bcachefs (vdb): shutting down
00173 bcachefs (vdb): going read-only
00361 INFO: task umount:6180 blocked for more than 122 seconds.
00361 Not tainted 6.11.0-rc1-ktest-g3e290a0b8e34 #7030
00361 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
00361 task:umount          state:D stack:0     pid:6180  tgid:6180  ppid:6176   flags:0x00000004
00361 Call trace:
00362 __switch_to (arch/arm64/kernel/process.c:556)
00362 __schedule (kernel/sched/core.c:5191 kernel/sched/core.c:6529)
00363 schedule (include/asm-generic/bitops/generic-non-atomic.h:128 include/linux/thread_info.h:192 include/linux/sched.h:2084 kernel/sched/core.c:6608 kernel/sched/core.c:6621)
00365 bch2_fs_read_only (fs/bcachefs/super.c:346 (discriminator 41))
00367 __bch2_fs_stop (fs/bcachefs/super.c:620)
00368 bch2_put_super (fs/bcachefs/fs.c:1942)
00369 generic_shutdown_super (include/linux/list.h:373 (discriminator 2) fs/super.c:650 (discriminator 2))
00371 bch2_kill_sb (fs/bcachefs/fs.c:2170)
00372 deactivate_locked_super (fs/super.c:434 fs/super.c:475)
00373 deactivate_super (fs/super.c:508)
00374 cleanup_mnt (fs/namespace.c:250 fs/namespace.c:1374)
00376 __cleanup_mnt (fs/namespace.c:1381)
00376 task_work_run (include/linux/sched.h:2024 kernel/task_work.c:224)
00377 do_notify_resume (include/linux/resume_user_mode.h:50 arch/arm64/kernel/entry-common.c:151)
00377 el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:171 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
00377 el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731)
00378 el0t_64_sync (arch/arm64/kernel/entry.S:598)
00378 INFO: task tee:6182 blocked for more than 122 seconds.
00378 Not tainted 6.11.0-rc1-ktest-g3e290a0b8e34 #7030
00378 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
00378 task:tee             state:D stack:0     pid:6182  tgid:6182  ppid:533    flags:0x00000004
00378 Call trace:
00378 __switch_to (arch/arm64/kernel/process.c:556)
00378 __schedule (kernel/sched/core.c:5191 kernel/sched/core.c:6529)
00378 schedule (include/asm-generic/bitops/generic-non-atomic.h:128 include/linux/thread_info.h:192 include/linux/sched.h:2084 kernel/sched/core.c:6608 kernel/sched/core.c:6621)
00378 schedule_preempt_disabled (kernel/sched/core.c:6680)
00379 rwsem_down_read_slowpath (kernel/locking/rwsem.c:1073 (discriminator 1))
00379 down_read (kernel/locking/rwsem.c:1529)
00381 bch2_gc_gens (fs/bcachefs/sb-members.h:77 fs/bcachefs/sb-members.h:88 fs/bcachefs/sb-members.h:128 fs/bcachefs/btree_gc.c:1240)
00383 bch2_fs_store_inner (fs/bcachefs/sysfs.c:473)
00385 bch2_fs_internal_store (fs/bcachefs/sysfs.c:417 fs/bcachefs/sysfs.c:580 fs/bcachefs/sysfs.c:576)
00386 sysfs_kf_write (fs/sysfs/file.c:137)
00387 kernfs_fop_write_iter (fs/kernfs/file.c:334)
00389 vfs_write (fs/read_write.c:497 fs/read_write.c:590)
00390 ksys_write (fs/read_write.c:643)
00391 __arm64_sys_write (fs/read_write.c:652)
00391 invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
00392 do_el0_svc (include/linux/thread_info.h:127 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
00392 el0_svc (arch/arm64/include/asm/irqflags.h:55 arch/arm64/include/asm/irqflags.h:76 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
00392 el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731)
00392 el0t_64_sync (arch/arm64/kernel/entry.S:598)

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:54 -04:00
Mohammed Anees
a30f32222d bcachefs: Fix NULL pointer dereference in bch2_opt_to_text
This patch adds a bounds check to the bch2_opt_to_text function to prevent
NULL pointer dereferences when accessing the opt->choices array. This
ensures that the index used is within valid bounds before dereferencing.
The new version enhances the readability.

Reported-and-tested-by: syzbot+37186860aa7812b331d5@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=37186860aa7812b331d5
Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:53 -04:00
Alan Huang
a154154148 bcachefs: Release transaction before wake up
We will get this if we wake up first:

Kernel panic - not syncing: btree_node_write_done leaked btree_trans

since there are still transactions waiting for cycle detectors after
BTREE_NODE_write_in_flight is cleared.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:53 -04:00
Piotr Zalewski
0151d10a48 bcachefs: add check for btree id against max in try read node
Add check for read node's btree_id against BTREE_ID_NR_MAX in
try_read_btree_node to prevent triggering EBUG_ON condition in
bch2_btree_id_root[1].

[1] https://syzkaller.appspot.com/bug?extid=cf7b2215b5d70600ec00

Reported-by: syzbot+cf7b2215b5d70600ec00@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cf7b2215b5d70600ec00
Fixes: 4409b8081d ("bcachefs: Repair pass for scanning for btree nodes")
Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:53 -04:00
Kent Overstreet
19773ec997 bcachefs: Disk accounting device validation fixes
- Fix failure to validate that accounting replicas entries point to
  valid devices: this wasn't a real bug since they'd be cleaned up by
  GC, but is still something we should know about

- Fix failure to validate that dev_data_type entries point to valid
  devices: this does fix a real bug, since bch2_accounting_read() would
  then try to copy the counters to that device and pop an inconsistent
  error when the device didn't exist

- Remove accounting entries that are zeroed or invalid: if we're not
  validating them we need to get rid of them: they might not exist in
  the superblock, so we need the to trigger the superblock mark path
  when they're readded.

  This fixes the replication.ktest rereplicate test, which was failing
  with "superblock not marked for replicas..."

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:53 -04:00
Kent Overstreet
9d86178782 bcachefs: bch2_inode_or_descendents_is_open()
fsck can now correctly check if inodes in interior snapshot nodes are
open/in use.

- Tweak the vfs inode rhashtable so that the subvolume ID isn't hashed,
  meaning inums in different subvolumes will hash to the same slot. Note
  that this is a hack, and will cause problems if anyone ever has the
  same file in many different snapshots open all at the same time.

- Then check if any of those subvolumes is a descendent of the snapshot
  ID being checked

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:53 -04:00
Kent Overstreet
84878e8245 bcachefs: Kill bch2_propagate_key_to_snapshot_leaves()
Dead code now.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:53 -04:00
Kent Overstreet
9b23fdbd5d bcachefs: bcachefs_metadata_version_inode_has_child_snapshots
There's an inherent race in taking a snapshot while an unlinked file is
open, and then reattaching it in the child snapshot.

In the interior snapshot node the file will appear unlinked, as though
it should be deleted - it's not referenced by anything in that snapshot
- but we can't delete it, because the file data is referenced by the
child snapshot.

This was being handled incorrectly with
propagate_key_to_snapshot_leaves() - but that doesn't resolve the
fundamental inconsistency of "this file looks like it should be deleted
according to normal rules, but - ".

To fix this, we need to fix the rule for when an inode is deleted. The
previous rule, ignoring snapshots (there was no well-defined rule
for with snapshots) was:
  Unlinked, non open files are deleted, either at recovery time or
  during online fsck

The new rule is:
  Unlinked, non open files, that do not exist in child snapshots, are
  deleted.

To make this work transactionally, we add a new inode flag,
BCH_INODE_has_child_snapshot; it overrides BCH_INODE_unlinked when
considering whether to delete an inode, or put it on the deleted list.

For transactional consistency, clearing it handled by the inode trigger:
when deleting an inode we check if there are parent inodes which can now
have the BCH_INODE_has_child_snapshot flag cleared.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:51 -04:00
Kent Overstreet
cba31b7eee bcachefs: Delete vestigal check_inode() checks
BCH_INODE_i_size_dirty dates from before we had logged operations for
truncate (as well as finsert) - it hasn't been needed since before
bcachefs was mainlined.

BCH_INODE_i_sectors_dirty hasn't been needed since we started always
updating i_sectors transactionally - it's been unused for even longer.

BCH_INODE_backptr_untrusted also hasn't been used since prior to
mainlining; when unlinking a hardling, we zero out the backpointer
fields if they're for the dirent being removed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-06 03:03:45 -04:00
Kent Overstreet
12f286085b bcachefs: btree_iter_peek_upto() now handles BTREE_ITER_all_snapshots
end_pos now compares against snapshot ID when required

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-06 03:03:45 -04:00
Kent Overstreet
38864eccf7 bcachefs: reattach_inode() now correctly handles interior snapshot nodes
When we find an unreachable inode, we now reattach it in the oldest
version that needs to be reattached (thus avoiding redundant work
reattaching every single version), and we now fix up inode -> dirent
backpointers in newer versions as needed - or white out the reattaching
dirent in newer versions, if the newer version isn't supposed to be
reattached.

This results in the second verify fsck now passing cleanly after
repairing on a user-provided filesystem image with thousands of
different snapshots.

Reported-by: Christopher Snowhill <chris@kode54.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-06 03:03:45 -04:00
Kent Overstreet
bade9711e0 bcachefs: Split out check_unreachable_inodes() pass
With inode backpointers, we can write a very simple
check_unreachable_inodes() pass that only looks for non-unlinked inodes
that are missing backpointers, and reattaches them.

This simplifies check_directory_structure() so that it's now only
checking for directory structure loops,

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-06 03:03:45 -04:00
Kent Overstreet
bf4baaa087 bcachefs: Fix lockdep splat in bch2_accounting_read
We can't take sb_lock while holding mark_lock, so split out
replicas_entry_validate() and replicas_entry_sb_validate() -
replicas_entry_validate() now uses the normal online device interface.

00039 ========= TEST   set_option
00039
00039 WATCHDOG 30
00040 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=errors=panic
00040 bcachefs (vdb): initializing new filesystem
00040 bcachefs (vdb): going read-write
00040 bcachefs (vdb): marking superblocks
00040 bcachefs (vdb): initializing freespace
00040 bcachefs (vdb): done initializing freespace
00040 bcachefs (vdb): reading snapshots table
00040 bcachefs (vdb): reading snapshots done
00040 bcachefs (vdb): done starting filesystem
00040 zstd
00041 bcachefs (vdb): shutting down
00041 bcachefs (vdb): going read-only
00041 bcachefs (vdb): finished waiting for writes to stop
00041 bcachefs (vdb): flushing journal and stopping allocators, journal seq 3
00041 bcachefs (vdb): flushing journal and stopping allocators complete, journal seq 11
00041 bcachefs (vdb): shutdown complete, journal seq 12
00041 bcachefs (vdb): marking filesystem clean
00041 bcachefs (vdb): shutdown complete
00041 Setting option on offline fs
00041 bch2_write_super(): fatal error : attempting to write superblock that wasn't version downgraded (1.12: (unknown version) > 1.10: disk_accounting_v3)
00041 fatal error - emergency read only
00041 bch2_write_super(): fatal error : attempting to write superblock that wasn't version downgraded (1.12: (unknown version) > 1.10: disk_accounting_v3)
00042 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=errors=panic,compression=zstd
00042 bcachefs (vdb): recovering from clean shutdown, journal seq 12
00042 bcachefs (vdb): accounting_read...
00042
00042 ======================================================
00042 WARNING: possible circular locking dependency detected
00042 6.12.0-rc1-ktest-g805e938a8502 #6807 Not tainted
00042 ------------------------------------------------------
00042 mount.bcachefs/665 is trying to acquire lock:
00045 ffffff80cc280908 (&c->sb_lock){+.+.}-{3:3}, at: bch2_replicas_entry_validate (fs/bcachefs/replicas.c:102)
00045
00045 but task is already holding lock:
00048 ffffff80cc284870 (&c->mark_lock){++++}-{0:0}, at: bch2_accounting_read (fs/bcachefs/disk_accounting.c:670 (discriminator 1))
00048
00048 which lock already depends on the new lock.
00048
00048
00048 the existing dependency chain (in reverse order) is:
00048
00048 -> #1 (&c->mark_lock){++++}-{0:0}:
00049 percpu_down_write (kernel/locking/percpu-rwsem.c:232)
00052 bch2_sb_replicas_to_cpu_replicas (fs/bcachefs/replicas.c:583)
00055 bch2_sb_to_fs (fs/bcachefs/super-io.c:614)
00057 bch2_fs_open (fs/bcachefs/super.c:828 fs/bcachefs/super.c:2050)
00060 bch2_fs_get_tree (fs/bcachefs/fs.c:2067)
00062 vfs_get_tree (fs/super.c:1801)
00064 path_mount (fs/namespace.c:3507 fs/namespace.c:3834)
00066 __arm64_sys_mount (fs/namespace.c:3847 fs/namespace.c:4055 fs/namespace.c:4032 fs/namespace.c:4032)
00067 invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
00068 do_el0_svc (include/linux/thread_info.h:127 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
00069 el0_svc (arch/arm64/include/asm/irqflags.h:82 arch/arm64/include/asm/irqflags.h:123 arch/arm64/include/asm/irqflags.h:136 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
00069 ========= FAILED TIMEOUT set_option in 30s

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-06 03:03:45 -04:00
Kent Overstreet
0f25eb4b60 bcachefs: Rework logged op error handling
Initially it was thought that we just wanted to ignore errors from
logged op replay, but it turns out we do need to catch -EROFS, or we'll
go into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
1f73cb4d34 bcachefs: Add warn param to subvol_get_snapshot, peek_inode
These shouldn't always be fatal errors - logged op resume, in
particular, and we want it as a parameter there.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
72350ee0ea bcachefs: Kill snapshot arg to fsck_write_inode()
It was initially believed that it would be better to be explicit about
the snapshot we're updating when writing inodes in fsck; however, it
turns out that passing around the snapshot separately is more error
prone and we're usually updating the inode in the same snapshow we read
it from.

This is different from normal filesystem paths, where we do the update
in the snapshot of the subvolume we're in.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
c9306a91c3 bcachefs: Check for unlinked, non-empty dirs in check_inode()
We want to check for this early so it can be reattached if necessary in
check_unreachable_inodes(); better than letting it be deleted and having
the children reattached, losing their filenames.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
c7da5ee2e5 bcachefs: Check for unlinked inodes with dirents
link count works differently in bcachefs - it's only nonzero for files
with multiple hardlinks, which means we can also avoid checking it
except for files that are known to have hardlinks.

That means we need a few different checks instead; in particular, we
don't want fsck to delet a file that has a dirent pointing to it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
1c6051bbd7 bcachefs: Check for directories with no backpointers
It's legal for regular files to have missing backpointers (due to
hardlinks), and fsck should automatically add them, but for directories
this is an error that should be flagged.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
260af1562e bcachefs: Kill alloc_v4.fragmentation_lru
The fragmentation_lru field hasn't been needed since we reworked the LRU
btrees to use the btree write buffer; previously it was used to resolve
collisions, but the revised LRU btree uses the backpointer (the bucket)
as part of the key.

It should have been deleted at the time of the LRU rework; since it
wasn't, that left places for bugs to hide, in check/repair.

This fixes LRU fsck on a filesystem image helpfully provided by a user
who disappeared before I could get his name for the reported-by.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
01bf5e3bd2 bcachefs: minor lru fsck fixes
check_lru_key() wasn't using write buffer updates for deleting bad lru
entries - dating from before the lru btree used the btree write buffer.

And when possibly flushing the btree write buffer (to make sure we're
seeing a real inconsistency), we need to be using the modern
bch2_btree_write_buffer_maybe_flush().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
1bea714c53 bcachefs: Mark more errors AUTOFIX
Errors are getting marked as AUTOFIX once they've been (re)-tested and
audited.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
492e24d760 bcachefs: Make sure we print error that causes fsck to bail out
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
658c82f41e bcachefs: bkey errors are only AUTOFIX during read
Newly generated keys, in the transaction commit path or write path,
should not be AUTOFIX; those indicate bugs that we need to fail fast
for.

Fixes: 5612daafb7 ("bcachefs: Fix fsck warnings from bkey validation")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
fda7b1ffde bcachefs: Create lost+found in correct snapshot
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
20826fe6b8 bcachefs: Fix reattach_inode()
Ensure a copy of the lost+found inode exists in the snapshot that we're
reattaching, so that we don't trigger warnings in
lookup_inode_for_snapshot() later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
6b63a948a7 bcachefs: Add missing wakeup to bch2_inode_hash_remove()
This fixes two different bugs:

- Looser locking with the rhashtable means we need to recheck if the
  inode is still hashed after prepare_to_wait(), and add a corresponding
  wakeup after removing from the hash table.

- da18ecbf0f ("fs: add i_state helpers") changed the bit waitqueues
  used for inodes, and bcachefs wasn't updated and thus broke; this
  updates bcachefs to the new helper.

Fixes: 112d21fd1a ("bcachefs: switch to rhashtable for vfs inodes hash")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
d28786606a bcachefs: Fix trans_commit disk accounting revert
We only are applying JSET_ENTRY_TYPE_write_buffer_keys, revert path was
missed.

Fixes: a3581ca35d ("bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-02 21:37:42 -04:00
Kent Overstreet
3b1425a4eb bcachefs: Fix bch2_inode_is_open() check
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-02 21:31:31 -04:00
Kent Overstreet
abaa6d4f6a bcachefs: Fix return type of dirent_points_to_inode_nowarn()
we're returning an error code now, not a bool

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-02 21:30:55 -04:00
Kent Overstreet
e764e68103 bcachefs: Fix bad shift in bch2_read_flag_list()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-01 17:20:24 -04:00
Guenter Roeck
2007d28ec0 bcachefs: rename version -> bversion for big endian builds
Builds on big endian systems fail as follows.

fs/bcachefs/bkey.h: In function 'bch2_bkey_format_add_key':
fs/bcachefs/bkey.h:557:41: error:
	'const struct bkey' has no member named 'bversion'

The original commit only renamed the variable for little endian builds.
Rename it for big endian builds as well to fix the problem.

Fixes: cf49f8a8c2 ("bcachefs: rename version -> bversion")
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-29 23:55:52 -04:00
Linus Torvalds
9852d85ec9 Linux 6.12-rc1 v6.12-rc1 2024-09-29 15:06:19 -07:00
Linus Torvalds
3f749befb0 x86: kvm: fix build error
The cpu_emergency_register_virt_callback() function is used
unconditionally by the x86 kvm code, but it is declared (and defined)
conditionally:

  #if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
  void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
  ...

leading to a build error when neither KVM_INTEL nor KVM_AMD support is
enabled:

  arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’:
  arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration]
  12517 |         cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’:
  arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration]
  12522 |         cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix the build by defining empty helper functions the same way the old
cpu_emergency_disable_virtualization() function was dealt with for the
same situation.

Maybe we could instead have made the call sites conditional, since the
callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak
fallback.  I'll leave that to the kvm people to argue about, this at
least gets the build going for that particular config.

Fixes: 590b09b1d8 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-29 14:47:33 -07:00
Linus Torvalds
e7ed343658 Merge tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox
Pull mailbox updates from Jassi Brar:

 - fix kconfig dependencies (mhu-v3, omap2+)

 - use devie name instead of genereic imx_mu_chan as interrupt name
   (imx)

 - enable sa8255p and qcs8300 ipc controllers (qcom)

 - Fix timeout during suspend mode (bcm2835)

 - convert to use use of_property_match_string (mailbox)

 - enable mt8188 (mediatek)

 - use devm_clk_get_enabled helpers (spreadtrum)

 - fix device-id typo (rockchip)

* tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  mailbox, remoteproc: omap2+: fix compile testing
  dt-bindings: mailbox: qcom-ipcc: Document QCS8300 IPCC
  dt-bindings: mailbox: qcom-ipcc: document the support for SA8255p
  dt-bindings: mailbox: mtk,adsp-mbox: Add compatible for MT8188
  mailbox: Use of_property_match_string() instead of open-coding
  mailbox: bcm2835: Fix timeout during suspend mode
  mailbox: sprd: Use devm_clk_get_enabled() helpers
  mailbox: rockchip: fix a typo in module autoloading
  mailbox: imx: use device name in interrupt name
  mailbox: ARM_MHU_V3 should depend on ARM64
2024-09-29 09:53:04 -07:00
Linus Torvalds
907537f570 Merge tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:

 - fix DesignWare driver ENABLE-ABORT sequence, ensuring ABORT can
   always be sent when needed

 - check for PCLK in the SynQuacer controller as an optional clock,
   allowing ACPI to directly provide the clock rate

 - KEBA driver Kconfig dependency fix

 - fix XIIC driver power suspend sequence

* tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
  i2c: keba: I2C_KEBA should depend on KEBA_CP500
  i2c: synquacer: Deal with optional PCLK correctly
  i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
2024-09-29 09:47:33 -07:00
Linus Torvalds
b81b78dacc Merge tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:

 - handle chained SGLs in the new tracing code (Christoph Hellwig)

* tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: fix DMA API tracing for chained scatterlists
2024-09-29 09:35:10 -07:00
Linus Torvalds
3ed7df0852 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
 "These are mostly minor updates.

  There are two drivers (lpfc and mpi3mr) which missed the initial
  pull and a core change to retry a start/stop unit which affect
  suspend/resume"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
  scsi: lpfc: Update lpfc version to 14.4.0.5
  scsi: lpfc: Support loopback tests with VMID enabled
  scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
  scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
  scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
  scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
  scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
  scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
  scsi: mpi3mr: Update driver version to 8.12.0.0.50
  scsi: mpi3mr: Improve wait logic while controller transitions to READY state
  scsi: mpi3mr: Update MPI Headers to revision 34
  scsi: mpi3mr: Use firmware-provided timestamp update interval
  scsi: mpi3mr: Enhance the Enable Controller retry logic
  scsi: sd: Fix off-by-one error in sd_read_block_characteristics()
  scsi: pm8001: Do not overwrite PCI queue mapping
  scsi: scsi_debug: Remove a useless memset()
  scsi: pmcraid: Convert comma to semicolon
  scsi: sd: Retry START STOP UNIT commands
  scsi: mpi3mr: A performance fix
  scsi: ufs: qcom: Update MODE_MAX cfg_bw value
  ...
2024-09-29 09:22:34 -07:00
Linus Torvalds
9f9a534724 Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs
Pull more bcachefs updates from Kent Overstreet:
 "Assorted minor syzbot fixes, and for bigger stuff:

  Fix two disk accounting rewrite bugs:

   - Disk accounting keys use the version field of bkey so that journal
     replay can tell which updates have been applied to the btree.

     This is set in the transaction commit path, after we've gotten our
     journal reservation (and our time ordering), but the
     BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay
     uses was incorrectly skipping this for new updates generated prior
     to journal replay.

     This fixes the underlying cause of an assertion pop in
     disk_accounting_read.

   - A couple of fixes for disk accounting + device removal.

     Checking if acocunting replicas entries were marked in the
     superblock was being done at the wrong point, when deltas in the
     journal could still zero them out, and then additionally we'd try
     to add a missing replicas entry to the superblock without checking
     if it referred to an invalid (removed) device.

  A whole slew of repair fixes:

   - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
     an infinite loop when repairing a filesystem with many snapshots

   - fix incorrect transaction restart handling leading to occasional
     "fsck counted ..." warnings

   - fix warning in __bch2_fsck_err() for bkey fsck errors

   - check_inode() in fsck now correctly checks if the filesystem was
     clean

   - there shouldn't be pending logged ops if the fs was clean, we now
     check for this

   - remove_backpointer() doesn't remove a dirent that doesn't actually
     point to the inode

   - many more fsck errors are AUTOFIX"

* tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits)
  bcachefs: check_subvol_path() now prints subvol root inode
  bcachefs: remove_backpointer() now checks if dirent points to inode
  bcachefs: dirent_points_to_inode() now warns on mismatch
  bcachefs: Fix lost wake up
  bcachefs: Check for logged ops when clean
  bcachefs: BCH_FS_clean_recovery
  bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
  bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
  bcachefs: Check for accounting keys with bversion=0
  bcachefs: rename version -> bversion
  bcachefs: Don't delete unlinked inodes before logged op resume
  bcachefs: Fix BCH_SB_ERRS() so we can reorder
  bcachefs: Fix fsck warnings from bkey validation
  bcachefs: Move transaction commit path validation to as late as possible
  bcachefs: Fix disk accounting attempting to mark invalid replicas entry
  bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
  bcachefs: Fix accounting read + device removal
  bcachefs: bch_accounting_mode
  bcachefs: fix transaction restart handling in check_extents(), check_dirents()
  bcachefs: kill inode_walker_entry.seen_this_pos
  ...
2024-09-29 09:17:44 -07:00
Linus Torvalds
d37421e655 Merge tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Fix TDX MMIO #VE fault handling, and add two new Intel model numbers
  for 'Pantherlake' and 'Diamond Rapids'"

* tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add two Intel CPU model numbers
  x86/tdx: Fix "in-kernel MMIO" check
2024-09-29 09:10:00 -07:00
Linus Torvalds
ec03de73b1 Merge tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "lockdep:
    - Fix potential deadlock between lockdep and RCU (Zhiguo Niu)
    - Use str_plural() to address Coccinelle warning (Thorsten Blum)
    - Add debuggability enhancement (Luis Claudio R. Goncalves)

  static keys & calls:
    - Fix static_key_slow_dec() yet again (Peter Zijlstra)
    - Handle module init failure correctly in static_call_del_module()
      (Thomas Gleixner)
    - Replace pointless WARN_ON() in static_call_module_notify() (Thomas
      Gleixner)

  <linux/cleanup.h>:
    - Add usage and style documentation (Dan Williams)

  rwsems:
    - Move is_rwsem_reader_owned() and rwsem_owner() under
      CONFIG_DEBUG_RWSEMS (Waiman Long)

  atomic ops, x86:
    - Redeclare x86_32 arch_atomic64_{add,sub}() as void (Uros Bizjak)
    - Introduce the read64_nonatomic macro to x86_32 with cx8 (Uros
      Bizjak)"

Signed-off-by: Ingo Molnar <mingo@kernel.org>

* tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Move is_rwsem_reader_owned() and rwsem_owner() under CONFIG_DEBUG_RWSEMS
  jump_label: Fix static_key_slow_dec() yet again
  static_call: Replace pointless WARN_ON() in static_call_module_notify()
  static_call: Handle module init failure correctly in static_call_del_module()
  locking/lockdep: Simplify character output in seq_line()
  lockdep: fix deadlock issue between lockdep and rcu
  lockdep: Use str_plural() to fix Coccinelle warning
  cleanup: Add usage and style documentation
  lockdep: suggest the fix for "lockdep bfs error:-1" on print_bfs_bug
  locking/atomic/x86: Redeclare x86_32 arch_atomic64_{add,sub}() as void
  locking/atomic/x86: Introduce the read64_nonatomic macro to x86_32 with cx8
2024-09-29 08:51:30 -07:00
Linus Torvalds
68e4b0e024 Merge tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux
Pull coccinelle updates from Julia Lawall:
 "Extend string_choices.cocci to use more available helpers

  Ten patches from Hongbo Li extending string_choices.cocci with the
  complete set of functions offered by include/linux/string_choices.h.

  One patch from myself reducing the number of redundant cases that are
  checked by Coccinelle, giving a small performance improvement"

* tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  Reduce Coccinelle choices in string_choices.cocci
  coccinelle: Remove unnecessary parentheses for only one possible change.
  coccinelle: Add rules to find str_yes_no() replacements
  coccinelle: Add rules to find str_on_off() replacements
  coccinelle: Add rules to find str_write_read() replacements
  coccinelle: Add rules to find str_read_write() replacements
  coccinelle: Add rules to find str_enable{d}_disable{d}() replacements
  coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements
  coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements
  coccinelle: Add rules to find str_false_true() replacements
  coccinelle: Add rules to find str_true_false() replacements
2024-09-29 08:44:28 -07:00
Linus Torvalds
e7ebdb5161 Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
 "One urgent fix to vDSO as automated testing is failing due to this
  bug"

* tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: vDSO: align stack for O2-optimized memcpy
2024-09-29 08:37:03 -07:00
Ingo Molnar
ae39e0bd15 Merge branch 'locking/core' into locking/urgent, to pick up pending commits
Merge all pending locking commits into a single branch.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-09-29 08:57:18 +02:00