In gfs2_dinode_in(), only allow directories to have the GFS2_DIF_EXHASH
flag set. This will prevent other parts of the code from treating
regular inodes as directories based on the presence of that flag.
In sweep_bh_for_rgrps() and __gfs2_free_blocks(), check if the
GFS2_DIF_EXHASH flag is set instead of checking if i_depth is non-zero.
This matches what the directory code does. (The i_depth checks were
introduced in commit 6d3117b412 ("GFS2: Wipe directory hash table
metadata when deallocating a directory").)
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
When a withdraw occurs in gfs2_log_flush() and we are left with an unsubmitted
bio, fail that bio. Otherwise, the bh's in that bio will remain locked and
gfs2_evict_inode() -> truncate_inode_pages() -> gfs2_invalidate_folio() ->
gfs2_discard() will hang trying to discard the locked bh's.
In addition, when gfs2_log_flush() fails to submit a new transaction, unpin the
buffers in the failing transaction like gfs2_remove_from_journal() does. If
any of the bd's are on the ail2 list, leave them there and do_withdraw() ->
gfs2_withdraw_glocks() -> inode_go_inval() -> truncate_inode_pages() ->
gfs2_invalidate_folio() -> gfs2_discard() will remove them. They will be freed
in gfs2_release_folio().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Function gfs2_logd() calls the log flushing functions gfs2_ail1_start(),
gfs2_ail1_wait(), and gfs2_ail1_empty() without holding sdp->sd_log_flush_lock,
but these functions require exclusion against concurrent transactions.
To fix that, add a non-locking __gfs2_log_flush() function. Then, in
gfs2_logd(), take sdp->sd_log_flush_lock before calling the above mentioned log
flushing functions and __gfs2_log_flush().
Fixes: 5e4c7632aa ("gfs2: Issue revokes more intelligently")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
When a withdrawn filesystem's inodes are being evicted, the address spaces of
those inodes still need to be truncated but we can no longer start new
transactions. We still don't want gfs2_invalidate_folio() to race with
gfs2_log_flush(), so take a read lock on sdp->sd_log_flush_lock in that case.
(It may not be obvious, but gfs2_invalidate_folio() is a jdata-only address
space operation.)
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
When a withdraw is carried out, call gfs2_ail_drain() under the
sdp->sd_log_flush_lock. This isn't strictly necessary but should be easier to
read, and more robust against possible future bugs.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
The locking in gfs2_trans_add_data() and gfs2_trans_add_meta() doesn't
follow the usual coding pattern of checking bh->b_private under lock,
allocating a new bufdata object with the locks dropped, and re-checking
once the lock has been reacquired. Both functions set bh->b_private
without holding the buffer lock. Fix that.
Also, in gfs2_trans_add_meta(), taking the folio lock during the
allocation doesn't actually do anything useful.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Rename trans_drain() to gfs2_trans_drain().
Add a new gfs2_trans_drain_list() helper and use it in
gfs2_trans_drain() to reduce code duplication.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Move gfs2_remove_from_journal() from meta_io.c to log.c and fix a minor
indentation glitch.
With that, gfs2_remove_from_ail() is now only used inside log.c, so it
can be made static.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
These two helpers only hide the locking operation; they do not make
the code more readable.
Created with:
sed -i -e 's:gfs2_log_unlock(sdp):spin_unlock(\&sdp->sd_log_lock):' \
-e 's:gfs2_log_lock(sdp):spin_lock(\&sdp->sd_log_lock):'
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
It turns out that for some workloads, the fix in commit b74cd55aa9
("gfs2: low-memory forced flush fixes") causes the number of forced log
flushes to increase to a degree that the overall filesystem performance
drops significantly. Address that by forcing a log flush only when
gfs2_writepages cannot make any progress rather than when it cannot make
"enough" progress.
Fixes: b74cd55aa9 ("gfs2: low-memory forced flush fixes")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
When gfs2_evict_inode() is called on an inode with unwritten data in the
page cache, the page cache needs to be written before it can be
truncated. This doesn't always happen. Fix that by changing
gfs2_evict_inode() to always either call evict_linked_inode() or
evict_unlinked_inode().
Inside evict_unlinked_inode(), first check if the inode is dirty. If it
is, make sure the inode glock is held and write back the data and
metadata. If it isn't, skip those steps.
Also, make sure that gfs2_evict_inode() calls gfs2_evict_inode() and
evict_unlinked_inode() only if ip->i_gl is not NULL; this avoids
unnecessary complications there.
Fixes xfstest generic/211.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Add gl helper variables in evict_unlinked_inode() and
evict_linked_inode(). This patch isn't very interesting by itself, but
it makes the next patch more readable.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
In evict_linked_inode(), the truncate_inode_pages() calls are carried
out inside a transaction. This code was added to what was then function
gfs2_delete_inode() in commit 16615be18c ("[GFS2] Clean up journaled
data writing").
These transactions are only used for creating revokes for the jdata
buffers in the journal, so don't create such transactions when we know
that the address space doesn't contain any jdata buffers for this inode
and truncate the metadata address space outside of the transaction.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
We are no longer using LM_FLAG_TRY or LM_FLAG_TRY_1CB during inode
evict, so ret cannot be GLR_TRYFAILED here.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Pull gfs2 updates from Andreas Gruenbacher:
- Prevent rename() from failing with -ESTALE when there are locking
conflicts and retry the operation instead
- Don't fail when fiemap triggers a page fault (xfstest generic/742)
- Fix another locking request cancellation bug
- Minor other fixes and cleanups
* tag 'gfs2-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: fiemap page fault fix
gfs2: fix memory leaks in gfs2_fill_super error path
gfs2: Fix use-after-free in iomap inline data write path
gfs2: Fix slab-use-after-free in qd_put
gfs2: Introduce glock_{type,number,sbd} helpers
gfs2: gfs2_glock_hold cleanup
gfs: Use fixed GL_GLOCK_MIN_HOLD time
gfs2: Fix gfs2_log_get_bio argument type
gfs2: gfs2_chain_bio start sector fix
gfs2: Initialize bio->bi_opf early
gfs2: Rename gfs2_log_submit_{bio -> write}
gfs2: Do not cancel internal demote requests
gfs2: run_queue cleanup
gfs2: Retries missing in gfs2_{rename,exchange}
gfs2: glock cancelation flag fix
Pull xfs updates from Carlos Maiolino:
"This contains several improvements to zoned device support,
performance improvements for the parent pointers, and a new health
monitoring feature. There are some improvements in the journaling code
too but no behavior change expected.
Last but not least, some code refactoring and bug fixes are also
included in this series"
* tag 'xfs-merge-7.0' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (67 commits)
xfs: add sysfs stats for zoned GC
xfs: give the defer_relog stat a xs_ prefix
xfs: add zone reset error injection
xfs: refactor zone reset handling
xfs: don't mark all discard issued by zoned GC as sync
xfs: allow setting errortags at mount time
xfs: use WRITE_ONCE/READ_ONCE for m_errortag
xfs: move the guts of XFS_ERRORTAG_DELAY out of line
xfs: don't validate error tags in the I/O path
xfs: allocate m_errortag early
xfs: fix the errno sign for the xfs_errortag_{add,clearall} stubs
xfs: validate log record version against superblock log version
xfs: fix spacing style issues in xfs_alloc.c
xfs: remove xfs_zone_gc_space_available
xfs: use a seprate member to track space availabe in the GC scatch buffer
xfs: check for deleted cursors when revalidating two btrees
xfs: fix UAF in xchk_btree_check_block_owner
xfs: check return value of xchk_scrub_create_subord
xfs: only call xf{array,blob}_destroy if we have a valid pointer
xfs: get rid of the xchk_xfile_*_descr calls
...
Pull erofs updates from Gao Xiang:
"In this cycle, inode page cache sharing among filesystems on the same
machine is now supported, which is particularly useful for
high-density hosts running tens of thousands of containers.
In addition, we fully isolate the EROFS core on-disk format from other
optional encoded layouts since the core on-disk part is designed to be
simple, effective, and secure. Users can use the core format to build
unique golden immutable images and import their filesystem trees
directly from raw block devices via DMA, page-mapped DAX devices,
and/or file-backed mounts without having to worry about unnecessary
intrinsic consistency issues found in other generic filesystems by
design. However, the full vision is still working in progress and will
spend more time to achieve final goals.
There are other improvements and bug fixes as usual, as listed below:
- Support inode page cache sharing among filesystems
- Formally separate optional encoded (aka compressed) inode layouts
(and the implementations) from the EROFS core on-disk aligned plain
format for future zero-trust security usage
- Improve performance by caching the fact that an inode does not have
a POSIX ACL
- Improve LZ4 decompression error reporting
- Enable LZMA by default and promote DEFLATE and Zstandard algorithms
out of EXPERIMENTAL status
- Switch to inode_set_cached_link() to cache symlink lengths
- random bugfixes and minor cleanups"
* tag 'erofs-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (31 commits)
erofs: fix UAF issue for file-backed mounts w/ directio option
erofs: update compression algorithm status
erofs: fix inline data read failure for ztailpacking pclusters
erofs: avoid some unnecessary #ifdefs
erofs: handle end of filesystem properly for file-backed mounts
erofs: separate plain and compressed filesystems formally
erofs: use inode_set_cached_link()
erofs: mark inodes without acls in erofs_read_inode()
erofs: implement .fadvise for page cache share
erofs: support compressed inodes for page cache share
erofs: support unencoded inodes for page cache share
erofs: pass inode to trace_erofs_read_folio
erofs: introduce the page cache share feature
erofs: using domain_id in the safer way
erofs: add erofs_inode_set_aops helper to set the aops
erofs: support user-defined fingerprint name
erofs: decouple `struct erofs_anon_fs_type`
fs: Export alloc_empty_backing_file
erofs: tidy up erofs_init_inode_xattrs()
erofs: add missing documentation about `directio` mount option
...
Pull hfs/hfsplus updates from Viacheslav Dubeyko:
"This pull request contains several fixes of syzbot reported issues and
HFS+ fixes of xfstests failures.
- fix an issue reported by syzbot triggering BUG_ON() in the case of
corrupted superblock, replacing the BUG_ON()s with proper error
handling (Jori Koolstra)
- fix memory leaks in the mount logic of HFS/HFS+ file systems. When
HFS/HFS+ were converted to the new mount api a bug was introduced
by changing the allocation pattern of sb->s_fs_info (Mehdi Ben Hadj
Khelifa)
- fix hfs_bnode_create() by returning ERR_PTR(-EEXIST) instead of
the node pointer when it's already hashed. This avoids a double
unload_nls() on mount failure (suggested by Shardul Bankar)
- set inode's mode as regular file for system inodes (Tetsuo Handa)
The rest fix failures in generic/020, generic/037, generic/062,
generic/480, and generic/498 xfstests for the case of HFS+ file
system. Currently, only 30 xfstests' test-cases experience failures
for HFS+ file system (initially, it was around 100 failed xfstests)"
* tag 'hfs-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs:
hfsplus: avoid double unload_nls() on mount failure
hfsplus: fix warning issue in inode.c
hfsplus: fix generic/062 xfstests failure
hfsplus: fix generic/037 xfstests failure
hfsplus: pretend special inodes as regular files
hfsplus: return error when node already exists in hfs_bnode_create
hfs: Replace BUG_ON with error handling for CNID count checks
hfsplus: fix generic/020 xfstests failure
hfsplus: fix volume corruption issue for generic/498
hfsplus: fix volume corruption issue for generic/480
hfsplus: ensure sb->s_fs_info is always cleaned up
hfs: ensure sb->s_fs_info is always cleaned up
Pull nilfs2 updates from Viacheslav Dubeyko:
- Fix potential block overflow that cause system hang
When executing the FITRIM command, an underflow can occur in the
calculation of nblocks. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time
to process the bio chain, and the ns_segctor_sem lock remains held
for a long period.
This prevents other tasks from acquiring the ns_segctor_sem lock,
resulting in a hang reported by syzbot (Edward Adam Davis)
- Fix missing struct keywords in nilfs2_api.h kernel-doc (Ryusuke
Konishi)
- Convert nilfs_super_block to kernel-doc
Eliminate 40+ kernel-doc warnings in nilfs2_ondisk.h by converting
all of the struct member comments to kernel-doc comments (Randy
Dunlap)
* tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2:
nilfs2: fix missing struct keywords in nilfs2_api.h kernel-doc
nilfs2: convert nilfs_super_block to kernel-doc
nilfs2: Fix potential block overflow that cause system hang
Pull btrfs updates from David Sterba:
"User visible changes, feature updates:
- when using block size > page size, enable direct IO
- fallback to buffered IO if the data profile has duplication,
workaround to avoid checksum mismatches on block group profiles
with redundancy, real direct IO is possible on single or RAID0
- redo export of zoned statistics, moved from sysfs to
/proc/pid/mountstats due to size limitations of the former
Experimental features:
- remove offload checksum tunable, intended to find best way to do it
but since we've switched to offload to thread for everything we
don't need it anymore
- initial support for remap-tree feature, a translation layer of
logical block addresses that allow changes without moving/rewriting
blocks to do eg. relocation, or other changes that require COW
Notable fixes:
- automatic removal of accidentally leftover chunks when
free-space-tree is enabled since mkfs.btrfs v6.16.1
- zoned mode:
- do not try to append to conventional zones when RAID is mixing
zoned and conventional drives
- fixup write pointers when mixing zoned and conventional on
DUP/RAID* profiles
- when using squota, relax deletion rules for qgroups with 0 members
to allow easier recovery from accounting bugs, also add more checks
to detect bad accounting
- fix periodic reclaim scanning, properly check boundary conditions
not to trigger it unexpectedly or miss the time to run it
- trim:
- continue after first error
- change reporting to the first detected error
- add more cancellation points
- reduce contention of big device lock that can block other
operations when there's lots of trimmed space
- when chunk allocation is forced (needs experimental build) fix
transaction abort when unexpected space layout is detected
Core:
- switch to crypto library API for checksumming, removed module
dependencies, pointer indirections, etc.
- error handling improvements
- adjust how and where transaction commit or abort are done and are
maybe not necessary
- minor compression optimization to skip single block ranges
- improve how compression folios are handled
- new and updated selftests
- cleanups, refactoring:
- auto-freeing and other automatic variable cleanup conversion
- structure size optimizations
- condition annotations"
* tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits)
btrfs: get rid of compressed_bio::compressed_folios[]
btrfs: get rid of compressed_folios[] usage for encoded writes
btrfs: get rid of compressed_folios[] usage for compressed read
btrfs: remove the old btrfs_compress_folios() infrastructure
btrfs: switch to btrfs_compress_bio() interface for compressed writes
btrfs: introduce btrfs_compress_bio() helper
btrfs: zlib: introduce zlib_compress_bio() helper
btrfs: zstd: introduce zstd_compress_bio() helper
btrfs: lzo: introduce lzo_compress_bio() helper
btrfs: zoned: factor out the zone loading part into a testable function
btrfs: add cleanup function for btrfs_free_chunk_map
btrfs: tests: add cleanup functions for test specific functions
btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap
btrfs: tests: add unit tests for pending extent walking functions
btrfs: fix EEXIST abort due to non-consecutive gaps in chunk allocation
btrfs: fix transaction commit blocking during trim of unallocated space
btrfs: handle user interrupt properly in btrfs_trim_fs()
btrfs: preserve first error in btrfs_trim_fs()
btrfs: continue trimming remaining devices on failure
btrfs: do not BUG_ON() in btrfs_remove_block_group()
...
The 'max()' value of a 'long long' and an 'unsigned int' is problematic
if the former is negative:
In function 'fuse_wr_pages',
inlined from 'fuse_perform_write' at fs/fuse/file.c:1347:27:
include/linux/compiler_types.h:652:45: error: call to '__compiletime_assert_390' declared with attribute error: min(((pos + len - 1) >> 12) - (pos >> 12) + 1, max_pages) signedness error
652 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
Use a temporary variable to make it clearer what is going on here.
Fixes: 0f5bb0cfb0 ("fs: use min() or umin() instead of min_t()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull misc vfs updates from Christian Brauner:
"This contains a mix of VFS cleanups, performance improvements, API
fixes, documentation, and a deprecation notice.
Scalability and performance:
- Rework pid allocation to only take pidmap_lock once instead of
twice during alloc_pid(), improving thread creation/teardown
throughput by 10-16% depending on false-sharing luck. Pad the
namespace refcount to reduce false-sharing
- Track file lock presence via a flag in ->i_opflags instead of
reading ->i_flctx, avoiding false-sharing with ->i_readcount on
open/close hot paths. Measured 4-16% improvement on 24-core
open-in-a-loop benchmarks
- Use a consume fence in locks_inode_context() to match the
store-release/load-consume idiom, eliminating a hardware fence on
some architectures
- Annotate cdev_lock with __cacheline_aligned_in_smp to prevent
false-sharing
- Remove a redundant DCACHE_MANAGED_DENTRY check in
__follow_mount_rcu() that never fires since the caller already
verifies it, eliminating a 100% mispredicted branch
- Fix a 100% mispredicted likely() in devcgroup_inode_permission()
that became wrong after a prior code reorder
Bug fixes and correctness:
- Make insert_inode_locked() wait for inode destruction instead of
skipping, fixing a corner case where two matching inodes could
exist in the hash
- Move f_mode initialization before file_ref_init() in alloc_file()
to respect the SLAB_TYPESAFE_BY_RCU ordering contract
- Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with
no buffers attached, preventing a null pointer dereference when
AS_RELEASE_ALWAYS is set but no release_folio op exists
- Fix select restart_block to store end_time as timespec64, avoiding
truncation of tv_sec on 32-bit architectures
- Make dump_inode() use get_kernel_nofault() to safely access inode
and superblock fields, matching the dump_mapping() pattern
API modernization:
- Make posix_acl_to_xattr() allocate the buffer internally since
every single caller was doing it anyway. Reduces boilerplate and
unnecessary error checking across ~15 filesystems
- Replace deprecated simple_strtoul() with kstrtoul() for the
ihash_entries, dhash_entries, mhash_entries, and mphash_entries
boot parameters, adding proper error handling
- Convert chardev code to use guard(mutex) and __free(kfree) cleanup
patterns
- Replace min_t() with min() or umin() in VFS code to avoid silently
truncating unsigned long to unsigned int
- Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers
already check the flag
Deprecation:
- Begin deprecating legacy BSD process accounting (acct(2)). The
interface has numerous footguns and better alternatives exist
(eBPF)
Documentation:
- Fix and complete kernel-doc for struct export_operations, removing
duplicated documentation between ReST and source
- Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait()
Testing:
- Add a kunit test for initramfs cpio handling of entries with
filesize > PATH_MAX
Misc:
- Add missing <linux/init_task.h> include in fs_struct.c"
* tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
posix_acl: make posix_acl_to_xattr() alloc the buffer
fs: make insert_inode_locked() wait for inode destruction
initramfs_test: kunit test for cpio.filesize > PATH_MAX
fs: improve dump_inode() to safely access inode fields
fs: add <linux/init_task.h> for 'init_fs'
docs: exportfs: Use source code struct documentation
fs: move initializing f_mode before file_ref_init()
exportfs: Complete kernel-doc for struct export_operations
exportfs: Mark struct export_operations functions at kernel-doc
exportfs: Fix kernel-doc output for get_name()
acct(2): begin the deprecation of legacy BSD process accounting
device_cgroup: remove branch hint after code refactor
VFS: fix __start_dirop() kernel-doc warnings
fs: Describe @isnew parameter in ilookup5_nowait()
fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu
fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS
select: store end_time as timespec64 in restart block
chardev: Switch to guard(mutex) and __free(kfree)
namespace: Replace simple_strtoul with kstrtoul to parse boot params
dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries
...
Pull vfs iomap updates from Christian Brauner:
- Erofs page cache sharing preliminaries:
Plumb a void *private parameter through iomap_read_folio() and
iomap_readahead() into iomap_iter->private, matching iomap DIO. Erofs
uses this to replace a bogus kmap_to_page() call, as preparatory work
for page cache sharing.
- Fix for invalid folio access:
Fix an invalid folio access when a folio without iomap_folio_state
is fully submitted to the IO helper — the helper may call
folio_end_read() at any time, so ctx->cur_folio must be invalidated
after full submission.
* tag 'vfs-7.0-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iomap: fix invalid folio access after folio_end_read()
erofs: hold read context in iomap_iter if needed
iomap: stash iomap read ctx in the private field of iomap_iter
Pull vfs mount updates from Christian Brauner:
- statmount: accept fd as a parameter
Extend struct mnt_id_req with a file descriptor field and a new
STATMOUNT_BY_FD flag. When set, statmount() returns mount information
for the mount the fd resides on — including detached mounts
(unmounted via umount2(MNT_DETACH)).
For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID
mask bits are cleared since neither is meaningful. The capability
check is skipped for STATMOUNT_BY_FD since holding an fd already
implies prior access to the mount and equivalent information is
available through fstatfs() and /proc/pid/mountinfo without
privilege. Includes comprehensive selftests covering both attached
and detached mount cases.
- fs: Remove internal old mount API code (1 patch)
Now that every in-tree filesystem has been converted to the new
mount API, remove all the legacy shim code in fs_context.c that
handled unconverted filesystems. This deletes ~280 lines including
legacy_init_fs_context(), the legacy_fs_context struct, and
associated wrappers. The mount(2) syscall path for userspace remains
untouched. Documentation references to the legacy callbacks are
cleaned up.
- mount: add OPEN_TREE_NAMESPACE to open_tree()
Container runtimes currently use CLONE_NEWNS to copy the caller's
entire mount namespace — only to then pivot_root() and recursively
unmount everything they just copied. With large mount tables and
thousands of parallel container launches this creates significant
contention on the namespace semaphore.
OPEN_TREE_NAMESPACE copies only the specified mount tree (like
OPEN_TREE_CLONE) but returns a mount namespace fd instead of a
detached mount fd. The new namespace contains the copied tree mounted
on top of a clone of the real rootfs.
This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in a
single syscall. Works with user namespaces: an unshare(CLONE_NEWUSER)
followed by OPEN_TREE_NAMESPACE creates a mount namespace owned by
the new user namespace. Mount namespace file mounts are excluded from
the copy to prevent cycles. Includes ~1000 lines of selftests"
* tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/open_tree: add OPEN_TREE_NAMESPACE tests
mount: add OPEN_TREE_NAMESPACE
fs: Remove internal old mount API code
selftests: statmount: tests for STATMOUNT_BY_FD
statmount: accept fd as a parameter
statmount: permission check should return EPERM
Pull vfs atomic_open updates from Christian Brauner:
"Allow knfsd to use atomic_open()
While knfsd offers combined exclusive create and open results to
clients, on some filesystems those results are not atomic. The
separate vfs_create() + vfs_open() sequence in dentry_create() can
produce races and unexpected errors. For example, open O_CREAT with
mode 0 will succeed in creating the file but return -EACCES from
vfs_open(). Additionally, network filesystems benefit from reducing
remote round-trip operations by using a single atomic_open() call.
Teach dentry_create() -- whose sole caller is knfsd -- to use
atomic_open() for filesystems that support it"
* tag 'vfs-7.0-rc1.atomic_open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs/namei: fix kernel-doc markup for dentry_create
VFS/knfsd: Teach dentry_create() to use atomic_open()
VFS: Prepare atomic_open() for dentry_create()
VFS: move dentry_create() from fs/open.c to fs/namei.c
Pull vfs nullfs update from Christian Brauner:
"Add a completely catatonic minimal pseudo filesystem called "nullfs"
and make pivot_root() work in the initramfs.
Currently pivot_root() does not work on the real rootfs because it
cannot be unmounted. Userspace has to recursively delete initramfs
contents manually before continuing boot, using the fragile
switch_root sequence (overmount + chroot).
Add nullfs, a minimal immutable filesystem that serves as the true
root of the mount hierarchy. The mutable rootfs (tmpfs/ramfs) is
mounted on top of it. This allows userspace to simply:
chdir(new_root);
pivot_root(".", ".");
umount2(".", MNT_DETACH);
without the traditional switch_root workarounds. systemd already
handles this correctly. It tries pivot_root() first and falls back
to MS_MOVE only when that fails.
This also means rootfs mounts in unprivileged namespaces no longer
need MNT_LOCKED, since the immutable nullfs guarantees nothing can be
revealed by unmounting the covering mount.
nullfs is a single-instance filesystem (get_tree_single()) marked
SB_NOUSER | SB_I_NOEXEC | SB_I_NODEV with an immutable empty root
directory. This means sooner or later it can be used to overmount
other directories to hide their contents without any additional
protection needed.
We enable it unconditionally. If we see any real regression we'll
hide it behind a boot option.
nullfs has extensions beyond this in the future. It will serve as a
concept to support the creation of completely empty mount namespaces -
which is work coming up in the next cycle"
* tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: use nullfs unconditionally as the real rootfs
docs: mention nullfs
fs: add immutable rootfs
fs: add init_pivot_root()
fs: ensure that internal tmpfs mount gets mount id zero
Pull minix update from Christian Brauner:
"Consolidate and strengthen superblock validation in
minix_check_superblock()
The minix filesystem driver does not validate several superblock
fields before using them during mount, allowing a crafted filesystem
image to trigger out-of-bounds accesses (reported by syzbot)"
* tag 'vfs-7.0-rc1.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
minix: Add required sanity checking to minix_check_superblock()
Pull vfs updates for btrfs from Christian Brauner:
"This contains some changes for btrfs that are taken to the vfs tree to
stop duplicating VFS code for subvolume/snapshot dentry
Btrfs has carried private copies of the VFS may_delete() and
may_create() functions in fs/btrfs/ioctl.c for permission checks
during subvolume creation and snapshot destruction. These copies have
drifted out of sync with the VFS originals — btrfs_may_delete() is
missing the uid/gid validity check and btrfs_may_create() is missing
the audit_inode_child() call.
Export the VFS functions as may_{create,delete}_dentry() and switch
btrfs to use them, removing ~70 lines of duplicated code"
* tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
btrfs: use may_create_dentry() in btrfs_mksubvol()
btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy()
fs: export may_create() as may_create_dentry()
fs: export may_delete() as may_delete_dentry()
Pull vfs error reporting updates from Christian Brauner:
"This contains the changes to support generic I/O error reporting.
Filesystems currently have no standard mechanism for reporting
metadata corruption and file I/O errors to userspace via fsnotify.
Each filesystem (xfs, ext4, erofs, f2fs, etc.) privately defines
EFSCORRUPTED, and error reporting to fanotify is inconsistent or
absent entirely.
This introduces a generic fserror infrastructure built around struct
super_block that gives filesystems a standard way to queue metadata
and file I/O error reports for delivery to fsnotify.
Errors are queued via mempools and queue_work to avoid holding
filesystem locks in the notification path; unmount waits for pending
events to drain. A new super_operations::report_error callback lets
filesystem drivers respond to file I/O errors themselves (to be used
by an upcoming XFS self-healing patchset).
On the uapi side, EFSCORRUPTED and EUCLEAN are promoted from private
per-filesystem definitions to canonical errno.h values across all
architectures"
* tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ext4: convert to new fserror helpers
xfs: translate fsdax media errors into file "data lost" errors when convenient
xfs: report fs metadata errors via fsnotify
iomap: report file I/O errors to the VFS
fs: report filesystem and file I/O errors to fsnotify
uapi: promote EFSCORRUPTED and EUCLEAN to errno.h
Pull vfs lease updates from Christian Brauner:
"This contains updates for lease support to require filesystems to
explicitly opt-in to lease support
Currently kernel_setlease() falls through to generic_setlease() when a
a filesystem does not define ->setlease(), silently granting lease
support to every filesystem regardless of whether it is prepared for
it.
This is a poor default: most filesystems never intended to support
leases, and the silent fallthrough makes it impossible to distinguish
"supports leases" from "never thought about it".
This inverts the default. It adds explicit
.setlease = generic_setlease;
assignments to every in-tree filesystem that should retain lease
support, then changes kernel_setlease() to return -EINVAL when
->setlease is NULL.
With the new default in place, simple_nosetlease() is redundant and
is removed along with all references to it"
* tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
fuse: add setlease file operation
fs: remove simple_nosetlease()
filelock: default to returning -EINVAL when ->setlease operation is NULL
xfs: add setlease file operation
ufs: add setlease file operation
udf: add setlease file operation
tmpfs: add setlease file operation
squashfs: add setlease file operation
overlayfs: add setlease file operation
orangefs: add setlease file operation
ocfs2: add setlease file operation
ntfs3: add setlease file operation
nilfs2: add setlease file operation
jfs: add setlease file operation
jffs2: add setlease file operation
gfs2: add a setlease file operation
fat: add setlease file operation
f2fs: add setlease file operation
exfat: add setlease file operation
ext4: add setlease file operation
...
Pull vfs timestamp updates from Christian Brauner:
"This contains the changes to support non-blocking timestamp updates.
Since commit 66fa3cedf1 ("fs: Add async write file modification
handling") file_update_time_flags() unconditionally returns -EAGAIN
when any timestamp needs updating and IOCB_NOWAIT is set. This makes
non-blocking direct writes impossible on file systems with granular
enough timestamps, which in practice means all of them.
This reworks the timestamp update path to propagate IOCB_NOWAIT
through ->update_time so that file systems which can update timestamps
without blocking are no longer penalized.
With that groundwork in place, the core change passes IOCB_NOWAIT into
->update_time and returns -EAGAIN only when the file system indicates
it would block.
XFS implements non-blocking timestamp updates by using the new
->sync_lazytime and open-coding generic_update_time without the
S_NOWAIT check, since the lazytime path through the generic helpers
can never block in XFS"
* tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
xfs: enable non-blocking timestamp updates
xfs: implement ->sync_lazytime
fs: refactor file_update_time_flags
fs: add support for non-blocking timestamp updates
fs: add a ->sync_lazytime method
fs: factor out a sync_lazytime helper
fs: refactor ->update_time handling
fat: cleanup the flags for fat_truncate_time
nfs: split nfs_update_timestamps
fs: allow error returns from generic_update_time
fs: remove inode_update_time
Pull vfs initrd removal from Christian Brauner:
"Remove the deprecated linuxrc-based initrd code path and related dead
code. The linuxrc initrd path was deprecated in 2020 and this series
completes its removal. If we see real-life regressions we'll revert.
The core change removes handle_initrd() and init_linuxrc() — the
entire flow that ran /linuxrc from an initrd, pivoted roots, and
handed off to the real root filesystem. With that gone, initrd_load()
becomes void (no longer short-circuits prepare_namespace()),
rd_load_image() is simplified to always load /initrd.image instead of
taking a path, and rd_load_disk() is deleted.
The /proc/sys/kernel/real-root-dev sysctl and its backing variable are
removed since they only existed for linuxrc to communicate the real
root device back to the kernel.
The no-op load_ramdisk= and prompt_ramdisk= parameters are dropped,
and noinitrd and ramdisk_start= gain deprecation warnings.
Initramfs is entirely unaffected. The non-linuxrc initrd path
(root=/dev/ram0) is preserved but now carries a deprecation warning
targeting January 2027 removal"
* tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
init: remove /proc/sys/kernel/real-root-dev
initrd: remove deprecated code path (linuxrc)
init: remove deprecated "load_ramdisk" and "prompt_ramdisk" command line parameters
Pull vfs rust updates from Christian Brauner:
"Allow inlining C helpers into Rust when using LTO: Add the
__rust_helper annotation to all VFS-related Rust helper functions.
Currently, C helpers cannot be inlined into Rust code even under LTO
because LLVM detects slightly different codegen options between the C
and Rust compilation units (differing null-pointer-check flags,
builtin lists, and target feature strings). The __rust_helper macro is
the first step toward fixing this: it is currently #defined to
nothing, but a follow-up series will change it to __always_inline when
compiling with LTO (while keeping it empty for bindgen, which ignores
inline functions).
This picks up the VFS portion (fs, pid_namespace, poll) of a larger
tree-wide series"
* tag 'vfs-7.0-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
rust: poll: add __rust_helper to helpers
rust: pid_namespace: add __rust_helper to helpers
rust: fs: add __rust_helper to helpers
Pull selinux updates from Paul Moore:
- Add support for SELinux based access control of BPF tokens
We worked with the BPF devs to add the necessary LSM hooks when the
BPF token code was first introduced, but it took us a bit longer to
add the SELinux wiring and support.
In order to preserve existing token-unaware SELinux policies, the new
code is gated by the new "bpf_token_perms" policy capability.
Additional details regarding the new permissions, and behaviors can
be found in the associated commit.
- Remove a BUG() from the SELinux capability code
We now perform a similar check during compile time so we can safely
remove the BUG() call.
* tag 'selinux-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: drop the BUG() in cred_has_capability()
selinux: fix a capabilities parsing typo in selinux_bpf_token_capable()
selinux: add support for BPF token access control
selinux: move the selinux_blob_sizes struct
Pull lsm updates from Paul Moore:
- Unify the security_inode_listsecurity() calls in NFSv4
While looking at security_inode_listsecurity() with an eye towards
improving the interface, we realized that the NFSv4 code was making
multiple calls to the LSM hook that could be consolidated into one.
- Mark the LSM static branch keys as static - this helps resolve some
sparse warnings
- Add __rust_helper annotations to the LSM and cred wrapper functions
- Remove the unsused set_security_override_from_ctx() function
- Minor fixes to some of the LSM kdoc comment blocks
* tag 'lsm-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: make keys for static branch static
cred: remove unused set_security_override_from_ctx()
rust: security: add __rust_helper to helpers
rust: cred: add __rust_helper to helpers
nfs: unify security_inode_listsecurity() calls
lsm: fix kernel-doc struct member names
Pull audit updates from Paul Moore:
- Improve the NETFILTER_PKT audit records
Add source and destination ports to the NETFILTER_PKT audit records
while also consolidating a lot of the code into a new, singular
audit_log_nf_skb() function. This new approach to structuring the
NETFILTER_PKT record generation should eliminate some unnecessary
overhead when audit is not built into the kernel.
- Update the audit syscall classifier code
Add the listxattrat(), getxattrat(), and fchmodat2() syscall to the
audit code which classifies syscalls into categories of operations,
e.g. "read" or "change attributes".
- Move the syscall classifier declarations into audit_arch.h
Shuffle around some header file declarations to resolve some sparse
warnings.
* tag 'audit-pr-20260203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: move the compat_xxx_class[] extern declarations to audit_arch.h
audit: add missing syscalls to read class
audit: include source and destination ports to NETFILTER_PKT
audit: add audit_log_nf_skb helper function
audit: add fchmodat2() to change attributes class
Pull tpm updates from Jarkko Sakkinen.
* tag 'tpmdd-next-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: st33zp24: Fix missing cleanup on get_burstcount() error
tpm: tpm_i2c_infineon: Fix locality leak on get_burstcount() failure
Pull i3c updates from Alexandre Belloni:
"Subsystem:
- add sysfs entry and attribute for Device NACK Retry count
Drivers:
- dw: Device NACK Retry configuration knob
- mipi-i3c-hci: support multi-bus instances, runtime PM, and suspend
- renesas: suspend/resume support"
* tag 'i3c/for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: (52 commits)
i3c: dw-i3c-master: fix SIR reject bit mapping for dynamic addresses
i3c: dw-i3c-master: convert spinlock usage to scoped guards
i3c: dw: Fix memory leak in dw_i3c_master_i2c_xfers()
i3c: mipi-i3c-hci-pci: Add System Suspend support
i3c: mipi-i3c-hci: Add optional System Suspend support
i3c: master: Add i3c_master_do_daa_ext() for post-hibernation address recovery
i3c: dw: Initialize spinlock to avoid upsetting lockdep
i3c: mipi-i3c-hci-pci: Add Runtime PM support
i3c: mipi-i3c-hci: Add optional Runtime PM support
i3c: master: Introduce optional Runtime PM support
i3c: mipi-i3c-hci: Factor out master dynamic address setting into helper
i3c: mipi-i3c-hci: Allow core re-initialization for Runtime PM support
i3c: mipi-i3c-hci: Factor out core initialization into helper
i3c: mipi-i3c-hci: Factor out IO mode setting into helper
i3c: mipi-i3c-hci: Factor out software reset into helper
i3c: mipi-i3c-hci: Add PIO suspend and resume support
i3c: mipi-i3c-hci: Refactor PIO register initialization
i3c: mipi-i3c-hci: Add DMA suspend and resume support
i3c: mipi-i3c-hci: Extract ring initialization from hci_dma_init()
i3c: mipi-i3c-hci: Introduce helper to restore DAT
...
Pull RCU updates from Boqun Feng:
- RCU Tasks Trace:
Re-implement RCU tasks trace in term of SRCU-fast, not only more than
500 lines of code are saved because of the reimplementation, a new
set of API, rcu_read_{,un}lock_tasks_trace(), becomes possible as
well. Compared to the previous rcu_read_{,un}lock_trace(), the new
API avoid the task_struct accesses thanks to the SRCU-fast semantics.
As a result, the old rcu_read{,un}lock_trace() API is now deprecated.
- RCU Torture Test:
- Multiple improvements on kvm-series.sh (parallel run and
progress showing metrics)
- Add context checks to rcu_torture_timer()
- Make config2csv.sh properly handle comments in .boot files
- Include commit discription in testid.txt
- Miscellaneous RCU changes:
- Reduce synchronize_rcu() latency by reporting GP kthread's
CPU QS early
- Use suitable gfp_flags for the init_srcu_struct_nodes()
- Fix rcu_read_unlock() deadloop due to softirq
- Correctly compute probability to invoke ->exp_current()
in rcutorture
- Make expedited RCU CPU stall warnings detect stall-end races
- RCU nocb:
- Remove unnecessary WakeOvfIsDeferred wake path and callback
overload handling
- Extract nocb_defer_wakeup_cancel() helper
* tag 'rcu.release.v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (25 commits)
rcu/nocb: Extract nocb_defer_wakeup_cancel() helper
rcu/nocb: Remove dead callback overload handling
rcu/nocb: Remove unnecessary WakeOvfIsDeferred wake path
rcu: Reduce synchronize_rcu() latency by reporting GP kthread's CPU QS early
srcu: Use suitable gfp_flags for the init_srcu_struct_nodes()
rcu: Fix rcu_read_unlock() deadloop due to softirq
rcutorture: Correctly compute probability to invoke ->exp_current()
rcu: Make expedited RCU CPU stall warnings detect stall-end races
rcutorture: Add --kill-previous option to terminate previous kvm.sh runs
rcutorture: Prevent concurrent kvm.sh runs on same source tree
torture: Include commit discription in testid.txt
torture: Make config2csv.sh properly handle comments in .boot files
torture: Make kvm-series.sh give run numbers and totals
torture: Make kvm-series.sh give build numbers and totals
torture: Parallelize kvm-series.sh guest-OS execution
rcutorture: Add context checks to rcu_torture_timer()
rcutorture: Test rcu_tasks_trace_expedite_current()
srcu: Create an rcu_tasks_trace_expedite_current() function
checkpatch: Deprecate rcu_read_{,un}lock_trace()
rcu: Update Requirements.rst for RCU Tasks Trace
...
Pull kselftest updates from Shuah Khan:
"resctrl test:
- fix division by zero error on Hygon
- fix non-contiguous CBM check for Hygon
- define CPU vendor IDs as bits to match usage
- add CPU vendor detection for Hygon
misc:
- coredeump test: use __builtin_trap() instead of a null pointer
- anon_inode: replace null pointers with empty arrays
- kublk: include message in _Static_assert for C11 compatibility
- run_kselftest.sh: add `--skip` argument option
- pidfd: fix typo in comment"
* tag 'linux_kselftest-next-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/pidfd: fix typo in comment
selftests/run_kselftest.sh: Add `--skip` argument option
selftests/resctrl: Fix non-contiguous CBM check for Hygon
selftests/resctrl: Add CPU vendor detection for Hygon
selftests/resctrl: Define CPU vendor IDs as bits to match usage
selftests/resctrl: Fix a division by zero error on Hygon
kselftest/kublk: include message in _Static_assert for C11 compatibility
kselftest/anon_inode: replace null pointers with empty arrays
kselftest/coredump: use __builtin_trap() instead of null pointer
Pull kunit updates from Shuah Khan:
"kunit:
- add __rust_helper to helpers
- fix up const mismatch in many assert functions
- fix up const mismatch in test_list_sort
- protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
- respect KBUILD_OUTPUT env variable by default
- add bash completion
kunit tool:
- add test for nested test result reporting
- do not overwrite test status based on subtest counts
- add 32-bit big endian ARM configuration to qemu_configs
- rename test_data_path() to _test_data_path()
- do not rely on implicit working directory change"
* tag 'linux_kselftest-kunit-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: add bash completion
kunit: tool: test: Don't rely on implicit working directory change
kunit: tool: test: Rename test_data_path() to _test_data_path()
kunit: qemu_configs: Add 32-bit big endian ARM configuration
kunit: tool: Don't overwrite test status based on subtest counts
kunit: tool: Add test for nested test result reporting
kunit: respect KBUILD_OUTPUT env variable by default
kunit: Protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
test_list_sort: fix up const mismatch
kunit: fix up const mis-match in many assert functions
rust: kunit: add __rust_helper to helpers
Pull auxdisplay updates from Andy Shevchenko:
- A good refactoring of arm-charlcd to use modern kernel APIs
* tag 'auxdisplay-v6.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
auxdisplay: max6959: Replace slab.h with device/devres.h
auxdisplay: arm-charlcd: Remove redundant ternary operators
auxdisplay: arm-charlcd: Join string literals back
auxdisplay: arm-charlcd: Use readl_poll_timeout
auxdisplay: arm-charlcd: Don't use "proxy" headers
auxdisplay: arm-charlcd: drop of_match_ptr()
auxdisplay: arm-charlcd: Remove unneeded info message
auxdisplay: arm-charlcd: convert to use device managed APIs
auxdisplay: arm-charlcd: fix release_mem_region() size
Pull i2c fix from Wolfram Sang:
- imx: preserve error state during SMBus block read length handling
* tag 'i2c-for-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: preserve error state in block data length handler
Pull spi fixes from Mark Brown:
"One final batch of fixes for the Tegra SPI drivers, the main one is a
batch of fixes for races with the interrupts in the Tegra210 QSPI
driver that Breno has been working on for a while"
* tag 'spi-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: tegra114: Preserve SPI mode bits in def_command1_reg
spi: tegra: Fix a memory leak in tegra_slink_probe()
spi: tegra210-quad: Protect curr_xfer check in IRQ handler
spi: tegra210-quad: Protect curr_xfer clearing in tegra_qspi_non_combined_seq_xfer
spi: tegra210-quad: Protect curr_xfer in tegra_qspi_combined_seq_xfer
spi: tegra210-quad: Protect curr_xfer assignment in tegra_qspi_setup_transfer_one
spi: tegra210-quad: Move curr_xfer read inside spinlock
spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed transfer
Pull regulator fix from Mark Brown:
"One last fix for v6.19: the voltages for the SpaceMIT P1 were not
described correctly"
* tag 'regulator-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: spacemit-p1: Fix n_voltages for BUCK and LDO regulators
Pull binder fixes from Greg KH:
"Here are some small, last-minute binder C and Rust driver fixes for
reported issues. They include a number of fixes for reported crashes
and other problems.
All of these have been in linux-next this week, and longer"
* tag 'char-misc-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
binderfs: fix ida_alloc_max() upper bound
rust_binderfs: fix ida_alloc_max() upper bound
binder: fix BR_FROZEN_REPLY error log
rust_binder: add additional alignment checks
binder: fix UAF in binder_netlink_report()
rust_binder: correctly handle FDA objects of length zero
Pull scheduler fixes from Ingo Molnar:
"Miscellaneous MMCID fixes to address bugs and performance regressions
in the recent rewrite of the SCHED_MM_CID management code:
- Fix livelock triggered by BPF CI testing
- Fix hard lockup on weakly ordered systems
- Simplify the dropping of CIDs in the exit path by removing an
unintended transition phase
- Fix performance/scalability regression on a thread-pool benchmark
by optimizing transitional CIDs when scheduling out"
* tag 'sched-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/mmcid: Optimize transitional CIDs when scheduling out
sched/mmcid: Drop per CPU CID immediately when switching to per task mode
sched/mmcid: Protect transition on weakly ordered systems
sched/mmcid: Prevent live lock on task to CPU mode transition
Pull objtool fixes from Ingo Molnar::
- Bump up the Clang minimum version requirements for livepatch
builds, due to Clang assembler section handling bugs causing
silent miscompilations
- Strip livepatching symbol artifacts from non-livepatch modules
- Fix livepatch build warnings when certain Clang LTO options
are enabled
- Fix livepatch build error when CONFIG_MEM_ALLOC_PROFILING_DEBUG=y
* tag 'objtool-urgent-2026-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool/klp: Fix unexported static call key access for manually built livepatch modules
objtool/klp: Fix symbol correlation for orphaned local symbols
livepatch: Free klp_{object,func}_ext data after initialization
livepatch: Fix having __klp_objects relics in non-livepatch modules
livepatch/klp-build: Require Clang assembler >= 20