Update lazy counters in xfs_growfs_rt_bmblock() similar to the way it
is done xfs_growfs_data_private(). This is because the lazy counters are
not always updated and synching the counters will avoid inconsistencies
between frexents and rtextents(total realtime extent count). This will
be more useful once realtime shrink is implemented as this will prevent
some transient state to occur where frexents might be greater than
total rtextents.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Add a comment explaining why the sb_frextents are updated outside the
if (xfs_has_lazycount(mp) check even though it is a lazycounter.
RT groups are supported only in v5 filesystems which always have
lazycounter enabled - so putting it inside the if(xfs_has_lazycount(mp)
check is redundant.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Bug description:
If the size of the last rtgroup i.e, the rtg passed to
xfs_last_rt_bmblock() is such that the last rtextent falls in 0th word
offset of a bmblock of the bitmap file tracking this (last) rtgroup,
then in that case xfs_last_rt_bmblock() incorrectly returns the next
bmblock number instead of the current/last used bmblock number.
When xfs_last_rt_bmblock() incorrectly returns the next bmblock,
the loop to grow/modify the bmblocks in xfs_growfs_rtg() doesn't
execute and xfs_growfs basically does a nop in certain cases.
xfs_growfs will do a nop when the new size of the fs will have the same
number of rtgroups i.e, we are only growing the last rtgroup.
Reproduce:
$ mkfs.xfs -m metadir=0 -r rtdev=/dev/loop1 /dev/loop0 \
-r size=32769b -f
$ mount -o rtdev=/dev/loop1 /dev/loop0 /mnt/scratch
$ xfs_growfs -R $(( 32769 + 1 )) /mnt/scratch
$ xfs_info /mnt/scratch | grep rtextents
$ # We can see that rtextents hasn't changed
Fix:
Fix this by returning the current/last used bmblock when the last
rtgroup size is not a multiple xfs_rtbitmap_rtx_per_rbmblock()
and the next bmblock when the rtgroup size is a multiple of
xfs_rtbitmap_rtx_per_rbmblock() i.e, the existing blocks are
completely used up.
Also, I have renamed xfs_last_rt_bmblock() to
xfs_last_rt_bmblock_to_extend() to signify that this function
returns the bmblock number to extend and NOT always the last used
bmblock number.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Sam Sun apparently found a syzbot way to fuzz a filesystem such that
xfs_iget_cache_miss would free the inode before the fserror code could
catch up. Frustratingly he doesn't use the syzbot dashboard so there's
no C reproducer and not even a full error report, so I'm guessing that:
Inodes that are being constructed or torn down inside XFS are not
visible to the VFS. They should never be reported to fserror.
Also, any inode that has been freshly allocated in _cache_miss should be
marked INEW immediately because, well, it's an incompletely constructed
inode that isn't yet visible to the VFS.
Reported-by: Sam Sun <samsun1006219@gmail.com>
Fixes: 5eb4cb18e4 ("xfs: convey metadata health events to the health monitor")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Internal metadata inodes are not exposed to userspace programs, so it
makes no sense to pass them to the fserror functions (aka fsnotify).
Instead, report metadata file problems as general filesystem corruption.
Fixes: 5eb4cb18e4 ("xfs: convey metadata health events to the health monitor")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Pankaj Raghav asks about this code in xfs_healthmon_get:
hm = mp->m_healthmon;
if (hm && !refcount_inc_not_zero(&hm->ref))
hm = NULL;
rcu_read_unlock();
return hm;
(slightly edited to compress a mailing list thread)
"Nit: Should we do a READ_ONCE(mp->m_healthmon) here to avoid any
compiler tricks that can result in an undefined behaviour? I am not sure
if I am being paranoid here.
"So this is my understanding: RCU guarantees that we get a valid object
(actual data of m_healthmon) but does not guarantee the compiler will
not reread the pointer between checking if hm is !NULL and accessing the
pointer as we are doing it lockless.
"So just a barrier() call in rcu_read_lock is enough to make sure this
doesn't happen and probably adding a READ_ONCE() is not needed?"
After some initial confusion I concluded that he's correct. The
compiler could very well eliminate the hm variable in favor of walking
the pointers twice, turning the code into:
if (mp->m_healthmon && !refcount_inc_not_zero(&mp->m_healthmon->ref))
If this happens, then xfs_healthmon_detach can sneak in between the
two sides of the && expression and set mp->m_healthmon to NULL, and
thereby cause a null pointer dereference crash. Fix this by using the
rcu pointer assignment and dereference functions, which ensure that the
proper reordering barriers are in place.
Practically speaking, gcc seems to allocate an actual variable for hm
and only reads mp->m_healthmon once (as intended), but we ought to be
more explicit about requiring this.
Reported-by: Pankaj Raghav <pankaj.raghav@linux.dev>
Fixes: a48373e7d3 ("xfs: start creating infrastructure for health monitoring")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chris Mason reports that his AI tools noticed that we were using
xfs_perag_put and xfs_group_put to release the group reference returned
by xfs_group_next_range. However, the iterator function returns an
object with an active refcount, which means that we must use the correct
function to release the active refcount, which is _rele.
Cc: <stable@vger.kernel.org> # v6.0
Fixes: 6f643c57d5 ("xfs: implement ->notify_failure() for XFS")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Chris Mason reports that his AI tools noticed that we were using
xfs_perag_put and xfs_group_put to release the group reference returned
by xfs_group_next_range. However, the iterator function returns an
object with an active refcount, which means that we must use the correct
function to release the active refcount, which is _rele.
Fixes: b8accfd65d ("xfs: add media verification ioctl")
Reported-by: Chris Mason <clm@meta.com>
Link: https://lore.kernel.org/linux-xfs/20260206030527.2506821-1-clm@meta.com/
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
The function try_lookup_noperm() can return an error pointer and is not
checked for one.
Add checks for error pointer in xrep_adoption_check_dcache() and
xrep_adoption_zap_dcache().
Detected by Smatch:
fs/xfs/scrub/orphanage.c:449 xrep_adoption_check_dcache() error:
'd_child' dereferencing possible ERR_PTR()
fs/xfs/scrub/orphanage.c:485 xrep_adoption_zap_dcache() error:
'd_child' dereferencing possible ERR_PTR()
Fixes: 73597e3e42 ("xfs: ensure dentry consistency when the orphanage adopts a file")
Cc: stable@vger.kernel.org # v6.16
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
The active inode (or active vnode until recently) stat can get much larger
than expected on file systems with a lot of metafile inodes like zoned
file systems on SMR hard disks with 10.000s of rtg rmap inodes.
Remove all metafile inodes from the active counter to make it more useful
to track actual workloads and add a separate counter for active metafile
inodes.
This fixes xfs/177 on SMR hard drives.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Most of them are unused, so mark them as such. Give the remaining ones
names that match their use instead of the historic IRIX ones based on
vnodes. Note that the names are purely internal to the XFS code, the
user interface is based on section names and arrays of counters.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Fixup some code alignment issues in xfs_ondisk.c
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Use the already existing rtg_group() wrapper instead of directly
accessing the struct xfs_group member in struct xfs_rtgroup.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
[cem: Conflict resolution against 06873dbd94]
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Introduce xfs_growfs_compute_delta() to calculate the nagcount
and delta blocks and refactor the code from xfs_growfs_data_private().
No functional changes.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Replace ASSERT(sum > 0) with an XFS_IS_CORRUPT() and place it just
after the call to xfs_rtget_summary() so that we don't end up using
an illegal value of sum.
Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Pull fsverity fixes from Eric Biggers:
- Fix a build error on parisc
- Remove the non-large-folio-aware function fsverity_verify_page()
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: fix build error by adding fsverity_readahead() stub
fsverity: remove fsverity_verify_page()
f2fs: make f2fs_verify_cluster() partially large-folio-aware
f2fs: remove unnecessary ClearPageUptodate in f2fs_verify_cluster()
Pull crypto library fix from Eric Biggers:
"Fix a big endian specific issue in the PPC64-optimized AES code"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: powerpc/aes: Fix rndkey_from_vsx() on big endian CPUs
Stephen retired and stepped back from -next maintainership, update his
entry in CREDITS to recognise his 18 years of hard work making it what
it is today and all the impact it's had on our development process.
Also update to his current GnuPG key while we're here.
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The x509 public key code gained a dependency on the sha256 hash
implementation, causing a rare link time failure in randconfig
builds:
arm-linux-gnueabi-ld: crypto/asymmetric_keys/x509_public_key.o: in function `x509_get_sig_params':
x509_public_key.c:(.text.x509_get_sig_params+0x12): undefined reference to `sha256'
arm-linux-gnueabi-ld: (sha256): Unknown destination type (ARM/Thumb) in crypto/asymmetric_keys/x509_public_key.o
x509_public_key.c:(.text.x509_get_sig_params+0x12): dangerous relocation: unsupported relocation
Select the necessary library code from Kconfig.
Fixes: 2c62068ac8 ("x509: Separately calculate sha256 for blacklist")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Align to the commit bf4afc53b7 ("Convert 'alloc_obj' family to use the
new default GFP_KERNEL argument") update the 'kmalloc_obj' declaration
for userspace to fix below compile error:
In file included from arch/arm/boot/compressed/../../../../lib/decompress_unxz.c:241,
from arch/arm/boot/compressed/decompress.c:56:
arch/arm/boot/compressed/../../../../lib/xz/xz_dec_stream.c: In function 'xz_dec_init':
arch/arm/boot/compressed/../../../../lib/xz/xz_dec_stream.c:787:28: error: implicit declaration of function 'kmalloc_obj'; did you mean 'kmalloc'? [-Wimplicit-function-declaration]
787 | struct xz_dec *s = kmalloc_obj(*s);
| ^~~~~~~~~~~
| kmalloc
Signed-off-by: Haiyue Wang <haiyuewa@163.com>
Fixes: 69050f8d6d ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types")
Fixes: bf4afc53b7 ("Convert 'alloc_obj' family to use the new default GFP_KERNEL argument")
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull RTC updates from Alexandre Belloni:
- loongson: Loongson-2K0300 support
- s35390a: nvmem support
- zynqmp: rework calibration
* tag 'rtc-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: ds1390: fix number of bytes read from RTC
rtc: class: Remove duplicate check for alarm
rtc: optee: simplify OP-TEE context match
rtc: interface: Alarm race handling should not discard preceding error
rtc: s35390a: implement nvmem support
rtc: loongson: Add Loongson-2K0300 support
dt-bindings: rtc: loongson: Document Loongson-2K0300 compatible
dt-bindings: rtc: loongson: Correct Loongson-1C interrupts property
dt-bindings: rtc: renesas,rz-rtca3: Add RZ/V2N support
dt-bindings: rtc: cpcap: convert to schema
rtc: zynqmp: use dynamic max and min offset ranges
rtc: zynqmp: rework set_offset
rtc: zynqmp: rework read_offset
rtc: zynqmp: check calibration max value
rtc: zynqmp: correct frequency value
rtc: amlogic-a4: Remove IRQF_ONESHOT
rtc: pcf8563: use correct of_node for output clock
rtc: max31335: use correct CONFIG symbol in IS_REACHABLE()
rtc: nvvrs: Add ARCH_TEGRA to the NV VRS RTC driver
Pull rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Pass '-Zunstable-options' flag required by the future Rust 1.95.0
- Fix 'objtool' warning for Rust 1.84.0
'kernel' crate:
- 'irq' module: add missing bound detected by the future Rust 1.95.0
- 'list' module: add missing 'unsafe' blocks and placeholder safety
comments to macros (an issue for future callers within the crate)
'pin-init' crate:
- Clean Clippy warning that changed behavior in the future Rust
1.95.0"
* tag 'rust-fixes-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
rust: list: Add unsafe blocks for container_of and safety comments
rust: pin-init: replace clippy `expect` with `allow`
rust: irq: add `'static` bounds to irq callbacks
objtool/rust: add one more `noreturn` Rust function
rust: kbuild: pass `-Zunstable-options` for Rust 1.95.0
Pull runtime verifier fix from Steven Rostedt:
- Fix multiple definition of __pcpu_unique_da_mon_this
After refactoring monitors, we used static per-cpu variables with the
same names across different per-cpu monitors. This is explicitly
disallowed for modules on some architectures (alpha) or if
CONFIG_DEBUG_FORCE_WEAK_PER_CPU is enabled (e.g. Fedora's debug
kernel). Make sure all those variables have different names to avoid
compilation issues.
* tag 'trace-rv-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rv: Fix multiple definition of __pcpu_unique_da_mon_this
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.
As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Most simple allocations use GFP_KERNEL, and with the new allocation
helpers being introduced, let's just take advantage of that to simplify
that default case.
It's a numbers game:
git grep 'alloc_obj(' |
sed 's/.*\(GFP_[_A-Z]*\).*/\1/' |
sort | uniq -c | sort -n | tail
shows that about 90% of all those new allocator instances just use that
standard GFP_KERNEL.
Those helpers are already macros, and we can easily just make it be the
default case when the gfp argument is missing.
And yes, we could do that for all the legacy interfaces too, but let's
keep it to just the new ones at least for now, since those all got
converted recently anyway, so this is not any "extra" noise outside of
that limited conversion.
And, in fact, I want to do this before doing the -rc1 release, exactly
so that we don't get extra merge conflicts.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 69050f8d6d ("treewide: Replace kmalloc with kmalloc_obj for
non-scalar types") started using the new allocation helpers, and in the
process showed that they were completely non-working.
The overflow logic in overflows_flex_counter_type() is completely the
wrong way around, and that broke __alloc_flex() completely. By chance,
the resulting code was then such a mess that clang generated
sufficiently garbage code that objtool warned about it all. Which made
it somewhat quicker to narrow things down.
While fixing overflows_flex_counter_type() would presumably fix this
all, I'm excising the whole broken overflow logic from __alloc_flex(),
because we don't want that kind of code in basic allocation functions
anyway.
That (no longer) broken overflows_flex_counter_type() thing needs to be
inserted into the actual __set_flex_counter() logic in the unlikely case
that we ever want this at all. And made conditional.
Fixes: 81cee9166a ("compiler_types: Introduce __flex_counter() and family")
Fixes: 69050f8d6d ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types")
Cc: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/all/CAHk-=whEd020BYzGTzYrENjD9Z5_82xx6h8HsQvH5xDSnv0=Hw@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull kmalloc_obj conversion from Kees Cook:
"This does the tree-wide conversion to kmalloc_obj() and friends using
coccinelle, with a subsequent small manual cleanup of whitespace
alignment that coccinelle does not handle.
This uncovered a clang bug in __builtin_counted_by_ref(), so the
conversion is preceded by disabling that for current versions of
clang. The imminent clang 22.1 release has the fix.
I've done allmodconfig build tests for x86_64, arm64, i386, and arm. I
did defconfig builds for alpha, m68k, mips, parisc, powerpc, riscv,
s390, sparc, sh, arc, csky, xtensa, hexagon, and openrisc"
* tag 'kmalloc_obj-treewide-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
kmalloc_obj: Clean up after treewide replacements
treewide: Replace kmalloc with kmalloc_obj for non-scalar types
compiler_types: Disable __builtin_counted_by_ref for Clang
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Introduce 'perf sched stats' tool with record/report/diff workflows
using schedstat counters
- Add a faster libdw based addr2line implementation and allow selecting
it or its alternatives via 'perf config addr2line.style='
- Data-type profiling fixes and improvements including the ability to
select fields using 'perf report''s -F/-fields, e.g.:
'perf report --fields overhead,type'
- Add 'perf test' regression tests for Data-type profiling with C and
Rust workloads
- Fix srcline printing with inlines in callchains, make sure this has
coverage in 'perf test'
- Fix printing of leaf IP in LBR callchains
- Fix display of metrics without sufficient permission in 'perf stat'
- Print all machines in 'perf kvm report -vvv', not just the host
- Switch from SHA-1 to BLAKE2s for build ID generation, remove SHA-1
code
- Fix 'perf report's histogram entry collapsing with '-F' option
- Use system's cacheline size instead of a hardcoded value in 'perf
report'
- Allow filtering conversion by time range in 'perf data'
- Cover conversion to CTF using 'perf data' in 'perf test'
- Address newer glibc const-correctness (-Werror=discarded-qualifiers)
issues
- Fixes and improvements for ARM's CoreSight support, simplify ARM SPE
event config in 'perf mem', update docs for 'perf c2c' including the
ARM events it can be used with
- Build support for generating metrics from arch specific python
script, add extra AMD, Intel, ARM64 metrics using it
- Add AMD Zen 6 events and metrics
- Add JSON file with OpenHW Risc-V CVA6 hardware counters
- Add 'perf kvm' stats live testing
- Add more 'perf stat' tests to 'perf test'
- Fix segfault in `perf lock contention -b/--use-bpf`
- Fix various 'perf test' cases for s390
- Build system cleanups, bump minimum shellcheck version to 0.7.2
- Support building the capstone based annotation routines as a plugin
- Allow passing extra Clang flags via EXTRA_BPF_FLAGS
* tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (255 commits)
perf test script: Add python script testing support
perf test script: Add perl script testing support
perf script: Allow the generated script to be a path
perf test: perf data --to-ctf testing
perf test: Test pipe mode with data conversion --to-json
perf json: Pipe mode --to-ctf support
perf json: Pipe mode --to-json support
perf check: Add libbabeltrace to the listed features
perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS
perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing
tools build: Fix feature test for rust compiler
perf libunwind: Fix calls to thread__e_machine()
perf stat: Add no-affinity flag
perf evlist: Reduce affinity use and move into iterator, fix no affinity
perf evlist: Missing TPEBS close in evlist__close()
perf evlist: Special map propagation for tool events that read on 1 CPU
perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel
Revert "perf tool_pmu: More accurately set the cpus for tool events"
tools build: Emit dependencies file for test-rust.bin
tools build: Make test-rust.bin be removed by the 'clean' target
...
Pull coccinelle updates from Julia Lawall:
"This simplifies and clarifies the handling of output generated by
Coccinelle that is sent to standard error.
By default, this goes to /dev/null. Remind the user of that and
encourage them to provide another file name (Benjamin Philip)"
* tag 'cocci-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
Documentation: Coccinelle: document debug log handling
scripts: coccicheck: warn on unset debug file
scripts: coccicheck: simplify debug file handling
Pull NTB (PCIe non-transparent bridge) updates from Jon Mason:
"NTB updates include debugfs improvements, correctness fixes, cleanups,
and new hardware support:
ntb_transport QP stats are converted to seq_file, a tx_memcpy_offload
module parameter is introduced with associated ordering fixes, and a
debugfs queue name truncation bug is corrected.
Additional fixes address format specifier mismatches in ntb_tool and
boundary conditions in the Switchtec driver, while unused MSI helpers
are removed and the codebase migrates to dma_map_phys().
Intel Gen6 (Diamond Rapids) NTB support is also added"
* tag 'ntb-7.0' of https://github.com/jonmason/ntb:
NTB: ntb_transport: Use seq_file for QP stats debugfs
NTB: ntb_transport: Fix too small buffer for debugfs_name
ntb/ntb_tool: correct sscanf format for u64 and size_t in tool_peer_mw_trans_write
ntb: intel: Add Intel Gen6 NTB support for DiamondRapids
NTB/msi: Remove unused functions
ntb: ntb_hw_switchtec: Increase MAX_MWS limit to 256
ntb: ntb_hw_switchtec: Fix array-index-out-of-bounds access
ntb: ntb_hw_switchtec: Fix shift-out-of-bounds for 0 mw lut
NTB: epf: allow built-in build
ntb: migrate to dma_map_phys instead of map_page
NTB: ntb_transport: Add 'tx_memcpy_offload' module option
NTB: ntb_transport: Remove unused 'retries' field from ntb_queue_entry
Pull io_uring fixes from Jens Axboe:
- A fix for a missing URING_CMD128 opcode check, fixing an issue with
the SQE mixed mode support introduced in 6.19. Merged late due to
having multiple dependencies
- Add sqe->cmd size checking for big SQEs, similar to what we have for
normal sized SQEs
- Fix a race condition in zcrx, that leads to a double free
* tag 'io_uring-20260221' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: Add size check for sqe->cmd
io_uring: add IORING_OP_URING_CMD128 to opcode checks
io_uring/zcrx: fix user_ref race between scrub and refill paths
Pull memblock fix from Mike Rapoport:
"Fix detection of NUMA node for CXL windows
phys_to_target_node() may assign a CXL Fixed Memory Window to the
wrong NUMA node when a CXL node resides in the gap of discontinuous
System RAM node.
Fix this by checking both numa_meminfo and numa_reserved_meminfo,
preferring the reserved NID when the address appears in both"
* tag 'fixes-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
mm: numa_memblks: Identify the accurate NUMA ID of CFMW
Pull sched_ext fixes from Tejun Heo:
- Various bug fixes for the example schedulers and selftests
* tag 'sched_ext-for-7.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
tools/sched_ext: fix getopt not re-parsed on restart
tools/sched_ext: scx_userland: fix data races on shared counters
tools/sched_ext: scx_pair: fix stride == 0 crash on single-CPU systems
tools/sched_ext: scx_central: fix CPU_SET and skeleton leak on early exit
tools/sched_ext: scx_userland: fix stale data on restart
tools/sched_ext: scx_flatcg: fix potential stack overflow from VLA in fcg_read_stats
selftests/sched_ext: Fix rt_stall flaky failure
tools/sched_ext: scx_userland: fix restart and stats thread lifecycle bugs
tools/sched_ext: scx_central: fix sched_setaffinity() call with the set size
tools/sched_ext: scx_flatcg: zero-initialize stats counter array
Pull smb server fixes from Steve French:
"Two small fixes:
- fix potential deadlock
- minor cleanup"
* tag 'v7.0-rc-part2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: call ksmbd_vfs_kern_path_end_removing() on some error paths
smb: server: Remove duplicate include of misc.h
The current debug documentation does not mention that logs are printed
to stdout unless DEBUG_FILE is set. It also doesn't mention that
Coccinelle cannot overwrite debug files.
Document this behaviour in the examples and reference it in the
debugging section.
Signed-off-by: Benjamin Philip <benjamin.philip495@gmail.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
coccicheck prints debug logs to stdout unless a debug file has been set.
This makes it hard to read coccinelle's suggested changes, especially
for someone new to coccicheck.
From this commit, we warn about this behaviour from within the script on
an unset debug file. Explicitly setting the debug file to /dev/null
suppresses the warning while keeping the default.
Signed-off-by: Benjamin Philip <benjamin.philip495@gmail.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
This commit separates handling unset files and pre-existing files. It
also eliminates a duplicated check for unset files in run_cmd_parmap().
Signed-off-by: Benjamin Philip <benjamin.philip495@gmail.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
Unfortunately, there is a corner case of __builtin_counted_by_ref()
usage that crashes[1] Clang since support was introduced in Clang 19.
Disable it prior to Clang 22. Found while tested kmalloc_obj treewide
refactoring (via kmalloc_flex() usage).
Link: https://github.com/llvm/llvm-project/issues/182575 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
After goto restart, optind retains its advanced position from the
previous getopt loop, causing getopt() to immediately return -1.
This silently drops all command-line options on the restarted skeleton.
Reset optind to 1 at the restart label so options are re-parsed.
Affected schedulers: scx_simple, scx_central, scx_flatcg, scx_pair,
scx_sdt, scx_cpu0.
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The stats thread reads nr_vruntime_enqueues, nr_vruntime_dispatches,
nr_vruntime_failed, and nr_curr_enqueued concurrently with the main
thread writing them, with no synchronization.
Use __atomic builtins with relaxed ordering for all accesses to these
counters to eliminate the data races.
Only display accuracy is affected, not scheduling correctness.
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Pull spi fixes from Mark Brown:
"There's a relatively large but ultimately simple fix for spidev here
which addresses some ABBA races by simplifying down to just using a
single lock, it's not clear to me that there was ever any benefit in
having the two separate locks in the first place.
We also have simple missing error check fix in in the wpcm-fiu driver"
* tag 'spi-fix-v7.0-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spidev: fix lock inversion between spi_lock and buf_lock
spi: wpcm-fiu: Fix potential NULL pointer dereference in wpcm_fiu_probe()
Pull regulator fixes from Mark Brown:
"A few driver specific fixes, plus a patch from Bjorn which removes a
fixed limit on regulator names that was breaking some Qualcomm
systems"
* tag 'regulator-fix-v7.0-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: s2mps11: fix pctrlsel macro usage in s2mpg10_of_parse_cb()
regulator: s2mps11: drop redundant sanity checks in s2mpg10_of_parse_cb()
regulator: core: Remove regulator supply_name length limit
regulator: mt6363: Fix interrmittent timeout