Commit Graph

1367550 Commits

Author SHA1 Message Date
Kees Cook
94fd44648d fortify: Fix incorrect reporting of read buffer size
When FORTIFY_SOURCE reports about a run-time buffer overread, the wrong
buffer size was being shown in the error message. (The bounds checking
was correct.)

Fixes: 3d965b33e4 ("fortify: Improve buffer overflow reporting")
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20250729231817.work.023-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-29 17:19:29 -07:00
Kees Cook
fc525d625a kstack_erase: Fix missed export of renamed KSTACK_ERASE_CFLAGS
Certain targets disable kstack_erase by filtering out KSTACK_ERASE_CFLAGS
rather than adding DISABLE_KSTACK_ERASE. The renaming to kstack_erase
missed the CFLAGS export, which broke those build targets (e.g. x86
vdso32).

Fixes: 76261fc7d1 ("stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS")
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-29 17:17:46 -07:00
Kees Cook
ee4cf79820 staging: media: atomisp: Fix stack buffer overflow in gmin_get_var_int()
When gmin_get_config_var() calls efi.get_variable() and the EFI variable
is larger than the expected buffer size, two behaviors combine to create
a stack buffer overflow:

1. gmin_get_config_var() does not return the proper error code when
   efi.get_variable() fails. It returns the stale 'ret' value from
   earlier operations instead of indicating the EFI failure.

2. When efi.get_variable() returns EFI_BUFFER_TOO_SMALL, it updates
   *out_len to the required buffer size but writes no data to the output
   buffer. However, due to bug #1, gmin_get_var_int() believes the call
   succeeded.

The caller gmin_get_var_int() then performs:
- Allocates val[CFG_VAR_NAME_MAX + 1] (65 bytes) on stack
- Calls gmin_get_config_var(dev, is_gmin, var, val, &len) with len=64
- If EFI variable is >64 bytes, efi.get_variable() sets len=required_size
- Due to bug #1, thinks call succeeded with len=required_size
- Executes val[len] = 0, writing past end of 65-byte stack buffer

This creates a stack buffer overflow when EFI variables are larger than
64 bytes. Since EFI variables can be controlled by firmware or system
configuration, this could potentially be exploited for code execution.

Fix the bug by returning proper error codes from gmin_get_config_var()
based on EFI status instead of stale 'ret' value.

The gmin_get_var_int() function is called during device initialization
for camera sensor configuration on Intel Bay Trail and Cherry Trail
platforms using the atomisp camera stack.

Reported-by: zepta <z3ptaa@gmail.com>
Closes: https://lore.kernel.org/all/CAPBS6KoQyM7FMdPwOuXteXsOe44X4H3F8Fw+y_qWq6E+OdmxQA@mail.gmail.com
Fixes: 38d4f74bc1 ("media: atomisp_gmin_platform: stop abusing efivar API")
Reviewed-by: Hans de Goede <hansg@kernel.org>
Link: https://lore.kernel.org/r/20250724080756.work.741-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-28 17:48:41 -07:00
Kees Cook
32e42ab9fc sched/task_stack: Add missing const qualifier to end_of_stack()
Add missing const qualifier to the non-CONFIG_THREAD_INFO_IN_TASK
version of end_of_stack() to match the CONFIG_THREAD_INFO_IN_TASK
version. Fixes a warning with CONFIG_KSTACK_ERASE=y on archs that don't
select THREAD_INFO_IN_TASK (such as LoongArch):

  error: passing 'const struct task_struct *' to parameter of type 'struct task_struct *' discards qualifiers

The stackleak_task_low_bound() function correctly uses a const task
parameter, but the legacy end_of_stack() prototype didn't like that.

Build tested on loongarch (with CONFIG_KSTACK_ERASE=y) and m68k
(with CONFIG_DEBUG_STACK_USAGE=y).

Fixes: a45728fd41 ("LoongArch: Enable HAVE_ARCH_STACKLEAK")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162
Cc: Youling Tang <tangyouling@kylinos.cn>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26 14:28:35 -07:00
Kees Cook
a8f0b1f8ef kstack_erase: Support Clang stack depth tracking
Wire up CONFIG_KSTACK_ERASE to Clang 21's new stack depth tracking
callback[1] option.

Link: https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-stack-depth [1]
Acked-by: Nicolas Schier <n.schier@avm.de>
Link: https://lore.kernel.org/r/20250724055029.3623499-4-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26 14:28:35 -07:00
Kees Cook
6676fd3c99 kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
Once CONFIG_KSTACK_ERASE is enabled with Clang on i386, the build warns:

  kernel/kstack_erase.c:168:2: warning: function with attribute 'no_caller_saved_registers' should only call a function with attribute 'no_caller_saved_registers' or be compiled with '-mgeneral-regs-only' [-Wexcessive-regsave]

Add -mgeneral-regs-only for the kstack_erase handler, to make Clang feel
better (it is effectively a no-op flag for the kernel). No binary
changes encountered.

Build & boot tested with Clang 21 on x86_64, and i386.
Build tested with GCC 14.2.0 on x86_64, i386, arm64, and arm.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26 14:28:35 -07:00
Kees Cook
381a38ea53 init.h: Disable sanitizer coverage for __init and __head
While __noinstr already contained __no_sanitize_coverage, it needs to
be added to __init and __head section markings to support the Clang
implementation of CONFIG_KSTACK_ERASE. This is to make sure the stack
depth tracking callback is not executed in unsupported contexts.

The other sanitizer coverage options (trace-pc and trace-cmp) aren't
needed in __head nor __init either ("We are interested in code coverage
as a function of a syscall inputs"[1]), so this is fine to disable for
them as well.

Link: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/kcov.c?h=v6.14#n179 [1]
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20250724055029.3623499-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26 14:28:35 -07:00
Kees Cook
431a380f93 kstack_erase: Disable kstack_erase for all of arm compressed boot code
When building with CONFIG_KSTACK_ERASE=y and CONFIG_ARM_ATAG_DTB_COMPAT=y,
the compressed boot environment encounters an undefined symbol error:

    ld.lld: error: undefined symbol: __sanitizer_cov_stack_depth
    >>> referenced by atags_to_fdt.c:135

This occurs because the compiler instruments the atags_to_fdt() function
with sanitizer coverage calls, but the minimal compressed boot environment
lacks access to sanitizer runtime support.

The compressed boot environment already disables stack protector with
-fno-stack-protector. Similarly disable sanitizer coverage by adding
$(DISABLE_KSTACK_ERASE) to the general compiler flags (and remove it
from the one place it was noticed before), which contains the appropriate
flags to prevent sanitizer instrumentation.

This follows the same pattern used in other early boot contexts where
sanitizer runtime support is unavailable.

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYtBk8qnpWvoaFwymCx5s5i-5KXtPGpmf=_+UKJddCOnLA@mail.gmail.com
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26 14:27:33 -07:00
Kees Cook
8245d47cfa x86: Handle KCOV __init vs inline mismatches
GCC appears to have kind of fragile inlining heuristics, in the
sense that it can change whether or not it inlines something based on
optimizations. It looks like the kcov instrumentation being added (or in
this case, removed) from a function changes the optimization results,
and some functions marked "inline" are _not_ inlined. In that case,
we end up with __init code calling a function not marked __init, and we
get the build warnings I'm trying to eliminate in the coming patch that
adds __no_sanitize_coverage to __init functions:

WARNING: modpost: vmlinux: section mismatch in reference: xbc_exit+0x8 (section: .text.unlikely) -> _xbc_exit (section: .init.text)
WARNING: modpost: vmlinux: section mismatch in reference: real_mode_size_needed+0x15 (section: .text.unlikely) -> real_mode_blob_end (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: __set_percpu_decrypted+0x16 (section: .text.unlikely) -> early_set_memory_decrypted (section: .init.text)
WARNING: modpost: vmlinux: section mismatch in reference: memblock_alloc_from+0x26 (section: .text.unlikely) -> memblock_alloc_try_nid (section: .init.text)
WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_set_root_pointer+0xc (section: .text.unlikely) -> x86_init (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_get_root_pointer+0x8 (section: .text.unlikely) -> x86_init (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: efi_config_table_is_usable+0x16 (section: .text.unlikely) -> xen_efi_config_table_is_usable (section: .init.text)

This problem is somewhat fragile (though using either __always_inline
or __init will deterministically solve it), but we've tripped over
this before with GCC and the solution has usually been to just use
__always_inline and move on.

For x86 this means forcing several functions to be inline with
__always_inline.

Link: https://lore.kernel.org/r/20250724055029.3623499-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-24 16:55:11 -07:00
Kees Cook
65c430906e arm64: Handle KCOV __init vs inline mismatches
GCC appears to have kind of fragile inlining heuristics, in the
sense that it can change whether or not it inlines something based on
optimizations. It looks like the kcov instrumentation being added (or in
this case, removed) from a function changes the optimization results,
and some functions marked "inline" are _not_ inlined. In that case,
we end up with __init code calling a function not marked __init, and we
get the build warnings I'm trying to eliminate in the coming patch that
adds __no_sanitize_coverage to __init functions:

WARNING: modpost: vmlinux: section mismatch in reference: acpi_get_enable_method+0x1c (section: .text.unlikely) -> acpi_psci_present (section: .init.text)

This problem is somewhat fragile (though using either __always_inline
or __init will deterministically solve it), but we've tripped over
this before with GCC and the solution has usually been to just use
__always_inline and move on.

For arm64 this requires forcing one ACPI function to be inlined with
__always_inline.

Link: https://lore.kernel.org/r/20250724055029.3623499-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-24 16:55:11 -07:00
Kees Cook
c64d6be1a6 s390: Handle KCOV __init vs inline mismatches
When KCOV is enabled all functions get instrumented, unless
the __no_sanitize_coverage attribute is used. To prepare for
__no_sanitize_coverage being applied to __init functions, we have to
handle differences in how GCC's inline optimizations get resolved. For
s390 this exposed a place where the __init annotation was missing but
ended up being "accidentally correct". Fix this cases and force a couple
functions to be inline with __always_inline.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20250717232519.2984886-7-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:44:01 -07:00
Kees Cook
2424fe1cac arm: Handle KCOV __init vs inline mismatches
When KCOV is enabled all functions get instrumented, unless
the __no_sanitize_coverage attribute is used. To prepare for
__no_sanitize_coverage being applied to __init functions, we have to
handle differences in how GCC's inline optimizations get resolved. For
arm this exposed several places where __init annotations were missing
but ended up being "accidentally correct". Fix these cases and force
several functions to be inline with __always_inline.

Acked-by: Nishanth Menon <nm@ti.com>
Acked-by: Lee Jones <lee@kernel.org>
Reviewed-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/20250717232519.2984886-5-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:43:39 -07:00
Kees Cook
d01daf9d95 mips: Handle KCOV __init vs inline mismatch
When KCOV is enabled all functions get instrumented, unless
the __no_sanitize_coverage attribute is used. To prepare for
__no_sanitize_coverage being applied to __init functions, we
have to handle differences in how GCC's inline optimizations get
resolved. For mips this requires adding the __init annotation on
init_mips_clocksource().

Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://lore.kernel.org/r/20250717232519.2984886-9-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:42:49 -07:00
Ritesh Harjani (IBM)
645d1b6664 powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
Move a few kfence and debug_pagealloc related functions in hash_utils.c
and radix_pgtable.c to __init sections since these are only invoked once
by an __init function during system initialization.

i.e.
- hash_debug_pagealloc_alloc_slots()
- hash_kfence_alloc_pool()
- hash_kfence_map_pool()
  The above 3 functions only gets called by __init htab_initialize().

- alloc_kfence_pool()
- map_kfence_pool()
  The above 2 functions only gets called by __init radix_init_pgtable()

This should also help fix warning msgs like:

>> WARNING: modpost: vmlinux: section mismatch in reference:
hash_debug_pagealloc_alloc_slots+0xb0 (section: .text) ->
memblock_alloc_try_nid (section: .init.text)

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504190552.mnFGs5sj-lkp@intel.com/
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20250717232519.2984886-8-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:42:21 -07:00
Kees Cook
437641a72d configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
To reduce stale data lifetimes, enable CONFIG_INIT_ON_FREE_DEFAULT_ON as
well. This matches the addition of CONFIG_STACKLEAK=y, which is doing
similar for stack memory.

Link: https://lore.kernel.org/r/20250717232519.2984886-13-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:41:57 -07:00
Kees Cook
4c56d9f7e7 configs/hardening: Enable CONFIG_KSTACK_ERASE
Since we can wipe the stack with both Clang and GCC plugins, enable this
for the "hardening.config" for wider testing.

Link: https://lore.kernel.org/r/20250717232519.2984886-12-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:41:48 -07:00
Kees Cook
76261fc7d1 stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
In preparation for Clang stack depth tracking for KSTACK_ERASE,
split the stackleak-specific cflags out of GCC_PLUGINS_CFLAGS into
KSTACK_ERASE_CFLAGS.

Link: https://lore.kernel.org/r/20250717232519.2984886-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:40:57 -07:00
Kees Cook
9ea1e8d28a stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
The Clang stack depth tracking implementation has a fixed name for
the stack depth tracking callback, "__sanitizer_cov_stack_depth", so
rename the GCC plugin function to match since the plugin has no external
dependencies on naming.

Link: https://lore.kernel.org/r/20250717232519.2984886-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:40:39 -07:00
Kees Cook
57fbad15c2 stackleak: Rename STACKLEAK to KSTACK_ERASE
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:

- Add the new top-level CONFIG_KSTACK_ERASE option which will be
  implemented either with the stackleak GCC plugin, or with the Clang
  stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
  but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
  for what it does rather than what it protects against), but leave as
  many of the internals alone as possible to avoid even more churn.

While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:35:01 -07:00
Kees Cook
fc07839203 seq_buf: Introduce KUnit tests
Add KUnit tests for the seq_buf API to ensure its correctness and
prevent future regressions, covering the following functions:
- seq_buf_init()
- DECLARE_SEQ_BUF()
- seq_buf_clear()
- seq_buf_puts()
- seq_buf_putc()
- seq_buf_printf()
- seq_buf_get_buf()
- seq_buf_commit()

$ tools/testing/kunit/kunit.py run seq_buf
=================== seq_buf (9 subtests) ===================
[PASSED] seq_buf_init_test
[PASSED] seq_buf_declare_test
[PASSED] seq_buf_clear_test
[PASSED] seq_buf_puts_test
[PASSED] seq_buf_puts_overflow_test
[PASSED] seq_buf_putc_test
[PASSED] seq_buf_printf_test
[PASSED] seq_buf_printf_overflow_test
[PASSED] seq_buf_get_buf_commit_test
===================== [PASSED] seq_buf =====================

Link: https://lore.kernel.org/r/20250717085156.work.363-kees@kernel.org
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-19 23:03:24 -07:00
Andy Shevchenko
2d8ae9a4f1 string: Group str_has_prefix() and strstarts()
The two str_has_prefix() and strstarts() are about the same
with a slight difference on what they return. Group them in
the header.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20250711085514.1294428-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-17 16:39:49 -07:00
Kees Cook
10299c07c9 kunit/fortify: Add back "volatile" for sizeof() constants
It seems the Clang can see through OPTIMIZER_HIDE_VAR when the constant
is coming from sizeof. Adding "volatile" back to these variables solves
this false positive without reintroducing the issues that originally led
to switching to OPTIMIZER_HIDE_VAR in the first place[1].

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2075 [1]
Cc: Jannik Glückert <jannik.glueckert@gmail.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 6ee149f61b ("kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()")
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250628234034.work.800-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-14 22:43:52 -07:00
Gustavo A. R. Silva
5e54510a93 acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Use the new TRAILING_OVERLAP() helper to fix a dozen instances of
the following type of warning:

drivers/acpi/nfit/intel.c:692:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Acked-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/aF7pF4kej8VQapyR@kspp
Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-27 13:27:23 -07:00
Gustavo A. R. Silva
29bb79e9db stddef: Introduce TRAILING_OVERLAP() helper macro
Add new TRAILING_OVERLAP() helper macro to create a union between
a flexible-array member (FAM) and a set of members that would
otherwise follow it. This overlays the trailing members onto the
FAM while preserving the original memory layout.

Co-developed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/aFG8gEwKXAWWIvX0@kspp
Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-18 14:30:53 -07:00
Thorsten Blum
4bfbc2691d mux: Convert mux_control_ops to a flex array member in mux_chip
Convert mux_control_ops to a flexible array member at the end of the
mux_chip struct and add the __counted_by() compiler attribute to
improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Use struct_size() to calculate the number of bytes to allocate for a new
mux chip and to remove the following Coccinelle/coccicheck warning:

  WARNING: Use struct_size

Use size_add() to safely add any extra bytes.

No functional changes intended.

Link: https://github.com/KSPP/linux/issues/83
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20250610104106.1948-2-thorsten.blum@linux.dev
Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-18 14:20:32 -07:00
Linus Torvalds
e04c78d86a Linux 6.16-rc2 v6.16-rc2 2025-06-15 13:49:41 -07:00
Linus Torvalds
08215f5486 Merge tag 'kbuild-fixes-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Move warnings about linux/export.h from W=1 to W=2

 - Fix structure type overrides in gendwarfksyms

* tag 'kbuild-fixes-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  gendwarfksyms: Fix structure type overrides
  kbuild: move warnings about linux/export.h from W=1 to W=2
2025-06-15 09:14:27 -07:00
Sami Tolvanen
2f6b47b295 gendwarfksyms: Fix structure type overrides
As we always iterate through the entire die_map when expanding
type strings, recursively processing referenced types in
type_expand_child() is not actually necessary. Furthermore,
the type_string kABI rule added in commit c9083467f7
("gendwarfksyms: Add a kABI rule to override type strings") can
fail to override type strings for structures due to a missing
kabi_get_type_string() check in this function.

Fix the issue by dropping the unnecessary recursion and moving
the override check to type_expand(). Note that symbol versions
are otherwise unchanged with this patch.

Fixes: c9083467f7 ("gendwarfksyms: Add a kABI rule to override type strings")
Reported-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-06-16 00:49:48 +09:00
Masahiro Yamada
a6a7946bd6 kbuild: move warnings about linux/export.h from W=1 to W=2
This hides excessive warnings, as nobody builds with W=2.

Fixes: a934a57a42 ("scripts/misc-check: check missing #include <linux/export.h> when W=1")
Fixes: 7d95680d64 ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
2025-06-16 00:41:40 +09:00
Linus Torvalds
8c6bc74c7f Merge tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:

 - SMB3.1.1 POSIX extensions fix for char remapping

 - Fix for repeated directory listings when directory leases enabled

 - deferred close handle reuse fix

* tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: improve directory cache reuse for readdir operations
  smb: client: fix perf regression with deferred closes
  smb: client: disable path remapping with POSIX extensions
2025-06-14 10:13:32 -07:00
Linus Torvalds
ac91b4de44 Merge tag 'iommu-fixes-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fix from Joerg Roedel:

 - Fix PTE size calculation for NVidia Tegra

* tag 'iommu-fixes-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommu/tegra: Fix incorrect size calculation
2025-06-14 10:01:47 -07:00
Linus Torvalds
f713ffa363 Merge tag 'block-6.16-20250614' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - Fix for a deadlock on queue freeze with zoned writes

 - Fix for zoned append emulation

 - Two bio folio fixes, for sparsemem and for very large folios

 - Fix for a performance regression introduced in 6.13 when plug
   insertion was changed

 - Fix for NVMe passthrough handling for polled IO

 - Document the ublk auto registration feature

 - loop lockdep warning fix

* tag 'block-6.16-20250614' of git://git.kernel.dk/linux:
  nvme: always punt polled uring_cmd end_io work to task_work
  Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublists
  block: Fix bvec_set_folio() for very large folios
  bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAP
  block: use plug request list tail for one-shot backmerge attempt
  block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_work
  block: Clear BIO_EMULATES_ZONE_APPEND flag on BIO completion
  ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)
  loop: move lo_set_size() out of queue freeze
2025-06-14 09:25:22 -07:00
Linus Torvalds
6d13760ea3 Merge tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - Fix for a race between SQPOLL exit and fdinfo reading.

   It's slim and I was only able to reproduce this with an artificial
   delay in the kernel. Followup sparse fix as well to unify the access
   to ->thread.

 - Fix for multiple buffer peeking, avoiding truncation if possible.

 - Run local task_work for IOPOLL reaping when the ring is exiting.

   This currently isn't done due to an assumption that polled IO will
   never need task_work, but a fix on the block side is going to change
   that.

* tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linux:
  io_uring: run local task_work from ring exit IOPOLL reaping
  io_uring/kbuf: don't truncate end buffer for multiple buffer peeks
  io_uring: consistently use rcu semantics with sqpoll thread
  io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
2025-06-14 08:44:54 -07:00
Linus Torvalds
588adb24b7 Merge tag 'rust-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust fix from Miguel Ojeda:

  - 'hrtimer': fix future compile error when the 'impl_has_hr_timer!'
    macro starts to get called

* tag 'rust-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: time: Fix compile error in impl_has_hr_timer macro
2025-06-14 08:38:34 -07:00
Linus Torvalds
27b9989b87 Merge tag 'mm-hotfixes-stable-2025-06-13-21-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "9 hotfixes. 3 are cc:stable and the remainder address post-6.15 issues
  or aren't considered necessary for -stable kernels. Only 4 are for MM"

* tag 'mm-hotfixes-stable-2025-06-13-21-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: add mmap_prepare() compatibility layer for nested file systems
  init: fix build warnings about export.h
  MAINTAINERS: add Barry as a THP reviewer
  drivers/rapidio/rio_cm.c: prevent possible heap overwrite
  mm: close theoretical race where stale TLB entries could linger
  mm/vma: reset VMA iterator on commit_merge() OOM failure
  docs: proc: update VmFlags documentation in smaps
  scatterlist: fix extraneous '@'-sign kernel-doc notation
  selftests/mm: skip failed memfd setups in gup_longterm
2025-06-14 08:18:09 -07:00
Linus Torvalds
4774cfe354 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "All fixes for drivers.

  The core change in the error handler is simply to translate an ALUA
  specific sense code into a retry the ALUA components can handle and
  won't impact any other devices"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: error: alua: I/O errors for ALUA state transitions
  scsi: storvsc: Increase the timeouts to storvsc_timeout
  scsi: s390: zfcp: Ensure synchronous unit_add
  scsi: iscsi: Fix incorrect error path labels for flashnode operations
  scsi: mvsas: Fix typos in per-phy comments and SAS cmd port registers
  scsi: core: ufs: Fix a hang in the error handler
2025-06-13 16:49:39 -07:00
Linus Torvalds
25294cb8a4 Merge tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Quiet week, only two pull requests came my way, xe has a couple of
  fixes and then a bunch of fixes across the board, vc4 probably fixes
  the biggest problem:

  vc4:
   - Fix infinite EPROBE_DEFER loop in vc4 probing

  amdxdna:
   - Fix amdxdna firmware size

  meson:
   - modesetting fixes

  sitronix:
   - Kconfig fix for st7171-i2c

  dma-buf:
   - Fix -EBUSY WARN_ON_ONCE in dma-buf

  udmabuf:
   - Use dma_sync_sgtable_for_cpu in udmabuf

  xe:
   - Fix regression disallowing 64K SVM migration
   - Use a bounce buffer for WA BB"

* tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernel:
  drm/xe/lrc: Use a temporary buffer for WA BB
  udmabuf: use sgtable-based scatterlist wrappers
  dma-buf: fix compare in WARN_ON_ONCE
  drm/sitronix: st7571-i2c: Select VIDEOMODE_HELPERS
  drm/meson: fix more rounding issues with 59.94Hz modes
  drm/meson: use vclk_freq instead of pixel_freq in debug print
  drm/meson: fix debug log statement when setting the HDMI clocks
  drm/vc4: fix infinite EPROBE_DEFER loop
  drm/xe/svm: Fix regression disallowing 64K SVM migration
  accel/amdxdna: Fix incorrect PSP firmware size
2025-06-13 16:27:27 -07:00
Jens Axboe
b62e0efd8a io_uring: run local task_work from ring exit IOPOLL reaping
In preparation for needing to shift NVMe passthrough to always use
task_work for polled IO completions, ensure that those are suitably
run at exit time. See commit:

9ce6c9875f ("nvme: always punt polled uring_cmd end_io work to task_work")

for details on why that is necessary.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13 15:26:17 -06:00
Jens Axboe
9ce6c9875f nvme: always punt polled uring_cmd end_io work to task_work
Currently NVMe uring_cmd completions will complete locally, if they are
polled. This is done because those completions are always invoked from
task context. And while that is true, there's no guarantee that it's
invoked under the right ring context, or even task. If someone does
NVMe passthrough via multiple threads and with a limited number of
poll queues, then ringA may find completions from ringB. For that case,
completing the request may not be sound.

Always just punt the passthrough completions via task_work, which will
redirect the completion, if needed.

Cc: stable@vger.kernel.org
Fixes: 585079b6e4 ("nvme: wire up async polling for io passthrough commands")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13 15:18:34 -06:00
Linus Torvalds
18531f4d1c Merge tag 'acpi-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These fix an ACPI APEI error injection driver failure that started to
  occur after switching it over to using a faux device, address an EC
  driver issue related to invalid ECDT tables, clean up the usage of
  mwait_idle_with_hints() in the ACPI PAD driver, add a new IRQ override
  quirk, and fix a NULL pointer dereference related to nosmp:

   - Update the faux device handling code in the driver core and address
     an ACPI APEI error injection driver failure that started to occur
     after switching it over to using a faux device on top of that (Dan
     Williams)

   - Update data types of variables passed as arguments to
     mwait_idle_with_hints() in the ACPI PAD (processor aggregator
     device) driver to match the function definition after recent
     changes (Uros Bizjak)

   - Fix a NULL pointer dereference in the ACPI CPPC library that occurs
     when nosmp is passed to the kernel in the command line (Yunhui Cui)

   - Ignore ECDT tables with an invalid ID string to prevent using an
     incorrect GPE for signaling events on some systems (Armin Wolf)

   - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan)"

* tag 'acpi-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: resource: Use IRQ override on MACHENIKE 16P
  ACPI: EC: Ignore ECDT tables with an invalid ID string
  ACPI: CPPC: Fix NULL pointer dereference when nosmp is used
  ACPI: PAD: Update arguments of mwait_idle_with_hints()
  ACPI: APEI: EINJ: Do not fail einj_init() on faux_device_create() failure
  driver core: faux: Quiet probe failures
  driver core: faux: Suppress bind attributes
2025-06-13 13:39:15 -07:00
Linus Torvalds
f688b599d7 Merge tag 'pm-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix the cpupower utility installation, fix up the recently added
  Rust abstractions for cpufreq and OPP, restore the x86 update
  eliminating mwait_play_dead_cpuid_hint() that has been reverted during
  the 6.16 merge window along with preventing the failure caused by it
  from happening, and clean up mwait_idle_with_hints() usage in
  intel_idle:

   - Implement CpuId Rust abstraction and use it to fix doctest failure
     related to the recently introduced cpumask abstraction (Viresh
     Kumar)

   - Do minor cleanups in the `# Safety` sections for cpufreq
     abstractions added recently (Viresh Kumar)

   - Unbreak cpupower systemd service units installation on some systems
     by adding a unitdir variable for specifying the location to install
     them (Francesco Poli)

   - Eliminate mwait_play_dead_cpuid_hint() again after reverting its
     elimination during the 6.16 merge window due to a problem with
     handling "dead" SMT siblings, but this time prevent leaving them in
     C1 after initialization by taking them online and back offline when
     a proper cpuidle driver for the platform has been registered
     (Rafael Wysocki)

   - Update data types of variables passed as arguments to
     mwait_idle_with_hints() to match the function definition after
     recent changes (Uros Bizjak)"

* tag 'pm-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  rust: cpu: Add CpuId::current() to retrieve current CPU ID
  rust: Use CpuId in place of raw CPU numbers
  rust: cpu: Introduce CpuId abstraction
  intel_idle: Update arguments of mwait_idle_with_hints()
  cpufreq: Convert `/// SAFETY` lines to `# Safety` sections
  cpupower: split unitdir from libdir in Makefile
  Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()"
  ACPI: processor: Rescan "dead" SMT siblings during initialization
  intel_idle: Rescan "dead" SMT siblings during initialization
  x86/smp: PM/hibernate: Split arch_resume_nosmt()
  intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13 13:27:41 -07:00
Rafael J. Wysocki
28b069933d Merge branches 'acpi-pad', 'acpi-cppc', 'acpi-ec' and 'acpi-resource'
Merge assorted ACPI updates for 6.16-rc2:

 - Update data types of variables passed as arguments to
   mwait_idle_with_hints() in the ACPI PAD (processor aggregator device)
   driver to match the function definition after recent changes (Uros
   Bizjak).

 - Fix a NULL pointer dereference in the ACPI CPPC library that occurs
   when nosmp is passed to the kernel in the command line (Yunhui Cui).

 - Ignore ECDT tables with an invalid ID string to prevent using an
   incorrect GPE for signaling events on some systems (Armin Wolf).

 - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan).

* acpi-pad:
  ACPI: PAD: Update arguments of mwait_idle_with_hints()

* acpi-cppc:
  ACPI: CPPC: Fix NULL pointer dereference when nosmp is used

* acpi-ec:
  ACPI: EC: Ignore ECDT tables with an invalid ID string

* acpi-resource:
  ACPI: resource: Use IRQ override on MACHENIKE 16P
2025-06-13 21:55:30 +02:00
Rafael J. Wysocki
dd3581853c Merge branch 'pm-cpuidle'
Merge cpuidle updates for 6.16-rc2:

 - Update data types of variables passed as arguments to
   mwait_idle_with_hints() to match the function definition
   after recent changes (Uros Bizjak).

 - Eliminate mwait_play_dead_cpuid_hint() again after reverting its
   elimination during the merge window due to a problem with handling
   "dead" SMT siblings, but this time prevent leaving them in C1 after
   initialization by taking them online and back offline when a proper
   cpuidle driver for the platform has been registered (Rafael Wysocki).

* pm-cpuidle:
  intel_idle: Update arguments of mwait_idle_with_hints()
  Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()"
  ACPI: processor: Rescan "dead" SMT siblings during initialization
  intel_idle: Rescan "dead" SMT siblings during initialization
  x86/smp: PM/hibernate: Split arch_resume_nosmt()
  intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13 21:28:07 +02:00
Rafael J. Wysocki
ea2867608b Merge branch 'pm-tools'
Merge a cpupower utility fix for 6.16-rc2 that unbreaks systemd service
units installation on some sysems (Francesco Poli).

* pm-tools:
  cpupower: split unitdir from libdir in Makefile
2025-06-13 21:25:38 +02:00
Linus Torvalds
02adc1490e Merge tag 'spi-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A collection of driver specific fixes, most minor apart from the OMAP
  ones which disable some recent performance optimisations in some
  non-standard cases where we could start driving the bus incorrectly.

  The change to the stm32-ospi driver to use the newer reset APIs is a
  fix for interactions with other IP sharing the same reset line in some
  SoCs"

* tag 'spi-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engine
  spi: stm32-ospi: clean up on error in probe()
  spi: stm32-ospi: Make usage of reset_control_acquire/release() API
  spi: offload: check offload ops existence before disabling the trigger
  spi: spi-pci1xxxx: Fix error code in probe
  spi: loongson: Fix build warnings about export.h
  spi: omap2-mcspi: Disable multi-mode when the previous message kept CS asserted
  spi: omap2-mcspi: Disable multi mode when CS should be kept asserted after message
2025-06-13 11:01:44 -07:00
Linus Torvalds
601dddb6c5 Merge tag 'regulator-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "One minor fix for a leak in the DT parsing code in the max20086 driver"

* tag 'regulator-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: max20086: Fix refcount leak in max20086_parse_regulators_dt()
2025-06-13 11:00:19 -07:00
Oleg Nesterov
f90fff1e15 posix-cpu-timers: fix race between handle_posix_cpu_timers() and posix_cpu_timer_del()
If an exiting non-autoreaping task has already passed exit_notify() and
calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent
or debugger right after unlock_task_sighand().

If a concurrent posix_cpu_timer_del() runs at that moment, it won't be
able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or
lock_task_sighand() will fail.

Add the tsk->exit_state check into run_posix_cpu_timers() to fix this.

This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because
exit_task_work() is called before exit_notify(). But the check still
makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail
anyway in this case.

Cc: stable@vger.kernel.org
Reported-by: Benoît Sevens <bsevens@google.com>
Fixes: 0bdd2ed413 ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-13 10:55:49 -07:00
Linus Torvalds
3ca933aad0 Merge tag 'trace-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:

 - Do not free "head" variable in filter_free_subsystem_filters()

   The first error path jumps to "free_now" label but first frees the
   newly allocated "head" variable. But the "free_now" code checks this
   variable, and if it is not NULL, it will iterate the list. As this
   list variable was already initialized, the "free_now" code will not
   do anything as it is empty. But freeing it will cause a UAF bug.

   The error path should simply jump to the "free_now" label and leave
   the "head" variable alone.

* tag 'trace-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Do not free "head" on error path of filter_free_subsystem_filters()
2025-06-13 10:51:11 -07:00
Linus Torvalds
dde6379705 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Rework of system register accessors for system registers that are
     directly writen to memory, so that sanitisation of the in-memory
     value happens at the correct time (after the read, or before the
     write). For convenience, RMW-style accessors are also provided.

   - Multiple fixes for the so-called "arch-timer-edge-cases' selftest,
     which was always broken.

  x86:

   - Make KVM_PRE_FAULT_MEMORY stricter for TDX, allowing userspace to
     pass only the "untouched" addresses and flipping the shared/private
     bit in the implementation.

   - Disable SEV-SNP support on initialization failure

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: Reject direct bits in gpa passed to KVM_PRE_FAULT_MEMORY
  KVM: x86/mmu: Embed direct bits into gpa for KVM_PRE_FAULT_MEMORY
  KVM: SEV: Disable SEV-SNP support on initialization failure
  KVM: arm64: selftests: Determine effective counter width in arch_timer_edge_cases
  KVM: arm64: selftests: Fix xVAL init in arch_timer_edge_cases
  KVM: arm64: selftests: Fix thread migration in arch_timer_edge_cases
  KVM: arm64: selftests: Fix help text for arch_timer_edge_cases
  KVM: arm64: Make __vcpu_sys_reg() a pure rvalue operand
  KVM: arm64: Don't use __vcpu_sys_reg() to get the address of a sysreg
  KVM: arm64: Add RMW specific sysreg accessor
  KVM: arm64: Add assignment-specific sysreg accessor
2025-06-13 10:05:31 -07:00
Jens Axboe
26ec15e4b0 io_uring/kbuf: don't truncate end buffer for multiple buffer peeks
If peeking a bunch of buffers, normally io_ring_buffers_peek() will
truncate the end buffer. This isn't optimal as presumably more data will
be arriving later, and hence it's better to stop with the last full
buffer rather than truncate the end buffer.

Cc: stable@vger.kernel.org
Fixes: 35c8711c8f ("io_uring/kbuf: add helpers for getting/peeking multiple buffers")
Reported-by: Christian Mazakas <christian.mazakas@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13 11:01:49 -06:00