10413 Commits

Christophe Leroy
4322c8f81c lib/strn*,uaccess: Use masked_user_{read/write}_access_begin when required
Properly use masked_user_read_access_begin() and
masked_user_write_access_begin() instead of masked_user_access_begin() in
order to match user_read_access_end() and user_write_access_end().  This is
important for architectures like PowerPC that enable user read access and
user write access separately.

That means masked_user_read_access_begin() is used when user memory is
exclusively read during the window and masked_user_write_access_begin()
is used when user memory is exclusively written during the window.
masked_user_access_begin() remains and is used when both reads and
writes are performed during the open window. Each of them is expected
to be terminated by the matching user_read_access_end(),
user_write_access_end() and user_access_end().
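
For illustration, a minimal sketch of the intended pairing (the helper name
is made up and error handling is simplified):

	/* Read-only window: begin with the read variant, end with
	 * user_read_access_end().  Write-only windows use
	 * masked_user_write_access_begin()/user_write_access_end();
	 * mixed windows keep masked_user_access_begin()/user_access_end().
	 */
	static long get_user_byte(char *dst, const char __user *src)
	{
		char c;

		src = masked_user_read_access_begin(src);
		unsafe_get_user(c, src, efault);
		user_read_access_end();

		*dst = c;
		return 0;

	efault:
		user_read_access_end();
		return -EFAULT;
	}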

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/cb5e4b0fa49ea9c740570949d5e3544423389757.1763396724.git.christophe.leroy@csgroup.eu
2025-11-18 15:27:35 +01:00
Christophe Leroy
803abedbd5 iov_iter: Add missing speculation barrier to copy_from_user_iter()
The result of "access_ok()" can be mis-speculated, so the CPU can end up
speculatively executing the guarded code:

	if (access_ok(from, size))
		// Right here

For the same reason as done in copy_from_user() in commit 74e19ef0ff
("uaccess: Add speculation barrier to copy_from_user()"), add a speculation
barrier to copy_from_user_iter().
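
For reference, a simplified sketch of that pattern (the function here is
illustrative, not the actual copy_from_user_iter() body):

	#include <linux/uaccess.h>
	#include <linux/nospec.h>

	static size_t copy_chunk_from_user(void *to, const void __user *from,
					   size_t size)
	{
		size_t left = size;

		if (access_ok(from, size)) {
			/* Keep a mis-speculated access_ok() from feeding the copy. */
			barrier_nospec();
			left = raw_copy_from_user(to, from, size);
		}
		return left;
	}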

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/6b73e69cc7168c89df4eab0a216e3ed4cca36b0a.1763396724.git.christophe.leroy@csgroup.eu
2025-11-18 15:27:34 +01:00
Christophe Leroy
4db1df7a72 iov_iter: Convert copy_from_user_iter() to masked user access
copy_from_user_iter() lacks a speculation barrier.  Adding one would degrade
performance on some architectures like x86, which would be unfortunate as
copy_from_user_iter() is a critical hotpath function.

Convert copy_from_user_iter() to using masked user access on architectures
that support it.  This allows adding the speculation barrier without
impacting performance.

This is similar to what was done for copy_from_user() in commit
0fc810ae3a ("x86/uaccess: Avoid barrier_nospec() in 64-bit
copy_from_user()")
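
A rough sketch of the resulting structure (hypothetical helper, heavily
simplified from the real copy_from_user_iter()):

	static size_t copy_word_from_user(unsigned long *to,
					  const unsigned long __user *from)
	{
		unsigned long v;

		if (can_do_masked_user_access()) {
			/* Address masking makes a mis-speculated check harmless. */
			from = masked_user_read_access_begin(from);
		} else {
			if (!user_read_access_begin(from, sizeof(*from)))
				return sizeof(*from);
			barrier_nospec();	/* see the separate barrier patch */
		}
		unsafe_get_user(v, from, fault);
		user_read_access_end();
		*to = v;
		return 0;

	fault:
		user_read_access_end();
		return sizeof(*from);
	}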

[ tglx: Massage change log ]

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/58e4b07d469ca68a2b9477fe2c1ccc8a44cef131.1763396724.git.christophe.leroy@csgroup.eu
2025-11-18 15:27:34 +01:00
Bartosz Golaszewski
197b3f3c70 string: provide strends()
Implement a function that checks whether a string ends with another
string, and add its KUnit test cases.
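
A minimal sketch of such a helper (the actual prototype and its home in the
string headers may differ):

	#include <linux/string.h>
	#include <linux/types.h>

	/* Return true if @str ends with @suffix, e.g. strends("fw.bin", ".bin"). */
	static inline bool strends(const char *str, const char *suffix)
	{
		size_t len = strlen(str);
		size_t suffix_len = strlen(suffix);

		if (suffix_len > len)
			return false;
		return strcmp(str + len - suffix_len, suffix) == 0;
	}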

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20251112-gpio-shared-v4-1-b51f97b1abd8@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-11-17 10:15:32 +01:00
Uladzislau Rezki (Sony)
e781c1c0a9 lib/test_vmalloc: remove xfail condition check
A test marked with "xfail = true" is expected to fail, but that does not
mean it is guaranteed to fail.  Remove the "xfail" condition check for
tests that pass successfully.

Link: https://lkml.kernel.org/r/20251007122035.56347-3-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-16 17:27:53 -08:00
Uladzislau Rezki (Sony)
9ff86ca1cc lib/test_vmalloc: add no_block_alloc_test case
Patch series "__vmalloc()/kvmalloc() and no-block support", v4.


This patch (of 10):

Introduce a new test case "no_block_alloc_test" that verifies non-blocking
allocations using __vmalloc() with GFP_ATOMIC and GFP_NOWAIT flags.

It is recommended to build the kernel with CONFIG_DEBUG_ATOMIC_SLEEP
enabled to help catch "sleeping while atomic" issues.  This test ensures
that memory allocation logic under atomic constraints does not
inadvertently sleep.
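
An illustrative sketch of what such a non-blocking allocation looks like
(the real test case differs in detail):

	#include <linux/vmalloc.h>
	#include <linux/mm.h>
	#include <linux/preempt.h>
	#include <linux/errno.h>

	static int try_no_block_alloc(void)
	{
		void *p;

		preempt_disable();			/* atomic section: sleeping is a bug */
		p = __vmalloc(PAGE_SIZE, GFP_NOWAIT);	/* must not block here */
		preempt_enable();

		if (!p)
			return -ENOMEM;
		vfree(p);
		return 0;
	}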

Link: https://lkml.kernel.org/r/20251007122035.56347-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-16 17:27:52 -08:00
Pasha Tatashin
a26ec8f3d4 lib/test_kho: check if KHO is enabled
We must check whether KHO is enabled prior to issuing KHO commands,
otherwise KHO internal data structures are not initialized.
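
The guard amounts to something like the following (the init function name
and error code are illustrative):

	#include <linux/kexec_handover.h>
	#include <linux/errno.h>

	static int kho_test_init(void)
	{
		/* KHO's internal structures are only initialized when KHO is
		 * enabled, so bail out early instead of issuing KHO commands. */
		if (!kho_is_enabled())
			return -EOPNOTSUPP;

		/* ... register and run the KHO test ... */
		return 0;
	}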

Link: https://lkml.kernel.org/r/20251106220635.2608494-1-pasha.tatashin@soleen.com
Fixes: b753522bed ("kho: add test for kexec handover")
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511061629.e242724-lkp@intel.com
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-15 10:52:01 -08:00
Alexei Starovoitov
e47b68bda4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.18-rc5+
Cross-merge BPF and other fixes after downstream PR.

Minor conflict in kernel/bpf/helpers.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-14 17:43:41 -08:00
Thomas Weißschuh
7bc16e72dd kunit: Make filter parameters configurable via Kconfig
Enable presetting the filter parameters from Kconfig options, similar to
how other KUnit configuration parameters are handled already.
This is useful for running a subset of tests even if the cmdline is not
readily modifiable.

Link: https://lore.kernel.org/r/20251106-kunit-filter-kconfig-v1-1-d723fb7ac221@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-11-14 11:02:34 -07:00
Ingo Molnar
d851f2b2b2 Merge tag 'v6.18-rc5' into objtool/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-11-13 07:58:43 +01:00
Eric Biggers
5dc8d27752 Merge tag 'arm64-fpsimd-on-stack-for-v6.19' into libcrypto-fpsimd-on-stack
Pull fpsimd-on-stack changes from Ard Biesheuvel:

  "Shared tag/branch for arm64 FP/SIMD changes going through libcrypto"

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 10:15:07 -08:00
Ard Biesheuvel
8dcac98a47 lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
Move the arm64 implementations of SHA-3 and POLYVAL to the newly
introduced scoped ksimd API, which replaces kernel_neon_begin() and
kernel_neon_end(). On arm64, this is needed because the latter API
will change in an incompatible manner.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 10:14:11 -08:00
Dr. David Alan Gilbert
a0b8c6af29 lib/xxhash: remove more unused xxh functions
xxh32_reset() and xxh32_copy_state() are unused, and with those gone, the
xxh32_state struct is also unused.

xxh64_copy_state() is also unused.

Remove them all.

(Also fixes a comment above the xxh64_state that referred to it as
xxh32_state).

Link: https://lkml.kernel.org/r/20251024205120.454508-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-12 10:00:17 -08:00
Ye Bin
6c2e6e2c1a dynamic_debug: add support for print stack
In practical problem diagnosis, especially during the boot phase, it is
often desirable to know the call sequence.  However, currently, apart from
adding print statements and recompiling the kernel, there seems to be no
good alternative.  If dynamic_debug supported printing the call stack, it
would be very helpful for diagnosing issues.  This patch adds support for
a '+d' flag that dumps the stack.

Link: https://lkml.kernel.org/r/20251025080003.312536-1-yebin@huaweicloud.com
Signed-off-by: Ye Bin <yebin10@huawei.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-12 10:00:16 -08:00
Yury Norov (NVIDIA)
d99dc586ca uaccess: decouple INLINE_COPY_FROM_USER and CONFIG_RUST
Commit 1f9a8286bc ("uaccess: always export _copy_[from|to]_user with
CONFIG_RUST") exports _copy_{from,to}_user() unconditionally, if RUST is
enabled.  This pollutes exported symbols namespace, and spreads RUST
ifdefery in core files.

It's better to declare a corresponding helper under rust/helpers/,
similar to how the non-underscored copy_{from,to}_user() is handled.
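
Such a helper would look roughly like the existing uaccess wrappers in
rust/helpers/ (the exact helper name and file are assumptions here):

	#include <linux/uaccess.h>

	unsigned long rust_helper__copy_from_user(void *to, const void __user *from,
						  unsigned long n)
	{
		return _copy_from_user(to, from, n);
	}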

[yury.norov@gmail.com: drop rust part of comment for _copy_from_user(), per Alice]
  Link: https://lkml.kernel.org/r/20251024154754.99768-1-yury.norov@gmail.com
Link: https://lkml.kernel.org/r/20251023171607.1171534-1-yury.norov@gmail.com
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-12 10:00:16 -08:00
Ankan Biswas
57f3d89691 lib/xz: remove dead IA-64 (Itanium) support code
Support for the IA-64 (Itanium) architecture was removed in commit
cf8e865810 ("arch: Remove Itanium (IA-64) architecture").

This patch drops the IA-64 specific decompression code from lib/xz, which
was conditionally compiled with the now-obsolete CONFIG_XZ_DEC_IA64
option.

Link: https://lkml.kernel.org/r/20251014052738.31185-1-spyjetfayed@gmail.com
Signed-off-by: Ankan Biswas <spyjetfayed@gmail.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Khalid Aziz <khalid@kernel.org>
Acked-by: Lasse Collin <lasse.collin@tukaani.org>
Cc: David Hunter <david.hunter.linux@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-12 10:00:15 -08:00
Li RongQing
9544f9e694 hung_task: panic when there are more than N hung tasks at the same time
The hung_task_panic sysctl is currently a blunt instrument: it's all or
nothing.

Panicking on a single hung task can be an overreaction to a transient
glitch.  A more reliable indicator of a systemic problem is when
multiple tasks hang simultaneously.

Extend hung_task_panic to accept an integer threshold, allowing the
kernel to panic only when N hung tasks are detected in a single scan. 
This provides finer control to distinguish between isolated incidents
and system-wide failures.

The accepted values are:
- 0: Don't panic (unchanged)
- 1: Panic on the first hung task (unchanged)
- N > 1: Panic after N hung tasks are detected in a single scan

The original behavior is preserved for values 0 and 1, maintaining full
backward compatibility.
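
The resulting check is essentially the following (names and message are
illustrative):

	#include <linux/panic.h>

	/* 0 = never panic, 1 = panic on the first hung task,
	 * N > 1 = panic once N hung tasks are seen in a single scan. */
	static unsigned int sysctl_hung_task_panic;

	static void maybe_panic(unsigned int hung_count)
	{
		if (sysctl_hung_task_panic &&
		    hung_count >= sysctl_hung_task_panic)
			panic("hung_task: %u hung tasks detected in one scan",
			      hung_count);
	}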

[lance.yang@linux.dev: new changelog]
Link: https://lkml.kernel.org/r/20251015063615.2632-1-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Tested-by: Lance Yang <lance.yang@linux.dev>
Acked-by: Andrew Jeffery <andrew@codeconstruct.com.au> [aspeed_g5_defconfig]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Simon Horman <horms@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-12 10:00:14 -08:00
Lukas Bulwahn
cd4eaccc00 treewide: drop outdated compiler version remarks in Kconfig help texts
As of writing, Documentation/Changes states the minimal versions as GNU C
8.1, Clang 15.0.0 and binutils 2.30.  A few Kconfig help texts point out
that specific GCC and Clang versions are needed, but by now, those version
references, such as GCC later than 4.0, GCC later than 4.4, or Clang later
than 5.0, are obsolete and unlikely to matter to users configuring their
kernel builds anyway.

Drop these outdated remarks in Kconfig help texts referring to older
compiler and binutils versions.  No functional change.

Link: https://lkml.kernel.org/r/20251010082138.185752-1-lukas.bulwahn@redhat.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-12 10:00:14 -08:00
Ard Biesheuvel
c0d597e016 lib/crypto: arm/blake2b: Move to scoped ksimd API
Even though ARM's versions of kernel_neon_begin()/_end() are not being
changed, update the newly migrated ARM blake2b to the scoped ksimd API
so that all ARM and arm64 code in lib/crypto remains consistent in this
respect.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 09:57:52 -08:00
Eric Biggers
065f040010 Merge tag 'scoped-ksimd-for-arm-arm64' into libcrypto-fpsimd-on-stack
Pull scoped ksimd API for ARM and arm64 from Ard Biesheuvel:

  "Introduce a more strict replacement API for
   kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and
   replace occurrences of the latter pair appearing in lib/crypto"

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 09:55:55 -08:00
Andy Shevchenko
372a12bd5d lib/vsprintf: Check pointer before dereferencing in time_and_date()
The pointer may be invalid when it gets to the printf() code.  In
particular, time_and_date() dereferences it in some cases without checking.

Move the check from rtc_str() to time_and_date() to cover all cases.

Fixes: 7daac5b2fd ("lib/vsprintf: Print time64_t in human readable format")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251110132118.4113976-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-12 11:36:20 +01:00
Ard Biesheuvel
3142ec4af2 raid6: Move to more abstract 'ksimd' guard API
Move away from calling kernel_neon_begin() and kernel_neon_end()
directly, and instead, use the newly introduced scoped_ksimd() API. This
permits arm64 to modify the kernel mode NEON API without affecting code
that is shared between ARM and arm64.
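
The conversion is mechanical; roughly (function bodies and header names are
illustrative):

	#include <asm/simd.h>
	#include <asm/neon.h>

	/* Before: explicit begin/end pair. */
	static void xor_blocks_old(void)
	{
		kernel_neon_begin();
		/* ... NEON code ... */
		kernel_neon_end();
	}

	/* After: the scoped guard opens and closes the FP/SIMD section. */
	static void xor_blocks_new(void)
	{
		scoped_ksimd() {
			/* ... NEON code ... */
		}
	}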

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
4fb623074e lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.

For symmetry, do the same for 32-bit ARM too.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:52:01 +01:00
Ard Biesheuvel
f53d18a4e6 lib/crypto: Switch ARM and arm64 to 'ksimd' scoped guard API
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.

For symmetry, do the same for 32-bit ARM too.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-12 09:51:13 +01:00
Eric Biggers
b3aed551b3 lib/crypto: tests: Add KUnit tests for POLYVAL
Add a test suite for the POLYVAL library, including:

- All the standard tests and the benchmark from hash-test-template.h
- Comparison with a test vector from the RFC
- Test with key and message containing all one bits
- Additional tests related to the key struct

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:52 -08:00
Eric Biggers
b2210f3516 lib/crypto: tests: Add additional SHAKE tests
Add the following test cases to cover gaps in the SHAKE testing:

    - test_shake_all_lens_up_to_4096()
    - test_shake_multiple_squeezes()
    - test_shake_with_guarded_bufs()

Remove test_shake256_tiling() and test_shake256_tiling2() since they are
superseded by test_shake_multiple_squeezes().  It provides better test
coverage by using randomized testing.  E.g., it's able to generate a
zero-length squeeze followed by a nonzero-length squeeze, which the
first 7 versions of the SHA-3 patchset handled incorrectly.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:36 -08:00
David Howells
15c64c47e4 lib/crypto: tests: Add SHA3 kunit tests
Add a SHA3 kunit test suite, providing the following:

 (*) A simple test of each of SHA3-224, SHA3-256, SHA3-384, SHA3-512,
     SHAKE128 and SHAKE256.

 (*) NIST 0- and 1600-bit test vectors for SHAKE128 and SHAKE256.

 (*) Output tiling (multiple squeezing) tests for SHAKE256.

 (*) Standard hash template test for SHA3-256.  To make this possible,
     gen-hash-testvecs.py is modified to support sha3-256.

 (*) Standard benchmark test for SHA3-256.

[EB: dropped some unnecessary changes to gen-hash-testvecs.py, moved
     addition of Testing section in doc file into this commit, and
     other small cleanups]

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:36 -08:00
Eric Biggers
6401fd334d lib/crypto: tests: Add KUnit tests for BLAKE2b
Add a KUnit test suite for the BLAKE2b library API, mirroring the
BLAKE2s test suite very closely.

As with the BLAKE2s test suite, a benchmark is included.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:07:36 -08:00
Eric Biggers
4d8da35579 lib/crypto: x86/polyval: Migrate optimized code into library
Migrate the x86_64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library
properly optimized on x86_64.

This drops the x86_64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely, as
it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Also replace a movaps instruction with movups to remove the assumption
that the key struct is 16-byte aligned.  Users can still align the key
if they want (and at least in this case, movups is just as fast as
movaps), but it's inconvenient to require it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
37919e239e lib/crypto: arm64/polyval: Migrate optimized code into library
Migrate the arm64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library
properly optimized on arm64.

This drops the arm64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely, as
it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
3d176751e5 lib/crypto: polyval: Add POLYVAL library
Add support for POLYVAL to lib/crypto/.

This will replace the polyval crypto_shash algorithm and its use in the
hctr2 template, simplifying the code and reducing overhead.

Specifically, this commit introduces the POLYVAL library API and a
generic implementation of it.  Later commits will migrate the existing
architecture-optimized implementations of POLYVAL into lib/crypto/ and
add a KUnit test suite.

I've also rewritten the generic implementation completely, using a more
modern approach instead of the traditional table-based approach.  It's
now constant-time, requires no precomputation or dynamic memory
allocations, decreases the per-key memory usage from 4096 bytes to 16
bytes, and is faster than the old polyval-generic even on bulk data
reusing the same key (at least on x86_64, where I measured 15% faster).
We should do this for GHASH too, but for now just do it for POLYVAL.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Menglong Dong
08ed5c81f6 lib/test_fprobe: add testcase for mixed fprobe
Add a test case for fprobe that hooks the same target with two fprobes:
one entry-only and one entry+exit.  The two fprobes are registered in
different orders.

fgraph and ftrace are both used by fprobe, and this test case covers the
mixed situation.

Link: https://lore.kernel.org/all/20251015083238.2374294-3-dongml2@chinatelecom.cn/

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-11-11 22:32:09 +09:00
Martin Kaiser
91a5409002 maple_tree: fix tracepoint string pointers
maple_tree tracepoints contain pointers to function names. Such a pointer
is saved when a tracepoint logs an event. There's no guarantee that it's
still valid when the event is parsed later and the pointer is dereferenced.

The kernel warns about these unsafe pointers.

	event 'ma_read' has unsafe pointer field 'fn'
	WARNING: kernel/trace/trace.c:3779 at ignore_event+0x1da/0x1e4

Mark the function names as tracepoint_string() to fix the events.

One case that doesn't work without this patch is using trace-cmd record
to save the binary ring buffer and trace-cmd report to parse it in
userspace.  The address of __func__ can't be dereferenced from userspace,
but tracepoint_string() adds an entry to
/sys/kernel/tracing/printk_formats.
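
An illustrative before/after of one call site (the actual maple_tree call
sites differ in detail):

	/* Before: a bare function-name pointer is stored in the ring buffer. */
	trace_ma_read(__func__, mas);

	/* After: the string is also registered in printk_formats, so the
	 * stored pointer can be resolved later, including from userspace. */
	trace_ma_read(tracepoint_string(__func__), mas);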

Link: https://lkml.kernel.org/r/20251030155537.87972-1-martin@kaiser.cx
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-09 21:19:45 -08:00
Linus Torvalds
a1388fcb52 Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
 "Two Curve25519 related fixes:

   - Re-enable KASAN support on curve25519-hacl64.c with gcc.

   - Disable the arm optimized Curve25519 code on CPU_BIG_ENDIAN
     kernels. It has always been broken in that configuration"

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: arm/curve25519: Disable on CPU_BIG_ENDIAN
  lib/crypto: curve25519-hacl64: Fix older clang KASAN workaround for GCC
2025-11-06 12:48:18 -08:00
Andy Shevchenko
6f15c3d715 bitops: Update kernel-doc in hweight.c to fix the issues with it
The kernel-doc in lib/hweight.c is global to the file and
currently has issues:

Warning: lib/hweight.c:13 expecting prototype for hweightN(). Prototype was for __sw_hweight32() instead
Warning: lib/hweight.c:13 function parameter 'w' not described in '__sw_hweight32'

Update it accordingly.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
2025-11-06 11:51:04 -05:00
Eric Biggers
8ba60c5914 lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
AVX-512 supports 3-input XORs via the vpternlogd (or vpternlogq)
instruction with immediate 0x96.  This approach, vs. the alternative of
two vpxor instructions, is already used in the CRC, AES-GCM, and AES-XTS
code, since it reduces the instruction count and is faster on some CPUs.
Make blake2s_compress_avx512() take advantage of it too.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
cd5528621a lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
Just before returning, blake2s_compress_ssse3() and
blake2s_compress_avx512() store updated values to the 'h', 't', and 'f'
fields of struct blake2s_ctx.  But 'f' is always unchanged (which is
correct; only the C code changes it).  So, there's no need to write to
'f'.  Use 64-bit stores (movq and vmovq) instead of 128-bit stores
(movdqu and vmovdqu) so that only 't' is written.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
a7acd77ebd lib/crypto: x86/blake2s: Improve readability
Various cleanups for readability.  No change to the generated code:

- Add some comments
- Add #defines for arguments
- Rename some labels
- Use decimal constants instead of hex where it makes sense.
  (The pshufd immediates intentionally remain as hex.)
- Add blank lines when there's a logical break

The round loop still could use some work, but this is at least a start.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
83c1a867c9 lib/crypto: x86/blake2s: Use local labels for data
Following the usual practice, prefix the names of the data labels with
".L" so that the assembler treats them as truly local.  This more
clearly expresses the intent and is less error-prone.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
c19bdf24cc lib/crypto: x86/blake2s: Drop check for nblocks == 0
Since blake2s_compress() is always passed nblocks != 0, remove the
unnecessary check for nblocks == 0 from blake2s_compress_ssse3().

Note that this makes it consistent with blake2s_compress_avx512() in the
same file as well as the arm32 blake2s_compress().

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
2f22115709 lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
In the C code, the 'inc' argument to the assembly functions
blake2s_compress_ssse3() and blake2s_compress_avx512() is declared with
type u32, matching blake2s_compress().  The assembly code then reads it
from the 64-bit %rcx.  However, the ABI doesn't guarantee zero-extension
to 64 bits, nor do gcc or clang guarantee it.  Therefore, fix these
functions to read this argument from the 32-bit %ecx.

In theory, this bug could have caused the wrong 'inc' value to be used,
causing incorrect BLAKE2s hashes.  In practice, probably not: I've fixed
essentially this same bug in many other assembly files too, but there's
never been a real report of it having caused a problem.  In x86_64, all
writes to 32-bit registers are zero-extended to 64 bits.  That results
in zero-extension in nearly all situations.  I've only been able to
demonstrate a lack of zero-extension with a somewhat contrived example
involving truncation, e.g. when the C code has a u64 variable holding
0x1234567800000040 and passes it as a u32 expecting it to be truncated
to 0x40 (64).  But that's not what the real code does, of course.

Fixes: ed0356eda1 ("crypto: blake2s - x86_64 SIMD implementation")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
95ce85de0b lib/crypto: arm, arm64: Drop filenames from file comments
Remove self-references to filenames from assembly files in
lib/crypto/arm/ and lib/crypto/arm64/.  This follows the recommended
practice and eliminates an outdated reference to sha2-ce-core.S.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102014809.170713-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
b8b816ec04 lib/crypto: arm/blake2s: Fix some comments
Fix the indices in some comments in blake2s-core.S.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102021553.176587-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
862445d3b9 lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
Some z/Architecture processors can compute a SHA-3 digest in a single
instruction.  arch/s390/crypto/ already uses this capability to optimize
the SHA-3 crypto_shash algorithms.

Use this capability to implement the sha3_224(), sha3_256(), sha3_384(),
and sha3_512() library functions too.

SHA3-256 benchmark results provided by Harald Freudenberger
(https://lore.kernel.org/r/4188d18bfcc8a64941c5ebd8de10ede2@linux.ibm.com/)
on a z/Architecture machine with "facility 86" (MSA level 12):

    Length (bytes)    Before (MB/s)   After (MB/s)
    ==============    =============   ============
          16                212             225
          64                820             915
         256               1850            3350
        1024               5400            8300
        4096              11200           11300

Note: the original data from Harald was given in the form of a graph for
each length, showing the distribution of throughputs from 500 runs.  I
guesstimated the peak of each one.

Harald also reported that the generic SHA-3 code was at most 259 MB/s
(https://lore.kernel.org/r/c39f6b6c110def0095e5da5becc12085@linux.ibm.com/).
So as expected, the earlier commit that optimized sha3_absorb_blocks()
and sha3_keccakf() is the more important one; it optimized the Keccak
permutation which is the most performance-critical part of SHA-3.
Still, this additional commit does notably improve performance further
on some lengths.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:41 -08:00
Eric Biggers
0354d3c1f1 lib/crypto: sha3: Support arch overrides of one-shot digest functions
Add support for architecture-specific overrides of sha3_224(),
sha3_256(), sha3_384(), and sha3_512().  This will be used to implement
these functions more efficiently on s390 than is possible via the usual
init + update + final flow.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
04171105d3 lib/crypto: s390/sha3: Add optimized Keccak functions
Implement sha3_absorb_blocks() and sha3_keccakf() using the hardware-
accelerated SHA-3 support in Message-Security-Assist Extension 6.

This accelerates the SHA3-224, SHA3-256, SHA3-384, SHA3-512, and
SHAKE256 library functions.

Note that arch/s390/crypto/ already has SHA-3 code that uses this
extension, but it is exposed only via crypto_shash.  This commit brings
the same acceleration to the SHA-3 library.  The arch/s390/crypto/
version will become redundant and be removed in later changes.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
1e29a75057 lib/crypto: arm64/sha3: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-3 code via arm64-specific
crypto_shash algorithms, just implement the sha3_absorb_blocks()
and sha3_keccakf() library functions.  This is much simpler, it makes
the SHA-3 library functions be arm64-optimized, and it fixes the
longstanding issue where the arm64-optimized SHA-3 code was disabled by
default.  SHA-3 still remains available through crypto_shash, but
individual architectures no longer need to handle it.

Note: to see the diff from arch/arm64/crypto/sha3-ce-glue.c to
lib/crypto/arm64/sha3.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
6fa873641c lib/crypto: sha3: Add FIPS cryptographic algorithm self-test
Since the SHA-3 algorithms are FIPS-approved, add the boot-time
self-test which is apparently required.  This closely follows the
corresponding SHA-1, SHA-256, and SHA-512 tests.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
David Howells
c0db39e253 lib/crypto: sha3: Move SHA3 Iota step mapping into round function
In crypto/sha3_generic.c, the keccakf() function calls keccakf_round()
to do four of Keccak-f's five step mappings.  However, it does not do
the Iota step mapping - presumably because that is dependent on round
number, whereas Theta, Rho, Pi and Chi are not.

Note that the keccakf_round() function needs to be explicitly
non-inlined on certain architectures as gcc's produced output will (or
used to) use over 1KiB of stack space if inlined.

Now, this code was copied more or less verbatim into lib/crypto/sha3.c,
so that has the same aesthetic issue.  Fix this there by passing the
round number into sha3_keccakf_one_round_generic() and doing the Iota
step mapping there.

crypto/sha3_generic.c is left untouched as that will be converted to use
lib/crypto/sha3.c at some point.
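
The reworked round function then looks roughly like this (the other step
mappings are elided and the round-constant table name is illustrative):

	static void sha3_keccakf_one_round_generic(u64 st[25], int round)
	{
		/* Theta, Rho, Pi and Chi step mappings (unchanged, elided) ... */

		/* Iota: fold the round-dependent constant into lane (0, 0). */
		st[0] ^= sha3_round_constants[round];
	}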

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
David Howells
0593447248 lib/crypto: sha3: Add SHA-3 support
Add SHA-3 support to lib/crypto/.  All six algorithms in the SHA-3
family are supported: four digests (SHA3-224, SHA3-256, SHA3-384, and
SHA3-512) and two extendable-output functions (SHAKE128 and SHAKE256).

The SHAKE algorithms will be required for ML-DSA.
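
A hypothetical usage sketch, assuming the one-shot helpers follow the same
calling convention as the existing sha256()/sha512() library functions
(header location and exact prototypes are assumptions here):

	#include <crypto/sha3.h>

	static void hash_example(const u8 *data, size_t len)
	{
		u8 digest[SHA3_256_DIGEST_SIZE];

		sha3_256(data, len, digest);	/* one-shot SHA3-256 */
	}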

[EB: simplified the API to use fewer types and functions, fixed bug that
     sometimes caused incorrect SHAKE output, cleaned up the
     documentation, dropped an ad-hoc test that was inconsistent with
     the rest of lib/crypto/, and many other cleanups]

Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:32 -08:00