Commit Graph

1397375 Commits

Author SHA1 Message Date
Eric Biggers
d35abc0b1d crypto: hctr2 - Convert to use POLYVAL library
The "hash function" in hctr2 is fixed at POLYVAL; it can never vary.
Just use the POLYVAL library, which is much easier to use than the
crypto_shash API.  It's faster, uses fixed-size structs, and never fails
(all the functions return void).

Note that this eliminates the only known user of the polyval support in
crypto_shash.  A later commit will remove support for polyval from
crypto_shash, given that the library API is sufficient.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
4d8da35579 lib/crypto: x86/polyval: Migrate optimized code into library
Migrate the x86_64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library be
properly optimized on x86_64.

This drops the x86_64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely
since it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Also replace a movaps instruction with movups to remove the assumption
that the key struct is 16-byte aligned.  Users can still align the key
if they want (and at least in this case, movups is just as fast as
movaps), but it's inconvenient to require it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
37919e239e lib/crypto: arm64/polyval: Migrate optimized code into library
Migrate the arm64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library be
properly optimized on arm64.

This drops the arm64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely
since it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
3d176751e5 lib/crypto: polyval: Add POLYVAL library
Add support for POLYVAL to lib/crypto/.

This will replace the polyval crypto_shash algorithm and its use in the
hctr2 template, simplifying the code and reducing overhead.

Specifically, this commit introduces the POLYVAL library API and a
generic implementation of it.  Later commits will migrate the existing
architecture-optimized implementations of POLYVAL into lib/crypto/ and
add a KUnit test suite.

I've also rewritten the generic implementation completely, using a more
modern approach instead of the traditional table-based approach.  It's
now constant-time, requires no precomputation or dynamic memory
allocations, decreases the per-key memory usage from 4096 bytes to 16
bytes, and is faster than the old polyval-generic even on bulk data
reusing the same key (at least on x86_64, where I measured 15% faster).
We should do this for GHASH too, but for now just do it for POLYVAL.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
e1c3608497 crypto: polyval - Rename conflicting functions
Rename polyval_init() and polyval_update(), in preparation for adding
library functions with the same name to <crypto/polyval.h>.

Note that polyval-generic.c will be removed later, as it will be
superseded by the library.  This commit just keeps the kernel building
for the initial introduction of the library.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
8ba60c5914 lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
AVX-512 supports 3-input XORs via the vpternlogd (or vpternlogq)
instruction with immediate 0x96.  This approach, vs. the alternative of
two vpxor instructions, is already used in the CRC, AES-GCM, and AES-XTS
code, since it reduces the instruction count and is faster on some CPUs.
Make blake2s_compress_avx512() take advantage of it too.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
cd5528621a lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
Just before returning, blake2s_compress_ssse3() and
blake2s_compress_avx512() store updated values to the 'h', 't', and 'f'
fields of struct blake2s_ctx.  But 'f' is always unchanged (which is
correct; only the C code changes it).  So, there's no need to write to
'f'.  Use 64-bit stores (movq and vmovq) instead of 128-bit stores
(movdqu and vmovdqu) so that only 't' is written.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
a7acd77ebd lib/crypto: x86/blake2s: Improve readability
Various cleanups for readability.  No change to the generated code:

- Add some comments
- Add #defines for arguments
- Rename some labels
- Use decimal constants instead of hex where it makes sense.
  (The pshufd immediates intentionally remain as hex.)
- Add blank lines when there's a logical break

The round loop still could use some work, but this is at least a start.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
83c1a867c9 lib/crypto: x86/blake2s: Use local labels for data
Following the usual practice, prefix the names of the data labels with
".L" so that the assembler treats them as truly local.  This more
clearly expresses the intent and is less error-prone.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
c19bdf24cc lib/crypto: x86/blake2s: Drop check for nblocks == 0
Since blake2s_compress() is always passed nblocks != 0, remove the
unnecessary check for nblocks == 0 from blake2s_compress_ssse3().

Note that this makes it consistent with blake2s_compress_avx512() in the
same file as well as the arm32 blake2s_compress().

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:52 -08:00
Eric Biggers
2f22115709 lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
In the C code, the 'inc' argument to the assembly functions
blake2s_compress_ssse3() and blake2s_compress_avx512() is declared with
type u32, matching blake2s_compress().  The assembly code then reads it
from the 64-bit %rcx.  However, the ABI doesn't guarantee zero-extension
to 64 bits, nor do gcc or clang guarantee it.  Therefore, fix these
functions to read this argument from the 32-bit %ecx.

In theory, this bug could have caused the wrong 'inc' value to be used,
causing incorrect BLAKE2s hashes.  In practice, probably not: I've fixed
essentially this same bug in many other assembly files too, but there's
never been a real report of it having caused a problem.  In x86_64, all
writes to 32-bit registers are zero-extended to 64 bits.  That results
in zero-extension in nearly all situations.  I've only been able to
demonstrate a lack of zero-extension with a somewhat contrived example
involving truncation, e.g. when the C code has a u64 variable holding
0x1234567800000040 and passes it as a u32 expecting it to be truncated
to 0x40 (64).  But that's not what the real code does, of course.

Fixes: ed0356eda1 ("crypto: blake2s - x86_64 SIMD implementation")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
95ce85de0b lib/crypto: arm, arm64: Drop filenames from file comments
Remove self-references to filenames from assembly files in
lib/crypto/arm/ and lib/crypto/arm64/.  This follows the recommended
practice and eliminates an outdated reference to sha2-ce-core.S.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102014809.170713-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
b8b816ec04 lib/crypto: arm/blake2s: Fix some comments
Fix the indices in some comments in blake2s-core.S.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102021553.176587-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
496df7cd64 crypto: s390/sha3 - Remove superseded SHA-3 code
The SHA-3 library now utilizes the same s390 SHA-3 acceleration
capabilities as the arch/s390/crypto/ SHA-3 crypto_shash algorithms.
Moreover, crypto/sha3.c now uses the SHA-3 library.  The result is that
all SHA-3 APIs are now s390-accelerated without any need for the old
SHA-3 code in arch/s390/crypto/.  Remove this superseded code.

Also update the s390 defconfig and debug_defconfig files to enable
CONFIG_CRYPTO_SHA3 instead of CONFIG_CRYPTO_SHA3_256_S390 and
CONFIG_CRYPTO_SHA3_512_S390.  This makes it so that the s390-optimized
SHA-3 continues to be built when either of these defconfigs is used.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
f1799d1728 crypto: sha3 - Reimplement using library API
Replace sha3_generic.c with a new file sha3.c which implements the SHA-3
crypto_shash algorithms on top of the SHA-3 library API.

Change the driver name suffix from "-generic" to "-lib" to reflect that
these algorithms now just use the (possibly arch-optimized) library.

This closely mirrors crypto/{md5,sha1,sha256,sha512,blake2b}.c.

Implement export_core and import_core, since crypto/hmac.c expects these
to be present.  (Note that there is no security purpose in wrapping
SHA-3 with HMAC.  HMAC was designed for older algorithms that don't
resist length extension attacks.  But since someone could be using
"hmac(sha3-*)" via crypto_shash anyway, keep supporting it for now.)

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
d280d4d56a crypto: jitterentropy - Use default sha3 implementation
Make jitterentropy use "sha3-256" instead of "sha3-256-generic", as the
ability to explicitly request the generic code is going away.  It's not
worth providing a special generic API just for jitterentropy.  There are
many other solutions available to it, such as doing more iterations or
using a more effective jitter collection method.

Moreover, the status quo is that SHA-3 is quite slow anyway.  Currently
only arm64 and s390 have architecture-optimized SHA-3 code.  I'm not
familiar with the performance of the s390 one, but the arm64 one isn't
actually that much faster than the generic code anyway.

Note that jitterentropy should just use the library API instead of
crypto_shash.  But that belongs in a separate change later.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:51 -08:00
Eric Biggers
862445d3b9 lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
Some z/Architecture processors can compute a SHA-3 digest in a single
instruction.  arch/s390/crypto/ already uses this capability to optimize
the SHA-3 crypto_shash algorithms.

Use this capability to implement the sha3_224(), sha3_256(), sha3_384(),
and sha3_512() library functions too.

SHA3-256 benchmark results provided by Harald Freudenberger
(https://lore.kernel.org/r/4188d18bfcc8a64941c5ebd8de10ede2@linux.ibm.com/)
on a z/Architecture machine with "facility 86" (MSA level 12):

    Length (bytes)    Before (MB/s)   After (MB/s)
    ==============    =============   ============
          16                212             225
          64                820             915
         256               1850            3350
        1024               5400            8300
        4096              11200           11300

Note: the original data from Harald was given in the form of a graph for
each length, showing the distribution of throughputs from 500 runs.  I
guesstimated the peak of each one.

Harald also reported that the generic SHA-3 code was at most 259 MB/s
(https://lore.kernel.org/r/c39f6b6c110def0095e5da5becc12085@linux.ibm.com/).
So as expected, the earlier commit that optimized sha3_absorb_blocks()
and sha3_keccakf() is the more important one; it optimized the Keccak
permutation which is the most performance-critical part of SHA-3.
Still, this additional commit does notably improve performance further
on some lengths.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:30:41 -08:00
Eric Biggers
0354d3c1f1 lib/crypto: sha3: Support arch overrides of one-shot digest functions
Add support for architecture-specific overrides of sha3_224(),
sha3_256(), sha3_384(), and sha3_512().  This will be used to implement
these functions more efficiently on s390 than is possible via the usual
init + update + final flow.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
04171105d3 lib/crypto: s390/sha3: Add optimized Keccak functions
Implement sha3_absorb_blocks() and sha3_keccakf() using the hardware-
accelerated SHA-3 support in Message-Security-Assist Extension 6.

This accelerates the SHA3-224, SHA3-256, SHA3-384, SHA3-512, and
SHAKE256 library functions.

Note that arch/s390/crypto/ already has SHA-3 code that uses this
extension, but it is exposed only via crypto_shash.  This commit brings
the same acceleration to the SHA-3 library.  The arch/s390/crypto/
version will become redundant and be removed in later changes.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
1e29a75057 lib/crypto: arm64/sha3: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-3 code via arm64-specific
crypto_shash algorithms, instead just implement the sha3_absorb_blocks()
and sha3_keccakf() library functions.  This is much simpler, it makes
the SHA-3 library functions be arm64-optimized, and it fixes the
longstanding issue where the arm64-optimized SHA-3 code was disabled by
default.  SHA-3 still remains available through crypto_shash, but
individual architectures no longer need to handle it.

Note: to see the diff from arch/arm64/crypto/sha3-ce-glue.c to
lib/crypto/arm64/sha3.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
be755eb2b0 crypto: arm64/sha3 - Update sha3_ce_transform() to prepare for library
- Use size_t lengths, to match the library.

- Pass the block size instead of digest size, and add support for the
  block size that SHAKE128 uses.  This allows the code to be used with
  SHAKE128 and SHAKE256, which don't have the concept of a digest size.
  SHAKE256 has the same block size as SHA3-256, but SHAKE128 has a
  unique block size.  Thus, there are now 5 supported block sizes.

Don't bother changing the "glue" code arm64_sha3_update() too much, as
it gets deleted when the SHA-3 code is migrated into lib/crypto/ anyway.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
6fa873641c lib/crypto: sha3: Add FIPS cryptographic algorithm self-test
Since the SHA-3 algorithms are FIPS-approved, add the boot-time
self-test which is apparently required.  This closely follows the
corresponding SHA-1, SHA-256, and SHA-512 tests.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
David Howells
c0db39e253 lib/crypto: sha3: Move SHA3 Iota step mapping into round function
In crypto/sha3_generic.c, the keccakf() function calls keccakf_round()
to do four of Keccak-f's five step mappings.  However, it does not do
the Iota step mapping - presumably because that is dependent on round
number, whereas Theta, Rho, Pi and Chi are not.

Note that the keccakf_round() function needs to be explicitly
non-inlined on certain architectures as gcc's produced output will (or
used to) use over 1KiB of stack space if inlined.

Now, this code was copied more or less verbatim into lib/crypto/sha3.c,
so that has the same aesthetic issue.  Fix this there by passing the
round number into sha3_keccakf_one_round_generic() and doing the Iota
step mapping there.

crypto/sha3_generic.c is left untouched as that will be converted to use
lib/crypto/sha3.c at some point.

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
David Howells
0593447248 lib/crypto: sha3: Add SHA-3 support
Add SHA-3 support to lib/crypto/.  All six algorithms in the SHA-3
family are supported: four digests (SHA3-224, SHA3-256, SHA3-384, and
SHA3-512) and two extendable-output functions (SHAKE128 and SHAKE256).

The SHAKE algorithms will be required for ML-DSA.

[EB: simplified the API to use fewer types and functions, fixed bug that
     sometimes caused incorrect SHAKE output, cleaned up the
     documentation, dropped an ad-hoc test that was inconsistent with
     the rest of lib/crypto/, and many other cleanups]

Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:32 -08:00
David Howells
4141211903 crypto: arm64/sha3 - Rename conflicting function
Rename the arm64 sha3_update() function to have an "arm64_" prefix to
avoid a name conflict with the upcoming SHA-3 library.

Note: this code will be superseded later.  This commit simply keeps the
kernel building for the initial introduction of the library.

[EB: dropped unnecessary rename of sha3_finup(), and improved commit
     message]

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-03 09:10:58 -08:00
David Howells
863ee5a3aa crypto: s390/sha3 - Rename conflicting functions
Rename the s390 sha3_*_init() functions to have an "s390_" prefix to
avoid a name conflict with the upcoming SHA-3 library functions.

Note: this code will be superseded later.  This commit simply keeps the
kernel building for the initial introduction of the library.

[EB: dropped unnecessary rename of import and export functions, and
     improved commit message]

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-03 09:10:58 -08:00
Eric Biggers
fa3ca9bfe3 crypto: blake2b - Reimplement using library API
Replace blake2b_generic.c with a new file blake2b.c which implements the
BLAKE2b crypto_shash algorithms on top of the BLAKE2b library API.

Change the driver name suffix from "-generic" to "-lib" to reflect that
these algorithms now just use the (possibly arch-optimized) library.

This closely mirrors crypto/{md5,sha1,sha256,sha512}.c.

Remove include/crypto/internal/blake2b.h since it is no longer used.
Likewise, remove struct blake2b_state from include/crypto/blake2b.h.

Omit support for import_core and export_core, since there are no legacy
drivers that need these for these algorithms.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
ba6617bd47 lib/crypto: arm/blake2b: Migrate optimized code into library
Migrate the arm-optimized BLAKE2b code from arch/arm/crypto/ to
lib/crypto/arm/.  This makes the BLAKE2b library able to use it, and it
also simplifies the code because it's easier to integrate with the
library than crypto_shash.

This temporarily makes the arm-optimized BLAKE2b code unavailable via
crypto_shash.  A later commit reimplements the blake2b-* crypto_shash
algorithms on top of the BLAKE2b library API, making it available again.

Note that as per the lib/crypto/ convention, the optimized code is now
enabled by default.  So, this also fixes the longstanding issue where
the optimized BLAKE2b code was not enabled by default.

To see the diff from arch/arm/crypto/blake2b-neon-glue.c to
lib/crypto/arm/blake2b.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
23a16c9533 lib/crypto: blake2b: Add BLAKE2b library functions
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API.

This will allow in-kernel users such as btrfs to use BLAKE2b without
going through the generic crypto layer.  In addition, as usual the
BLAKE2b crypto_shash algorithms will be reimplemented on top of this.

Note: to create lib/crypto/blake2b.c I made a copy of
lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b.  This
way, the BLAKE2s and BLAKE2b code is kept consistent.  Therefore, it
borrows the SPDX-License-Identifier and Copyright from
lib/crypto/blake2s.c rather than crypto/blake2b_generic.c.

The library API uses 'struct blake2b_ctx', consistent with other
lib/crypto/ APIs.  The existing 'struct blake2b_state' will be removed
once the blake2b crypto_shash algorithms are updated to stop using it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
c99d307060 byteorder: Add le64_to_cpu_array() and cpu_to_le64_array()
Add le64_to_cpu_array() and cpu_to_le64_array().  These mirror the
corresponding 32-bit functions.

These will be used by the BLAKE2b code.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
b95d4471cb lib/crypto: blake2s: Document the BLAKE2s library API
Add kerneldoc for the BLAKE2s library API.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
5385bcbffe lib/crypto: blake2s: Drop excessive const & rename block => data
A couple more small cleanups to the BLAKE2s code before these things get
propagated into the BLAKE2b code:

- Drop 'const' from some non-pointer function parameters.  It was a bit
  excessive and not conventional.

- Rename 'block' argument of blake2s_compress*() to 'data'.  This is for
  consistency with the SHA-* code, and also to avoid the implication
  that it points to a singular "block".

No functional changes.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
5e0ec8e46d lib/crypto: blake2s: Rename blake2s_state to blake2s_ctx
For consistency with the SHA-1, SHA-2, SHA-3 (in development), and MD5
library APIs, rename blake2s_state to blake2s_ctx.

As a refresher, the ctx name:

- Is a bit shorter.
- Avoids confusion with the compression function state, which is also
  often called the state (but is just part of the full context).
- Is consistent with OpenSSL.

Not a big deal, of course.  But consistency is nice.  With a BLAKE2b
library API about to be added, this is a convenient time to update this.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
50b8e36994 lib/crypto: blake2s: Adjust parameter order of blake2s()
Reorder the parameters of blake2s() from (out, in, key, outlen, inlen,
keylen) to (key, keylen, in, inlen, out, outlen).

This aligns BLAKE2s with the common conventions of pairing buffers and
their lengths, and having outputs follow inputs.  This is widely used
elsewhere in lib/crypto/ and crypto/, and even elsewhere in the BLAKE2s
code itself such as blake2s_init_key() and blake2s_final().  So
blake2s() was a bit of an exception.

Notably, this results in the same order as hmac_*_usingrawkey().

Note that since the type signature changed, it's not possible for a
blake2s() call site to be silently missed.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
04cadb4fe0 lib/crypto: Add FIPS self-tests for SHA-1 and SHA-2
Add FIPS cryptographic algorithm self-tests for all SHA-1 and SHA-2
algorithms.  Following the "Implementation Guidance for FIPS 140-3"
document, to achieve this it's sufficient to just test a single test
vector for each of HMAC-SHA1, HMAC-SHA256, and HMAC-SHA512.

Just run these tests in the initcalls, following the example of e.g.
crypto/kdf_sp800108.c.  Note that this should meet the FIPS self-test
requirement even in the built-in case, given that the initcalls run
before userspace, storage, network, etc. are accessible.

This does not fix a regression, seeing as lib/ has had SHA-1 support
since 2005 and SHA-256 support since 2018.  Neither ever had FIPS
self-tests.  Moreover, fips=1 support has always been an unfinished
feature upstream.  However, with lib/ now being used more widely, it's
now seeing more scrutiny and people seem to want these now [1][2].

[1] https://lore.kernel.org/r/3226361.1758126043@warthog.procyon.org.uk/
[2] https://lore.kernel.org/r/f31dbb22-0add-481c-aee0-e337a7731f8e@oracle.com/

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251011001047.51886-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Linus Torvalds
dcb6fa37fd Linux 6.18-rc3 v6.18-rc3 2025-10-26 15:59:49 -07:00
Linus Torvalds
4bb1f7e19c Merge tag 'char-misc-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small char/misc/android driver fixes for 6.18-rc3 for
  reported issues. Included in here are:

   - rust binder fixes for reported issues

   - mei device id addition

   - mei driver fixes

   - comedi bugfix

   - most usb driver bugfixes

   - fastrpc memory leak fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  most: usb: hdm_probe: Fix calling put_device() before device initialization
  most: usb: Fix use-after-free in hdm_disconnect
  binder: remove "invalid inc weak" check
  mei: txe: fix initialization order
  comedi: fix divide-by-zero in comedi_buf_munge()
  mei: late_bind: Fix -Wincompatible-function-pointer-types-strict
  misc: fastrpc: Fix dma_buf object leak in fastrpc_map_lookup
  mei: me: add wildcat lake P DID
  misc: amd-sbi: Clarify that this is a BMC driver
  nvmem: rcar-efuse: add missing MODULE_DEVICE_TABLE
  binder: Fix missing kernel-doc entries in binder.c
  rust_binder: report freeze notification only when fully frozen
  rust_binder: don't delete FreezeListener if there are pending duplicates
  rust_binder: freeze_notif_done should resend if wrong state
  rust_binder: remove warning about orphan mappings
  rust_binder: clean `clippy::mem_replace_with_default` warning
2025-10-26 10:33:46 -07:00
Linus Torvalds
40282418e1 Merge tag 'staging-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
 "Here are some small staging driver fixes for the gpib subsystem to
  resolve some reported issues. Included in here are:

   - memory leak fixes

   - error code fixes

   - proper protocol fixes

  All of these have been in linux-next for almost 2 weeks now with no
  reported issues"

* tag 'staging-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: gpib: Fix device reference leak in fmh_gpib driver
  staging: gpib: Return -EINTR on device clear
  staging: gpib: Fix sending clear and trigger events
  staging: gpib: Fix no EOI on 1 and 2 byte writes
2025-10-26 10:29:45 -07:00
Linus Torvalds
aa6085a067 Merge tag 'tty-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
 "Here are some small tty and serial driver fixes for reported issues.
  Included in here are:

   - sh-sci serial driver fixes

   - 8250_dw and _mtk driver fixes

   - sc16is7xx driver bugfix

   - new 8250_exar device ids added

  All of these have been in linux-next this past week with no reported
  issues"

* tag 'tty-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250_mtk: Enable baud clock and manage in runtime PM
  serial: 8250_dw: handle reset control deassert error
  dt-bindings: serial: sh-sci: Fix r8a78000 interrupts
  serial: sc16is7xx: remove useless enable of enhanced features
  serial: 8250_exar: add support for Advantech 2 port card with Device ID 0x0018
  tty: serial: sh-sci: fix RSCI FIFO overrun handling
2025-10-26 10:24:39 -07:00
Linus Torvalds
6190d0fa18 Merge tag 'usb-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
 "Here are some small USB driver fixes and new device ids for 6.18-rc3.
  Included in here are:

   - new option serial driver device ids added

   - dt bindings fixes for numerous platforms

   - xhci bugfixes for many reported regressions

   - usbio dependency bugfix

   - dwc3 driver fix

   - raw-gadget bugfix

  All of these have been in linux-next this week with no reported issues"

* tag 'usb-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: option: add Telit FN920C04 ECM compositions
  USB: serial: option: add Quectel RG255C
  tcpm: switch check for role_sw device with fw_node
  usb/core/quirks: Add Huawei ME906S to wakeup quirk
  usb: raw-gadget: do not limit transfer length
  USB: serial: option: add UNISOC UIS7720
  xhci: dbc: enable back DbC in resume if it was enabled before suspend
  xhci: dbc: fix bogus 1024 byte prefix if ttyDBC read races with stall event
  usb: xhci-pci: Fix USB2-only root hub registration
  dt-bindings: usb: qcom,snps-dwc3: Fix bindings for X1E80100
  usb: misc: Add x86 dependency for Intel USBIO driver
  dt-bindings: usb: switch: split out ports definition
  usb: dwc3: Don't call clk_bulk_disable_unprepare() twice
  dt-bindings: usb: dwc3-imx8mp: dma-range is required only for imx8mp
2025-10-26 10:21:13 -07:00
Linus Torvalds
dbfc6422a3 Merge tag 'x86_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Remove dead code leftovers after a recent mitigations cleanup which
   fail a Clang build

 - Make sure a Retbleed mitigation message is printed only when
   necessary

 - Correct the last Zen1 microcode revision for which Entrysign sha256
   check is needed

 - Fix a NULL ptr deref when mounting the resctrl fs on a system which
   supports assignable counters but where L3 total and local bandwidth
   monitoring has been disabled at boot

* tag 'x86_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bugs: Remove dead code which might prevent from building
  x86/bugs: Qualify RETBLEED_INTEL_MSG
  x86/microcode: Fix Entrysign revision check for Zen1/Naples
  x86,fs/resctrl: Fix NULL pointer dereference with events force-disabled in mbm_event mode
2025-10-26 09:57:18 -07:00
Linus Torvalds
5fee0dafba Merge tag 'irq_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:

 - Restore the original buslock locking in a couple of places in the irq
   core subsystem after a rework

* tag 'irq_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/manage: Add buslock back in to enable_irq()
  genirq/manage: Add buslock back in to __disable_irq_nosync()
  genirq/chip: Add buslock back in to irq_set_handler()
2025-10-26 09:54:36 -07:00
Linus Torvalds
af8159515f Merge tag 'objtool_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:

 - Fix x32 build due to wrong format specifier on that sub-arch

 - Add one more Rust noreturn function to objtool's list

* tag 'objtool_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix failure when being compiled on x32 system
  objtool/rust: add one more `noreturn` Rust function
2025-10-26 09:44:36 -07:00
Linus Torvalds
1bc9743b64 Merge tag 'sched_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:

 - Make sure a CFS runqueue on a throttled hierarchy has its PELT clock
   throttled otherwise task movement and manipulation would lead to
   dangling cfs_rq references and an eventual crash

* tag 'sched_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Start a cfs_rq on throttled hierarchy with PELT clock throttled
2025-10-26 09:42:19 -07:00
Linus Torvalds
7ea5092f52 Merge tag 'timers_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Borislav Petkov:

 - Do not create more than eight (max supported) AUX clocks sysfs
   hierarchies

* tag 'timers_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix aux clocks sysfs initialization loop bound
2025-10-26 09:40:16 -07:00
Linus Torvalds
72761a7e31 Merge tag 'driver-core-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:

 - In Device::parent(), do not make any assumptions on the device
   context of the parent device

 - Check visibility before changing ownership of a sysfs attribute
   group

 - In topology_parse_cpu_capacity(), replace an incorrect usage of
   PTR_ERR_OR_ZERO() with IS_ERR_OR_NULL()

 - In devcoredump, fix a circular locking dependency between
   struct devcd_entry::mutex and kernfs

 - Do not warn about a pending fw_devlink sync state

* tag 'driver-core-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
  arch_topology: Fix incorrect error check in topology_parse_cpu_capacity()
  rust: device: fix device context of Device::parent()
  sysfs: check visibility before changing group attribute ownership
  devcoredump: Fix circular locking dependency with devcd->mutex.
  driver core: fw_devlink: Don't warn about sync_state() pending
2025-10-25 11:03:46 -07:00
Linus Torvalds
818444a61b Merge tag 'firewire-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fixes from Takashi Sakamoto:
 "A small collection of FireWire fixes. This includes corrections to
  sparse and API documentation"

* tag 'firewire-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: init_ohci1394_dma: add missing function parameter documentation
  firewire: core: fix __must_hold() annotation
2025-10-25 10:58:32 -07:00
Linus Torvalds
9bb956508c Merge tag 'riscv-for-linus-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:

 - Close a race during boot between userspace vDSO usage and some
   late-initialized vDSO data

 - Improve performance on systems with non-CPU-cache-coherent
   DMA-capable peripherals by enabling write combining on
   pgprot_dmacoherent() allocations

 - Add human-readable detail for RISC-V IPI tracing

 - Provide more information to zsmalloc on 64-bit RISC-V to improve
   allocation

 - Silence useless boot messages about CPUs that have been disabled in
   DT

 - Resolve some compiler and smatch warnings and remove a redundant
   macro

* tag 'riscv-for-linus-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: hwprobe: avoid uninitialized variable use in hwprobe_arch_id()
  riscv: cpufeature: avoid uninitialized variable in has_thead_homogeneous_vlenb()
  riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot
  riscv: add a forward declaration for cpuinfo_op
  RISC-V: Don't print details of CPUs disabled in DT
  riscv: Remove the PER_CPU_OFFSET_SHIFT macro
  riscv: mm: Define MAX_POSSIBLE_PHYSMEM_BITS for zsmalloc
  riscv: Register IPI IRQs with unique names
  ACPI: RIMT: Fix unused function warnings when CONFIG_IOMMU_API is disabled
  RISC-V: Define pgprot_dmacoherent() for non-coherent devices
2025-10-25 09:35:26 -07:00
Linus Torvalds
27c0b5c4f6 Merge tag 'xfs-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
 "The main highlight here is a fix for a bug brought in by the removal
  of attr2 mount option, where some installations might actually have
  'attr2' explicitly configured in fstab preventing system to boot by
  not being able to remount the rootfs as RW.

  Besides that there are a couple fix to the zonefs implementation,
  changing XFS_ONLINE_SCRUB_STATS to depend on DEBUG_FS (was select
  before), and some other minor changes"

* tag 'xfs-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix locking in xchk_nlinks_collect_dir
  xfs: loudly complain about defunct mount options
  xfs: always warn about deprecated mount options
  xfs: don't set bt_nr_sectors to a negative number
  xfs: don't use __GFP_NOFAIL in xfs_init_fs_context
  xfs: cache open zone in inode->i_private
  xfs: avoid busy loops in GCD
  xfs: XFS_ONLINE_SCRUB_STATS should depend on DEBUG_FS
  xfs: do not tightly pack-write large files
  xfs: Improve CONFIG_XFS_RT Kconfig help
2025-10-25 09:31:13 -07:00
Linus Torvalds
566771afc7 Merge tag 'v6.18-rc2-smb-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
 "smbdirect (RDMA) fixes in order avoid potential submission queue
  overflows:

   - free transport teardown fix

   - credit related fixes (five server related, one client related)"

* tag 'v6.18-rc2-smb-server-fixes' of git://git.samba.org/ksmbd:
  smb: server: let free_transport() wait for SMBDIRECT_SOCKET_DISCONNECTED
  smb: client: make use of smbdirect_socket.send_io.lcredits.*
  smb: server: make use of smbdirect_socket.send_io.lcredits.*
  smb: server: simplify sibling_list handling in smb_direct_flush_send_list/send_done
  smb: server: smb_direct_disconnect_rdma_connection() already wakes all waiters on error
  smb: smbdirect: introduce smbdirect_socket.send_io.lcredits.*
  smb: server: allocate enough space for RW WRs and ib_drain_qp()
2025-10-24 18:50:15 -07:00