Commit Graph

1335528 Commits

Author SHA1 Message Date
Eric Biggers
e9787deff4 crypto: x86/aes-gcm - use the new scatterwalk functions
In gcm_process_assoc(), use scatterwalk_next() which consolidates
scatterwalk_clamp() and scatterwalk_map().  Use scatterwalk_done_src()
which consolidates scatterwalk_unmap(), scatterwalk_advance(), and
scatterwalk_done().

Also rename some variables to avoid implying that anything is actually
mapped (it's not), or that the loop is going page by page (it is for
now, but nothing actually requires that to be the case).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:44 +08:00
Eric Biggers
95c47514b9 crypto: stm32 - use the new scatterwalk functions
Replace calls to the deprecated function scatterwalk_copychunks() with
memcpy_from_scatterwalk(), memcpy_to_scatterwalk(), scatterwalk_skip(),
or scatterwalk_start_at_pos() as appropriate.

Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Maxime Méré <maxime.mere@foss.st.com>
Cc: Thomas Bourgoin <thomas.bourgoin@foss.st.com>
Cc: linux-stm32@st-md-mailman.stormreply.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:44 +08:00
Eric Biggers
323abf2569 crypto: s5p-sss - use the new scatterwalk functions
s5p_sg_copy_buf() open-coded a copy from/to a scatterlist using
scatterwalk_* functions that are planned for removal.  Replace it with
the new functions memcpy_from_sglist() and memcpy_to_sglist() instead.
Also take the opportunity to replace calls to scatterwalk_map_and_copy()
in the same file; this eliminates the confusing 'out' argument.

Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Vladimir Zapolskiy <vz@mleia.com>
Cc: linux-samsung-soc@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:44 +08:00
Eric Biggers
e7d5d8a86d crypto: s390/aes-gcm - use the new scatterwalk functions
Use scatterwalk_next() which consolidates scatterwalk_clamp() and
scatterwalk_map().  Use scatterwalk_done_src() and
scatterwalk_done_dst() which consolidate scatterwalk_unmap(),
scatterwalk_advance(), and scatterwalk_done().

Besides the new functions being a bit easier to use, this is necessary
because scatterwalk_done() is planned to be removed.

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Cc: Holger Dengler <dengler@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:44 +08:00
Eric Biggers
422bf8fc99 crypto: nx - use the new scatterwalk functions
- In nx_walk_and_build(), use scatterwalk_start_at_pos() instead of a
  more complex way to achieve the same result.

- Also in nx_walk_and_build(), use the new functions scatterwalk_next()
  which consolidates scatterwalk_clamp() and scatterwalk_map(), and use
  scatterwalk_done_src() which consolidates scatterwalk_unmap(),
  scatterwalk_advance(), and scatterwalk_done().  Remove unnecessary
  code that seemed to be intended to advance to the next sg entry, which
  is already handled by the scatterwalk functions.

  Note that nx_walk_and_build() does not actually read or write the
  mapped virtual address, and thus it is misusing the scatter_walk API.
  It really should just access the scatterlist directly.  This patch
  does not try to address this existing issue.

- In nx_gca(), use memcpy_from_sglist() instead of a more complex way to
  achieve the same result.

- In various functions, replace calls to scatterwalk_map_and_copy() with
  memcpy_from_sglist() or memcpy_to_sglist() as appropriate.  Note that
  this eliminates the confusing 'out' argument (which this driver had
  tried to work around by defining the missing constants for it...)

Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
8fd0eecd55 crypto: arm64 - use the new scatterwalk functions
Use scatterwalk_next() which consolidates scatterwalk_clamp() and
scatterwalk_map(), and use scatterwalk_done_src() which consolidates
scatterwalk_unmap(), scatterwalk_advance(), and scatterwalk_done().
Remove unnecessary code that seemed to be intended to advance to the
next sg entry, which is already handled by the scatterwalk functions.
Adjust variable naming slightly to keep things consistent.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
5dc14e0bce crypto: arm/ghash - use the new scatterwalk functions
Use scatterwalk_next() which consolidates scatterwalk_clamp() and
scatterwalk_map(), and use scatterwalk_done_src() which consolidates
scatterwalk_unmap(), scatterwalk_advance(), and scatterwalk_done().
Remove unnecessary code that seemed to be intended to advance to the
next sg entry, which is already handled by the scatterwalk functions.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
c89edd931a crypto: aegis - use the new scatterwalk functions
Use scatterwalk_next() which consolidates scatterwalk_clamp() and
scatterwalk_map(), and use scatterwalk_done_src() which consolidates
scatterwalk_unmap(), scatterwalk_advance(), and scatterwalk_done().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
cb25dbb605 crypto: skcipher - use scatterwalk_start_at_pos()
In skcipher_walk_aead_common(), use scatterwalk_start_at_pos() instead
of a sequence of scatterwalk_start(), scatterwalk_copychunks(..., 2),
and scatterwalk_done().  This is simpler and faster.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
84b1576355 crypto: scatterwalk - add scatterwalk_get_sglist()
Add a function that creates a scatterlist that represents the remaining
data in a walk.  This will be used to replace chain_to_walk() in
net/tls/tls_device_fallback.c so that it will no longer need to reach
into the internals of struct scatter_walk.

Cc: Boris Pismenny <borisp@nvidia.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
bb699e724f crypto: scatterwalk - add new functions for copying data
Add memcpy_from_sglist() and memcpy_to_sglist() which are more readable
versions of scatterwalk_map_and_copy() with the 'out' argument 0 and 1
respectively.  They follow the same argument order as memcpy_from_page()
and memcpy_to_page() from <linux/highmem.h>.  Note that in the case of
memcpy_from_sglist(), this also happens to be the same argument order
that scatterwalk_map_and_copy() uses.

The new code is also faster, mainly because it builds the scatter_walk
directly without creating a temporary scatterlist.  E.g., a 20%
performance improvement is seen for copying the AES-GCM auth tag.

Make scatterwalk_map_and_copy() be a wrapper around memcpy_from_sglist()
and memcpy_to_sglist().  Callers of scatterwalk_map_and_copy() should be
updated to call memcpy_from_sglist() or memcpy_to_sglist() directly, but
there are a lot of them so they aren't all being updated right away.

Also add functions memcpy_from_scatterwalk() and memcpy_to_scatterwalk()
which are similar but operate on a scatter_walk instead of a
scatterlist.  These will replace scatterwalk_copychunks() with the 'out'
argument 0 and 1 respectively.  Their behavior differs slightly from
scatterwalk_copychunks() in that they automatically take care of
flushing the dcache when needed, making them easier to use.

scatterwalk_copychunks() itself is left unchanged for now.  It will be
removed after its callers are updated to use other functions instead.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
31b00fe1e2 crypto: scatterwalk - add new functions for iterating through data
Add scatterwalk_next() which consolidates scatterwalk_clamp() and
scatterwalk_map().  Also add scatterwalk_done_src() and
scatterwalk_done_dst() which consolidate scatterwalk_unmap(),
scatterwalk_advance(), and scatterwalk_done() or scatterwalk_pagedone().
A later patch will remove scatterwalk_done() and scatterwalk_pagedone().

The new code eliminates the error-prone 'more' parameter.  Advancing to
the next sg entry now only happens just-in-time in scatterwalk_next().

The new code also pairs the dcache flush more closely with the actual
write, similar to memcpy_to_page().  Previously it was paired with
advancing to the next page.  This is currently causing bugs where the
dcache flush is incorrectly being skipped, usually due to
scatterwalk_copychunks() being called without a following
scatterwalk_done().  The dcache flush may have been placed where it was
in order to not call flush_dcache_page() redundantly when visiting a
page more than once.  However, that case is rare in practice, and most
architectures either do not implement flush_dcache_page() anyway or
implement it lazily where it just clears a page flag.

Another limitation of the old code was that by the time the flush
happened, there was no way to tell if more than one page needed to be
flushed.  That has been sufficient because the code goes page by page,
but I would like to optimize that on !HIGHMEM platforms.  The new code
makes this possible, and a later patch will implement this optimization.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
e21d01a2a3 crypto: scatterwalk - add new functions for skipping data
Add scatterwalk_skip() to skip the given number of bytes in a
scatter_walk.  Previously support for skipping was provided through
scatterwalk_copychunks(..., 2) followed by scatterwalk_done(), which was
confusing and less efficient.

Also add scatterwalk_start_at_pos() which starts a scatter_walk at the
given position, equivalent to scatterwalk_start() + scatterwalk_skip().
This addresses another common need in a more streamlined way.

Later patches will convert various users to use these functions.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Eric Biggers
3bd4b2c603 crypto: scatterwalk - move to next sg entry just in time
The scatterwalk_* functions are designed to advance to the next sg entry
only when there is more data from the request to process.  Compared to
the alternative of advancing after each step if !sg_is_last(sg), this
has the advantage that it doesn't cause problems if users accidentally
don't terminate their scatterlist with the end marker (which is an easy
mistake to make, and there are examples of this).

Currently, the advance to the next sg entry happens in
scatterwalk_done(), which is called after each "step" of the walk.  It
requires the caller to pass in a boolean 'more' that indicates whether
there is more data.  This works when the caller immediately knows
whether there is more data, though it adds some complexity.  However in
the case of scatterwalk_copychunks() it's not immediately known whether
there is more data, so the call to scatterwalk_done() has to happen
higher up the stack.  This is error-prone, and indeed the needed call to
scatterwalk_done() is not always made, e.g. scatterwalk_copychunks() is
sometimes called multiple times in a row.  This causes a zero-length
step to get added in some cases, which is unexpected and seems to work
only by accident.

This patch begins the switch to a less error-prone approach where the
advance to the next sg entry happens just in time instead.  For now,
that means just doing the advance in scatterwalk_clamp() if it's needed
there.  Initially this is redundant, but it's needed to keep the tree in
a working state as later patches change things to the final state.

Later patches will similarly move the dcache flushing logic out of
scatterwalk_done() and then remove scatterwalk_done() entirely.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Geert Uytterhoeven
2291399384 hwrng: Kconfig - Fix indentation of HW_RANDOM_CN10K help text
Change the indentation of the help text of the HW_RANDOM_CN10K symbol
from one TAB plus one space to one TAB plus two spaces, as is customary.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Arnd Bergmann
f307c87ea0 crypto: bpf - Add MODULE_DESCRIPTION for skcipher
All modules should have a description, building with extra warnings
enabled prints this outfor the for bpf_crypto_skcipher module:

WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/bpf_crypto_skcipher.o

Add a description line.

Fixes: fda4f71282 ("bpf: crypto: add skcipher to bpf crypto")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:19:43 +08:00
Herbert Xu
9e01aaa103 crypto: ahash - Set default reqsize from ahash_alg
Add a reqsize field to struct ahash_alg and use it to set the
default reqsize so that algorithms with a static reqsize are
not forced to create an init_tfm function.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 16:09:05 +08:00
Herbert Xu
439963cdc3 crypto: ahash - Add virtual address support
This patch adds virtual address support to ahash.  Virtual addresses
were previously only supported through shash.  The user may choose
to use virtual addresses with ahash by calling ahash_request_set_virt
instead of ahash_request_set_crypt.

The API will take care of translating this to an SG list if necessary,
unless the algorithm declares that it supports chaining.  Therefore
in order for an ahash algorithm to support chaining, it must also
support virtual addresses directly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 16:09:02 +08:00
Herbert Xu
c664f03417 crypto: tcrypt - Restore multibuffer ahash tests
This patch is a revert of commit 388ac25efc.

As multibuffer ahash is coming back in the form of request chaining,
restore the multibuffer ahash tests using the new interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 16:01:55 +08:00
Herbert Xu
f2ffe5a918 crypto: hash - Add request chaining API
This adds request chaining to the ahash interface.  Request chaining
allows multiple requests to be submitted in one shot.  An algorithm
can elect to receive chained requests by setting the flag
CRYPTO_ALG_REQ_CHAIN.  If this bit is not set, the API will break
up chained requests and submit them one-by-one.

A new err field is added to struct crypto_async_request to record
the return value for each individual request.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 16:01:53 +08:00
Herbert Xu
f407764621 crypto: x86/ghash - Use proper helpers to clone request
Rather than copying a request by hand with memcpy, use the correct
API helpers to setup the new request.  This will matter once the
API helpers start setting up chained requests as a simple memcpy
will break chaining.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 16:00:03 +08:00
Herbert Xu
075db21426 crypto: ahash - Only save callback and data in ahash_save_req
As unaligned operations are supported by the underlying algorithm,
ahash_save_req and ahash_restore_req can be greatly simplified to
only preserve the callback and data.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:59:59 +08:00
Christian Marangi
217460544a crypto: inside-secure/eip93 - Correctly handle return of for sg_nents_for_len
Fix smatch warning for sg_nents_for_len return value in Inside Secure
EIP93 driver.

The return value of sg_nents_for_len was assigned to an u32 and the
error was ignored and converted to a positive integer.

Rework the code to correctly handle the error from sg_nents_for_len to
mute smatch warning.

Fixes: 9739f5f93b ("crypto: eip93 - Add Inside Secure SafeXcel EIP-93 crypto engine support")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Herbert Xu
ee509efc74 crypto: skcipher - Zap type in crypto_alloc_sync_skcipher
The type needs to be zeroed as otherwise the user could use it to
allocate an asynchronous sync skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Małgorzata Mielnik
5ab6c06dff crypto: qat - refactor service parsing logic
The service parsing logic is used to parse the configuration string
provided by the user using the attribute qat/cfg_services in sysfs.
The logic relies on hard-coded strings. For example, the service
"sym;asym" is also replicated as "asym;sym".
This makes the addition of new services or service combinations
complex as it requires the addition of new hard-coded strings for all
possible combinations.

This commit addresses this issue by:
 * reducing the number of internal service strings to only the basic
   service representations.
 * modifying the service parsing logic to analyze the service string
   token by token instead of comparing a whole string with patterns.
 * introducing the concept of a service mask where each service is
   represented by a single bit.
 * dividing the parsing logic into several functions to allow for code
   reuse (e.g. by sysfs-related functions).
 * introducing a new, device generation-specific function to verify
   whether the requested service combination is supported by the
   currently used device.

Signed-off-by: Małgorzata Mielnik <malgorzata.mielnik@intel.com>
Co-developed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Giovanni Cabiddu
9057f824c1 crypto: qat - do not export adf_cfg_services
The symbol `adf_cfg_services` is only used on the intel_qat module.
There is no need to export it.

Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Herbert Xu
12a2b40d49 crypto: skcipher - Set tfm in SYNC_SKCIPHER_REQUEST_ON_STACK
Set the request tfm directly in SYNC_SKCIPHER_REQUEST_ON_STACK since
the tfm is already available.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Herbert Xu
7505436e29 crypto: api - Fix larval relookup type and mask
When the lookup is retried after instance construction, it uses
the type and mask from the larval, which may not match the values
used by the caller.  For example, if the caller is requesting for
a !NEEDS_FALLBACK algorithm, it may end up getting an algorithm
that needs fallbacks.

Fix this by making the caller supply the type/mask and using that
for the lookup.

Reported-by: Coiby Xu <coxu@redhat.com>
Fixes: 96ad595520 ("crypto: api - Remove instance larval fulfilment")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Abel Vesa
70c4a5c213 dt-bindings: crypto: qcom-qce: Document the X1E80100 crypto engine
Document the crypto engine on the X1E80100 Platform.

Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Herbert Xu
dcc47a028c crypto: null - Use spin lock instead of mutex
As the null algorithm may be freed in softirq context through
af_alg, use spin locks instead of mutexes to protect the default
null algorithm.

Reported-by: syzbot+b3e02953598f447d4d2a@syzkaller.appspotmail.com
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Herbert Xu
1047e21aec crypto: lib/Kconfig - Fix lib built-in failure when arch is modular
The HAVE_ARCH Kconfig options in lib/crypto try to solve the
modular versus built-in problem, but it still fails when the
the LIB option (e.g., CRYPTO_LIB_CURVE25519) is selected externally.

Fix this by introducing a level of indirection with ARCH_MAY_HAVE
Kconfig options, these then go on to select the ARCH_HAVE options
if the ARCH Kconfig options matches that of the LIB option.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501230223.ikroNDr1-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Giovanni Cabiddu
3af4e7fa26 crypto: qat - reorder objects in qat_common Makefile
The objects in the qat_common Makefile are currently listed in a random
order.

Reorder the objects alphabetically to make it easier to find where to
add a new object.

Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Ahsan Atta <ahsan.atta@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Giovanni Cabiddu
a4f95a2d28 crypto: qat - fix object goals in Makefiles
Align with kbuild documentation by using <module_name>-y instead of
<module_name>-objs, following the kernel convention for building modules
from multiple object files.

Link: https://docs.kernel.org/kbuild/makefiles.html#loadable-module-goals-obj-m
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Thorsten Blum
844c683d1f crypto: aead - use str_yes_no() helper in crypto_aead_show()
Remove hard-coded strings by using the str_yes_no() helper function.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Thorsten Blum
4f95a6d274 crypto: bcm - set memory to zero only once
Use kmalloc_array() instead of kcalloc() because sg_init_table() already
sets the memory to zero. This avoids zeroing the memory twice.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Eric Biggers
0926d8ee08 crypto: x86/aes-xts - change license to Apache-2.0 OR BSD-2-Clause
As with the other AES modes I've implemented, I've received interest in
my AES-XTS assembly code being reused in other projects.  Therefore,
change the license to Apache-2.0 OR BSD-2-Clause like what I used for
AES-GCM.  Apache-2.0 is the license of OpenSSL and BoringSSL.

Note that it is difficult to *directly* share code between the kernel,
OpenSSL, and BoringSSL for various reasons such as perlasm vs. plain
asm, Windows ABI support, different divisions of responsibility between
C and asm in each project, etc.  So whether that will happen instead of
just doing ports is still TBD.  But this dual license should at least
make it possible to port changes between the projects.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Eric Biggers
8c4fc9ce40 crypto: x86/aes-ctr - rewrite AESNI+AVX optimized CTR and add VAES support
Delete aes_ctrby8_avx-x86_64.S and add a new assembly file
aes-ctr-avx-x86_64.S which follows a similar approach to
aes-xts-avx-x86_64.S in that it uses a "template" to provide AESNI+AVX,
VAES+AVX2, VAES+AVX10/256, and VAES+AVX10/512 code, instead of just
AESNI+AVX.  Wire it up to the crypto API accordingly.

This greatly improves the performance of AES-CTR and AES-XCTR on
VAES-capable CPUs, with the best case being AMD Zen 5 where an over 230%
increase in throughput is seen on long messages.  Performance on
non-VAES-capable CPUs remains about the same, and the non-AVX AES-CTR
code (aesni_ctr_enc) is also kept as-is for now.  There are some slight
regressions (less than 10%) on some short message lengths on some CPUs;
these are difficult to avoid, given how the previous code was so heavily
unrolled by message length, and they are not particularly important.
Detailed performance results are given in the tables below.

Both CTR and XCTR support is retained.  The main loop remains
8-vector-wide, which differs from the 4-vector-wide main loops that are
used in the XTS and GCM code.  A wider loop is appropriate for CTR and
XCTR since they have fewer other instructions (such as vpclmulqdq) to
interleave with the AES instructions.

Similar to what was the case for AES-GCM, the new assembly code also has
a much smaller binary size, as it fixes the excessive unrolling by data
length and key length present in the old code.  Specifically, the new
assembly file compiles to about 9 KB of text vs. 28 KB for the old file.
This is despite 4x as many implementations being included.

The tables below show the detailed performance results.  The tables show
percentage improvement in single-threaded throughput for repeated
encryption of the given message length; an increase from 6000 MB/s to
12000 MB/s would be listed as 100%.  They were collected by directly
measuring the Linux crypto API performance using a custom kernel module.
The tested CPUs were all server processors from Google Compute Engine
except for Zen 5 which was a Ryzen 9 9950X desktop processor.

Table 1: AES-256-CTR throughput improvement,
         CPU microarchitecture vs. message length in bytes:

                     | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
---------------------+-------+-------+-------+-------+-------+-------+
AMD Zen 5            |  232% |  203% |  212% |  143% |   71% |   95% |
Intel Emerald Rapids |  116% |  116% |  117% |   91% |   78% |   79% |
Intel Ice Lake       |  109% |  103% |  107% |   81% |   54% |   56% |
AMD Zen 4            |  109% |   91% |  100% |   70% |   43% |   59% |
AMD Zen 3            |   92% |   78% |   87% |   57% |   32% |   43% |
AMD Zen 2            |    9% |    8% |   14% |   12% |    8% |   21% |
Intel Skylake        |    7% |    7% |    8% |    5% |    3% |    8% |

                     |   300 |   200 |    64 |    63 |    16 |
---------------------+-------+-------+-------+-------+-------+
AMD Zen 5            |   57% |   39% |   -9% |    7% |   -7% |
Intel Emerald Rapids |   37% |   42% |   -0% |   13% |   -8% |
Intel Ice Lake       |   39% |   30% |   -1% |   14% |   -9% |
AMD Zen 4            |   42% |   38% |   -0% |   18% |   -3% |
AMD Zen 3            |   38% |   35% |    6% |   31% |    5% |
AMD Zen 2            |   24% |   23% |    5% |   30% |    3% |
Intel Skylake        |    9% |    1% |   -4% |   10% |   -7% |

Table 2: AES-256-XCTR throughput improvement,
         CPU microarchitecture vs. message length in bytes:

                     | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
---------------------+-------+-------+-------+-------+-------+-------+
AMD Zen 5            |  240% |  201% |  216% |  151% |   75% |  108% |
Intel Emerald Rapids |  100% |   99% |  102% |   91% |   94% |  104% |
Intel Ice Lake       |   93% |   89% |   92% |   74% |   50% |   64% |
AMD Zen 4            |   86% |   75% |   83% |   60% |   41% |   52% |
AMD Zen 3            |   73% |   63% |   69% |   45% |   21% |   33% |
AMD Zen 2            |   -2% |   -2% |    2% |    3% |   -1% |   11% |
Intel Skylake        |   -1% |   -1% |    1% |    2% |   -1% |    9% |

                     |   300 |   200 |    64 |    63 |    16 |
---------------------+-------+-------+-------+-------+-------+
AMD Zen 5            |   78% |   56% |   -4% |   38% |   -2% |
Intel Emerald Rapids |   61% |   55% |    4% |   32% |   -5% |
Intel Ice Lake       |   57% |   42% |    3% |   44% |   -4% |
AMD Zen 4            |   35% |   28% |   -1% |   17% |   -3% |
AMD Zen 3            |   26% |   23% |   -3% |   11% |   -6% |
AMD Zen 2            |   13% |   24% |   -1% |   14% |   -3% |
Intel Skylake        |   16% |    8% |   -4% |   35% |   -3% |

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Thorsten Blum
77cb2f63ad crypto: ahash - use str_yes_no() helper in crypto_ahash_show()
Remove hard-coded strings by using the str_yes_no() helper function.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:03 +08:00
Herbert Xu
ea6f861a3c crypto: inside-secure - Eliminate duplication in top-level Makefile
Instead of having two entries for inside-secure in the top-level
Makefile, make it just a single one.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Devaraj Rangasamy
6cb345939b crypto: ccp - Add support for PCI device 0x1134
PCI device 0x1134 shares same register features as PCI device 0x17E0.
Hence reuse same data for the new PCI device ID 0x1134.

Signed-off-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Wenkai Lin
f4f353cb7a crypto: hisilicon/sec2 - fix for sec spec check
During encryption and decryption, user requests
must be checked first, if the specifications that
are not supported by the hardware are used, the
software computing is used for processing.

Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Wenkai Lin
a49cc71e21 crypto: hisilicon/sec2 - fix for aead authsize alignment
The hardware only supports authentication sizes
that are 4-byte aligned. Therefore, the driver
switches to software computation in this case.

Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Wenkai Lin
1b284ffc30 crypto: hisilicon/sec2 - fix for aead auth key length
According to the HMAC RFC, the authentication key
can be 0 bytes, and the hardware can handle this
scenario. Therefore, remove the incorrect validation
for this case.

Fixes: 2f072d75d1 ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Nicolas Frattaroli
d454874773 MAINTAINERS: add Nicolas Frattaroli to rockchip-rng maintainers
I maintain the rockchip,rk3588-rng bindings, and I guess also the part
of the driver that implements support for it. Therefore, add me to the
MAINTAINERS for this driver and these bindings.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Nicolas Frattaroli
8eff8eb83f hwrng: rockchip - add support for rk3588's standalone TRNG
The RK3588 SoC includes several TRNGs, one part of the Crypto IP block,
and the other one (referred to as "trngv1") as a standalone new IP.

Add support for this new standalone TRNG to the driver by both
generalising it to support multiple different rockchip RNGs and then
implementing the required functionality for the new hardware.

This work was partly based on the downstream vendor driver by Rockchip's
Lin Jinhan, which is why they are listed as a Co-author.

While the hardware does support notifying the CPU with an IRQ when the
random data is ready, I've discovered while implementing the code to use
this interrupt that this results in significantly slower throughput of
the TRNG even when under heavy CPU load. I assume this is because with
only 32 bytes of data per invocation, the overhead of reinitialising a
completion, enabling the interrupt, sleeping and then triggering the
completion in the IRQ handler is way more expensive than busylooping.

Speaking of busylooping, the poll interval for reading the ISTAT is an
atomic read with a delay of 0. In my testing, I've found that this gives
us the largest throughput, and it appears the random data is ready
pretty much the moment we begin polling, as increasing the poll delay
leads to a drop in throughput significant enough to not just be due to
the poll interval missing the ideal timing by a microsecond or two.

According to downstream, the IP should take 1024 clock cycles to
generate 56 bits of random data, which at 150MHz should work out to
6.8us. I did not test whether the data really does take 256/56*6.8us
to arrive, though changing the readl to a __raw_readl makes no
difference in throughput, and this data does pass the rngtest FIPS
checks, so I'm not entirely sure what's going on but I presume it's got
something to do with the AHB bus speed and the memory barriers that
mainline's readl/writel functions insert.

The only other current SoC that uses this new IP is the Rockchip RV1106,
but that SoC does not have mainline support as of the time of writing,
so we make no effort to declare it as supported for now.

Co-developed-by: Lin Jinhan <troy.lin@rock-chips.com>
Signed-off-by: Lin Jinhan <troy.lin@rock-chips.com>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Nicolas Frattaroli
24aaa42ed6 hwrng: rockchip - eliminate some unnecessary dereferences
Despite assigning a temporary variable the value of &pdev->dev early on
in the probe function, the probe function then continues to use this
construct when it could just use the local dev variable instead.

Simplify this by using the local dev variable directly.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Nicolas Frattaroli
8bb8609293 hwrng: rockchip - store dev pointer in driver struct
The rockchip rng driver does a dance to store the dev pointer in the
hwrng's unsigned long "priv" member. However, since the struct hwrng
member of rk_rng is not a pointer, we can use container_of to get the
struct rk_rng instance from just the struct hwrng*, which means we don't
have to subvert what little there is in C of a type system and can
instead store a pointer to the device struct in the rk_rng itself.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Nicolas Frattaroli
e00fc3d6e7 dt-bindings: rng: add binding for Rockchip RK3588 RNG
The Rockchip RK3588 SoC has two hardware RNGs accessible to the
non-secure world: an RNG in the Crypto IP, and a standalone RNG that is
new to this SoC.

Add a binding for this new standalone RNG. It is distinct hardware from
the existing rockchip,rk3568-rng, and therefore gets its own binding as
the two hardware IPs are unrelated other than both being made by the
same vendor.

The RNG is capable of firing an interrupt when entropy is ready.

The reset is optional, as the hardware does a power-on reset, and
functions without the software manually resetting it.

Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Nicolas Frattaroli
849d9db170 dt-bindings: reset: Add SCMI reset IDs for RK3588
When TF-A is used to assert/deassert the resets through SCMI, the
IDs communicated to it are different than the ones mainline Linux uses.

Import the list of SCMI reset IDs from mainline TF-A so that devicetrees
can use these IDs more easily.

Co-developed-by: XiaoDong Huang <derrick.huang@rock-chips.com>
Signed-off-by: XiaoDong Huang <derrick.huang@rock-chips.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00
Lukas Wunner
62d027fb49 crypto: virtio - Drop superfluous [as]kcipher_req pointer
The request context virtio_crypto_{akcipher,sym}_request contains a
pointer to the [as]kcipher_request itself.

The pointer is superfluous as it can be calculated with container_of().

Drop the superfluous pointer.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-02-22 15:56:02 +08:00