Commit Graph

10251 Commits

Linus Torvalds
6f46e6fb4e Merge tag 'linux_kselftest-kunit-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
 "Correct MODULE_IMPORT_NS() syntax documentation, make kunit_test
  timeout configurable via a module parameter and a Kconfig option, fix
  longest symbol length test, add a test for static stub, and adjust
  kunit_test timeout based on test_{suite,case} speed"

* tag 'linux_kselftest-kunit-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: fix longest symbol length test
  kunit: Make default kunit_test timeout configurable via both a module parameter and a Kconfig option
  kunit: Adjust kunit_test timeout based on test_{suite,case} speed
  kunit: Add test for static stub
  Documentation: kunit: Correct MODULE_IMPORT_NS() syntax
2025-07-29 12:43:10 -07:00
Linus Torvalds
0d5ec7919f Merge tag 'char-misc-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / IIO / other driver updates from Greg KH:
 "Here is the big set of char/misc/iio and other smaller driver
  subsystems for 6.17-rc1. It's a big set this time around, with the
  huge majority being in the iio subsystem with new drivers and dts
  files being added there.

  Highlights include:
   - IIO driver updates, additions, and changes making more code const
     and cleaning up some init logic
   - bus_type constant conversion changes
   - misc device test functions added
   - rust miscdevice minor fixup
   - unused function removals for some drivers
   - mei driver updates
   - mhi driver updates
   - interconnect driver updates
   - Android binder updates and test infrastructure added
   - small cdx driver updates
   - small comedi fixes
   - small nvmem driver updates
   - small pps driver updates
   - some acrn virt driver fixes for printk messages
   - other small driver updates

  All of these have been in linux-next with no reported issues"

* tag 'char-misc-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits)
  binder: Use seq_buf in binder_alloc kunit tests
  binder: Add copyright notice to new kunit files
  misc: ti_fpc202: Switch to of_fwnode_handle()
  bus: moxtet: Use dev_fwnode()
  pc104: move PC104 option to drivers/Kconfig
  drivers: virt: acrn: Don't use %pK through printk
  comedi: fix race between polling and detaching
  interconnect: qcom: Add Milos interconnect provider driver
  dt-bindings: interconnect: document the RPMh Network-On-Chip Interconnect in Qualcomm Milos SoC
  mei: more prints with client prefix
  mei: bus: use cldev in prints
  bus: mhi: host: pci_generic: Add Telit FN990B40 modem support
  bus: mhi: host: Detect events pointing to unexpected TREs
  bus: mhi: host: pci_generic: Add Foxconn T99W696 modem
  bus: mhi: host: Use str_true_false() helper
  bus: mhi: host: pci_generic: Add support for EM929x and set MRU to 32768 for better performance.
  bus: mhi: host: Fix endianness of BHI vector table
  bus: mhi: host: pci_generic: Disable runtime PM for QDU100
  bus: mhi: host: pci_generic: Fix the modem name of Foxconn T99W640
  dt-bindings: interconnect: qcom,msm8998-bwmon: Allow 'nonposted-mmio'
  ...
2025-07-29 09:52:01 -07:00
Linus Torvalds
f2f573ebd4 Merge tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library test updates from Eric Biggers:
 "Add KUnit test suites for the Poly1305, SHA-1, SHA-224, SHA-256,
  SHA-384, and SHA-512 library functions.

  These are the first KUnit tests for lib/crypto/. So in addition to
  being useful tests for these specific algorithms, they also establish
  some conventions for lib/crypto/ testing going forwards.

  The new tests are fairly comprehensive: more comprehensive than the
  generic crypto infrastructure's tests. They use a variety of
  techniques to check for the types of implementation bugs that tend to
  occur in the real world, rather than just naively checking some test
  vectors. (Interestingly, poly1305_kunit found a bug in QEMU)

  The core test logic is shared by all six algorithms, rather than being
  duplicated for each algorithm.

  Each algorithm's test suite also optionally includes a benchmark"

* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: tests: Annotate worker to be on stack
  lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1
  lib/crypto: tests: Add KUnit tests for Poly1305
  lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512
  lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256
  lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py
2025-07-28 18:02:58 -07:00
Linus Torvalds
13150742b0 Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
 "This is the main crypto library pull request for 6.17. The main focus
  this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
  high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
  and establishing conventions for lib/crypto/ going forward:

   - Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
     most of the SHA-512 code) into lib/crypto/. This includes both the
     generic and architecture-optimized code. Greatly simplify how the
     architecture-optimized code is integrated. Add an easy-to-use
     library API for each SHA variant, including HMAC support. Finally,
     reimplement the crypto_shash support on top of the library API.

   - Apply the same reorganization to the SHA-256 code (and also SHA-224
     which shares most of the SHA-256 code). This is a somewhat smaller
     change, due to my earlier work on SHA-256. But this brings in all
     the same additional improvements that I made for SHA-1 and SHA-512.

  There are also some smaller changes:

   - Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
     from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
     these algorithms it's just a move, not a full reorganization yet.

   - Fix the MIPS chacha-core.S to build with the clang assembler.

   - Fix the Poly1305 functions to work in all contexts.

   - Fix a performance regression in the x86_64 Poly1305 code.

   - Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.

  Note that since the new organization of the SHA code is much simpler,
  the diffstat of this pull request is negative, despite the addition of
  new fully-documented library APIs for multiple SHA and HMAC-SHA
  variants.

  These APIs will allow further simplifications across the kernel as
  users start using them instead of the old-school crypto API. (I've
  already written a lot of such conversion patches, removing over 1000
  more lines of code. But most of those will target 6.18 or later)"

* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits)
  lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
  lib/crypto: x86/sha1-ni: Convert to use rounds macros
  lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
  crypto: sha1 - Remove sha1_base.h
  lib/crypto: x86/sha1: Migrate optimized code into library
  lib/crypto: sparc/sha1: Migrate optimized code into library
  lib/crypto: s390/sha1: Migrate optimized code into library
  lib/crypto: powerpc/sha1: Migrate optimized code into library
  lib/crypto: mips/sha1: Migrate optimized code into library
  lib/crypto: arm64/sha1: Migrate optimized code into library
  lib/crypto: arm/sha1: Migrate optimized code into library
  crypto: sha1 - Use same state format as legacy drivers
  crypto: sha1 - Wrap library and add HMAC support
  lib/crypto: sha1: Add HMAC support
  lib/crypto: sha1: Add SHA-1 library functions
  lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
  crypto: x86/sha1 - Rename conflicting symbol
  lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
  lib/crypto: arm/poly1305: Remove unneeded empty weak function
  lib/crypto: x86/poly1305: Fix performance regression on short messages
  ...
2025-07-28 17:58:52 -07:00
Linus Torvalds
a578dd095d Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:

 - Reorganize the architecture-optimized CRC code

   It now lives in lib/crc/$(SRCARCH)/ rather than arch/$(SRCARCH)/lib/,
   and it is no longer artificially split into separate generic and arch
   modules. This allows better inlining and dead code elimination

   The generic CRC code is also no longer exported, simplifying the API.
   (This mirrors the similar changes to SHA-1 and SHA-2 in lib/crypto/,
   which can be found in the "Crypto library updates" pull request)

 - Improve crc32c() performance on newer x86_64 CPUs on long messages by
   enabling the VPCLMULQDQ optimized code

 - Simplify the crypto_shash wrappers for crc32_le() and crc32c()

   Register just one shash algorithm for each that uses the (fully
   optimized) library functions, instead of unnecessarily providing
   direct access to the generic CRC code

 - Remove unused and obsolete drivers for hardware CRC engines

 - Remove CRC-32 combination functions that are no longer used

 - Add kerneldoc for crc32_le(), crc32_be(), and crc32c()

 - Convert the crc32() macro to an inline function
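
For illustration, the macro-to-inline conversion is essentially of this
shape (a sketch based on the summary above; the in-tree definition may
differ in details):

    static inline u32 crc32(u32 crc, const void *p, size_t len)
    {
            /* crc32() is the little-endian variant; forward to
               crc32_le() without the old macro's pointer cast. */
            return crc32_le(crc, p, len);
    }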

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (26 commits)
  lib/crc: x86/crc32c: Enable VPCLMULQDQ optimization where beneficial
  lib/crc: x86: Reorganize crc-pclmul static_call initialization
  lib/crc: crc64: Add include/linux/crc64.h to kernel-api.rst
  lib/crc: crc32: Change crc32() from macro to inline function and remove cast
  nvmem: layouts: Switch from crc32() to crc32_le()
  lib/crc: crc32: Document crc32_le(), crc32_be(), and crc32c()
  lib/crc: Explicitly include <linux/export.h>
  lib/crc: Remove ARCH_HAS_* kconfig symbols
  lib/crc: x86: Migrate optimized CRC code into lib/crc/
  lib/crc: sparc: Migrate optimized CRC code into lib/crc/
  lib/crc: s390: Migrate optimized CRC code into lib/crc/
  lib/crc: riscv: Migrate optimized CRC code into lib/crc/
  lib/crc: powerpc: Migrate optimized CRC code into lib/crc/
  lib/crc: mips: Migrate optimized CRC code into lib/crc/
  lib/crc: loongarch: Migrate optimized CRC code into lib/crc/
  lib/crc: arm64: Migrate optimized CRC code into lib/crc/
  lib/crc: arm: Migrate optimized CRC code into lib/crc/
  lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/
  lib/crc: Move files into lib/crc/
  lib/crc32: Remove unused combination support
  ...
2025-07-28 17:43:29 -07:00
Linus Torvalds
8e736a2eea Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:

 - Introduce and start using TRAILING_OVERLAP() helper for fixing
   embedded flex array instances (Gustavo A. R. Silva)

 - mux: Convert mux_control_ops to a flex array member in mux_chip
   (Thorsten Blum)

 - string: Group str_has_prefix() and strstarts() (Andy Shevchenko)

 - Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
   Kees Cook)

 - Refactor and rename stackleak feature to support Clang

 - Add KUnit test for seq_buf API

 - Fix KUnit fortify test under LTO

* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  sched/task_stack: Add missing const qualifier to end_of_stack()
  kstack_erase: Support Clang stack depth tracking
  kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
  init.h: Disable sanitizer coverage for __init and __head
  kstack_erase: Disable kstack_erase for all of arm compressed boot code
  x86: Handle KCOV __init vs inline mismatches
  arm64: Handle KCOV __init vs inline mismatches
  s390: Handle KCOV __init vs inline mismatches
  arm: Handle KCOV __init vs inline mismatches
  mips: Handle KCOV __init vs inline mismatch
  powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
  configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
  configs/hardening: Enable CONFIG_KSTACK_ERASE
  stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
  stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
  stackleak: Rename STACKLEAK to KSTACK_ERASE
  seq_buf: Introduce KUnit tests
  string: Group str_has_prefix() and strstarts()
  kunit/fortify: Add back "volatile" for sizeof() constants
  acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
  ...
2025-07-28 17:16:12 -07:00
Linus Torvalds
6e11664f14 Merge tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:

 - MD pull request via Yu:
      - call del_gendisk synchronously (Xiao)
      - cleanup unused variable (John)
      - cleanup workqueue flags (Ryo)
      - fix faulty rdev can't be removed during resync (Qixing)

 - NVMe pull request via Christoph:
      - try PCIe function level reset on init failure (Keith Busch)
      - log TLS handshake failures at error level (Maurizio Lombardi)
      - pci-epf: do not complete commands twice if nvmet_req_init()
        fails (Rick Wertenbroek)
      - misc cleanups (Alok Tiwari)

 - Removal of the pktcdvd driver

   This has been more than a decade coming at this point, and some
   recently revealed breakages, which had it causing issues even in cases
   where it isn't required, made me re-pull the trigger on this one. It's
   known to be broken and nobody has stepped up to maintain the code

 - Series for ublk supporting batch commands, enabling the use of
   multishot where appropriate

 - Speed up ublk exit handling

 - Fix for the two-stage elevator switch, which could leak data

 - Convert NVMe to use the new IOVA based API

 - Increase default max transfer size to something more reasonable

 - Series fixing write operations on zoned DM devices

 - Add tracepoints for zoned block device operations

 - Prep series working towards improving blk-mq queue management in the
   presence of isolated CPUs

 - Don't allow updating of the block size of a loop device that is
   currently under exclusive ownership/open

 - Set chunk sectors from stacked device stripe size and use it for the
   atomic write size limit

 - Switch to folios in bcache read_super()

 - Fix for CD-ROM MRW exit flush handling

 - Various tweaks, fixes, and cleanups

* tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux: (94 commits)
  block: restore two stage elevator switch while running nr_hw_queue update
  cdrom: Call cdrom_mrw_exit from cdrom_release function
  sunvdc: Balance device refcount in vdc_port_mpgroup_check
  nvme-pci: try function level reset on init failure
  dm: split write BIOs on zone boundaries when zone append is not emulated
  block: use chunk_sectors when evaluating stacked atomic write limits
  dm-stripe: limit chunk_sectors to the stripe size
  md/raid10: set chunk_sectors limit
  md/raid0: set chunk_sectors limit
  block: sanitize chunk_sectors for atomic write limits
  ilog2: add max_pow_of_two_factor()
  nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails
  nvme-tcp: log TLS handshake failures at error level
  docs: nvme: fix grammar in nvme-pci-endpoint-target.rst
  nvme: fix typo in status code constant for self-test in progress
  nvmet: remove redundant assignment of error code in nvmet_ns_enable()
  nvme: fix incorrect variable in io cqes error message
  nvme: fix multiple spelling and grammar issues in host drivers
  block: fix blk_zone_append_update_request_bio() kernel-doc
  md/raid10: fix set but not used variable in sync_request_write()
  ...
2025-07-28 16:43:54 -07:00
Joel Granados
88eddb0502 uevent: mv uevent_helper into kobject_uevent.c
Move the uevent_helper table into lib/kobject_uevent.c. Place the
registration early in the initcall order with postcore_initcall.

This is part of a greater effort to move ctl tables into their
respective subsystems which will reduce the merge conflicts in
kernel/sysctl.c.

Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23 11:56:02 +02:00
Kees Cook
57fbad15c2 stackleak: Rename STACKLEAK to KSTACK_ERASE
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:

- Add the new top-level CONFIG_KSTACK_ERASE option which will be
  implemented either with the stackleak GCC plugin, or with the Clang
  stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
  but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
  for what it does rather than what it protects against), but leave as
  many of the internals alone as possible to avoid even more churn.

While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:35:01 -07:00
Guenter Roeck
8cd876e783 lib/crypto: tests: Annotate worker to be on stack
The following warning traceback is seen if object debugging is enabled
with the new crypto test code.

ODEBUG: object 9000000106237c50 is on stack 9000000106234000, but NOT annotated.
------------[ cut here ]------------
WARNING: lib/debugobjects.c:655 at lookup_object_or_alloc.part.0+0x19c/0x1f4, CPU#0: kunit_try_catch/468
...

This also results in a boot stall when running the code in qemu:loongarch.

Initializing the worker with INIT_WORK_ONSTACK() fixes the problem.
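
A minimal sketch of the pattern, using the standard workqueue API (the
function and test names here are illustrative):

    #include <linux/workqueue.h>

    static void hash_test_func(struct work_struct *work)
    {
            /* ... run the hash computation under test ... */
    }

    static void run_test_worker(void)
    {
            struct work_struct work;        /* lives on the stack */

            /* INIT_WORK_ONSTACK() annotates the object for debugobjects;
             * plain INIT_WORK() on a stack object triggers the warning
             * shown above. */
            INIT_WORK_ONSTACK(&work, hash_test_func);
            schedule_work(&work);
            flush_work(&work);
            destroy_work_on_stack(&work);   /* required counterpart */
    }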

Fixes: 950a81224e ("lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250721231917.3182029-1-linux@roeck-us.net
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-21 20:10:36 -07:00
Eric Biggers
debc1e5a43 lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
Now that the oldest supported binutils version is 2.30, the macros that
emit the SHA-512 instructions as '.inst' words are no longer needed.  So
drop them.  No change in the generated machine code.

Changed from the original patch by Ard Biesheuvel
(https://lore.kernel.org/r/20250515142702.2592942-2-ardb+git@google.com):
 - Reduced scope to just SHA-512
 - Added comment that explains why "sha3" is used instead of "sha2"

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718220706.475240-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-20 21:43:27 -07:00
Eric Biggers
42e3376e09 lib/crypto: x86/sha1-ni: Convert to use rounds macros
The assembly code that does all 80 rounds of SHA-1 is highly repetitive.
Replace it with 20 expansions of a macro that does 4 rounds, using the
macro arguments and .if directives to handle the slight variations
between rounds.  This reduces the length of sha1-ni-asm.S by 129 lines
while still producing the exact same object file.  This mirrors
sha256-ni-asm.S which uses this same strategy.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-20 21:42:42 -07:00
Eric Biggers
f88ed14aa0 lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
- Store the previous state in %xmm8-%xmm9 instead of spilling it to the
  stack.  There are plenty of unused XMM registers here, so there is no
  reason to spill to the stack.  (While 32-bit code is limited to
  %xmm0-%xmm7, this is 64-bit code, so it's free to use %xmm8-%xmm15.)

- Remove the unnecessary check for nblocks == 0.  sha1_ni_transform() is
  always passed a positive nblocks.

- To get an XMM register with 'e' in the high dword and the rest zeroes,
  just zeroize the register using pxor, then load 'e'.  Previously the
  code loaded 'e', then zeroized the lower dwords by AND-ing with a
  constant, which was slightly less efficient.

- Instead of computing &DATA_PTR[NBLOCKS << 6] and stopping when
  DATA_PTR reaches that value, instead just decrement NBLOCKS on each
  iteration and stop when it reaches 0.  This is fewer instructions.

- Rename DIGEST_PTR to STATE_PTR.  It points to the SHA-1 internal
  state, not a SHA-1 digest value.

This commit shrinks the code size of sha1_ni_transform() from 624 bytes
to 589 bytes and also shrinks rodata by 16 bytes.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-20 21:42:34 -07:00
Eric Biggers
118da22eb6 lib/crc: x86/crc32c: Enable VPCLMULQDQ optimization where beneficial
Improve crc32c() performance on lengths >= 512 bytes by using
crc32_lsb_vpclmul_avx512() instead of crc32c_x86_3way(), when the CPU
supports VPCLMULQDQ and has a "good" implementation of AVX-512.  For now
that means AMD Zen 4 and later, and Intel Sapphire Rapids and later.
Pass crc32_lsb_vpclmul_avx512() the table of constants needed to make it
use the CRC-32C polynomial.

Rationale: VPCLMULQDQ performance has improved on newer CPUs, making
crc32_lsb_vpclmul_avx512() faster than crc32c_x86_3way(), even though
crc32_lsb_vpclmul_avx512() is designed for generic 32-bit CRCs and does
not utilize x86_64's dedicated CRC-32C instructions.

Performance results for len=4096 using crc_kunit:

    CPU                        Before (MB/s)     After (MB/s)
    ======================     =============     ============
    AMD Zen 4 (Genoa)                  19868            28618
    AMD Zen 5 (Ryzen AI 9 365)         24080            46940
    AMD Zen 5 (Turin)                  29566            58468
    Intel Sapphire Rapids              22340            73794
    Intel Emerald Rapids               24696            78666

Performance results for len=512 using crc_kunit:

    CPU                        Before (MB/s)     After (MB/s)
    ======================     =============     ============
    AMD Zen 4 (Genoa)                   7251             7758
    AMD Zen 5 (Ryzen AI 9 365)         17481            19135
    AMD Zen 5 (Turin)                  21332            25424
    Intel Sapphire Rapids              18886            29312
    Intel Emerald Rapids               19675            29045

That being said, in the above benchmarks the ZMM registers are "warm",
so they don't quite tell the whole story.  While significantly improved
from older Intel CPUs, Intel still has ~2000 ns of ZMM warm-up time
where 512-bit instructions execute 4 times more slowly than they
normally do.  In contrast, AMD does better and has virtually zero ZMM
warm-up time (at most ~60 ns).  Thus, while this change is always
beneficial on AMD, strictly speaking there are cases in which it is not
beneficial on Intel, e.g. a small number of 512-byte messages with
"cold" ZMM registers.  But typically, it is beneficial even on Intel.

Note that on AMD Zen 3--5, crc32c() performance could be further
improved with implementations that interleave crc32q and VPCLMULQDQ
instructions.  Unfortunately, it appears that a different such
implementation would be optimal on *each* of these microarchitectures.
Such improvements are left for future work.  This commit just improves
the way that we choose the implementations we already have.
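
The resulting selection logic is roughly of this shape (a sketch: the
static key and constants-table names are assumptions, while the two
function names appear above):

    static u32 crc32c_arch_sketch(u32 crc, const u8 *p, size_t len)
    {
            /* Use the VPCLMULQDQ code only for long messages and only on
             * CPUs whose AVX-512 implementation is known to be "good". */
            if (len >= 512 && static_branch_likely(&have_good_avx512))
                    return crc32_lsb_vpclmul_avx512(crc, p, len,
                                                    &crc32c_consts);
            return crc32c_x86_3way(crc, p, len);
    }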

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250719224938.126512-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-20 20:52:34 -07:00
Eric Biggers
110628e55a lib/crc: x86: Reorganize crc-pclmul static_call initialization
Reorganize the crc-pclmul static_call initialization to place more of
the logic in the *_mod_init_arch() functions instead of in the
INIT_CRC_PCLMUL macro.  This provides the flexibility to do more than a
single static_call update for each CPU feature check.  Right away,
optimize crc64_mod_init_arch() to check the CPU features just once
instead of twice, doing both the crc64_msb and crc64_lsb static_call
updates together.  A later commit will also use this to initialize an
additional static_key when crc32_lsb_vpclmul_avx512() is enabled.
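
A hedged sketch of the reorganized shape; static_call_update() and the
single feature check follow the description above, while the CRC-64
function names are illustrative stand-ins:

    u64 crc64_msb_generic(u64 crc, const void *p, size_t len); /* stand-ins */
    u64 crc64_msb_pclmul(u64 crc, const void *p, size_t len);
    u64 crc64_lsb_generic(u64 crc, const void *p, size_t len);
    u64 crc64_lsb_pclmul(u64 crc, const void *p, size_t len);

    DEFINE_STATIC_CALL(crc64_msb_arch, crc64_msb_generic);
    DEFINE_STATIC_CALL(crc64_lsb_arch, crc64_lsb_generic);

    static void __init crc64_mod_init_arch(void)
    {
            /* Check the CPU features once, then do both updates. */
            if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) {
                    static_call_update(crc64_msb_arch, crc64_msb_pclmul);
                    static_call_update(crc64_lsb_arch, crc64_lsb_pclmul);
            }
    }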

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250719224938.126512-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-20 20:52:28 -07:00
Kees Cook
fc07839203 seq_buf: Introduce KUnit tests
Add KUnit tests for the seq_buf API to ensure its correctness and
prevent future regressions, covering the following functions:
- seq_buf_init()
- DECLARE_SEQ_BUF()
- seq_buf_clear()
- seq_buf_puts()
- seq_buf_putc()
- seq_buf_printf()
- seq_buf_get_buf()
- seq_buf_commit()

$ tools/testing/kunit/kunit.py run seq_buf
=================== seq_buf (9 subtests) ===================
[PASSED] seq_buf_init_test
[PASSED] seq_buf_declare_test
[PASSED] seq_buf_clear_test
[PASSED] seq_buf_puts_test
[PASSED] seq_buf_puts_overflow_test
[PASSED] seq_buf_putc_test
[PASSED] seq_buf_printf_test
[PASSED] seq_buf_printf_overflow_test
[PASSED] seq_buf_get_buf_commit_test
===================== [PASSED] seq_buf =====================

Link: https://lore.kernel.org/r/20250717085156.work.363-kees@kernel.org
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-19 23:03:24 -07:00
Herbert Xu
a9ed4422ad lib/raid6: update recov_rvv.c zero page usage
Update lib/raid6/recov_rvv.c to match commit 1857fcc847 ("lib/raid6:
replace custom zero page with ZERO_PAGE"), as reported by Klara.

Link: https://lkml.kernel.org/r/aFkUnXWtxcgOTVkw@gondor.apana.org.au
Fixes: 1857fcc847 ("lib/raid6: replace custom zero page with ZERO_PAGE")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Song Liu <song@kernel.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 19:08:29 -07:00
Kuan-Wei Chiu
b3d5fd6f82 lib/math/gcd: use static key to select implementation at runtime
Patch series "Optimize GCD performance on RISC-V by selecting
implementation at runtime", v3.

The current implementation of gcd() selects between the binary GCD and the
odd-even GCD algorithm at compile time, depending on whether
CONFIG_CPU_NO_EFFICIENT_FFS is set.  On platforms like RISC-V, however,
this compile-time decision can be misleading: even when the compiler emits
ctz instructions based on the assumption that they are efficient (as is
the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual hardware may
lack support for the Zbb extension.  In such cases, ffs() falls back to a
software implementation at runtime, making the binary GCD algorithm
significantly slower than the odd-even variant.

To address this, we introduce a static key to allow runtime selection
between the binary and odd-even GCD implementations.  On RISC-V, the
kernel now checks for Zbb support during boot.  If Zbb is unavailable, the
static key is disabled so that gcd() consistently uses the more efficient
odd-even algorithm in that scenario.  Additionally, to further reduce code
size, we select CONFIG_CPU_NO_EFFICIENT_FFS automatically when
CONFIG_RISCV_ISA_ZBB is not enabled, avoiding compilation of the unused
binary GCD implementation entirely on systems where it would never be
executed.

This series ensures that the most efficient GCD algorithm is used in
practice and avoids compiling unnecessary code based on hardware
capabilities and kernel configuration.


This patch (of 3):

On platforms like RISC-V, the compiler may generate hardware FFS
instructions even if the underlying CPU does not actually support them. 
Currently, the GCD implementation is chosen at compile time based on
CONFIG_CPU_NO_EFFICIENT_FFS, which can result in suboptimal behavior on
such systems.

Introduce a static key, efficient_ffs_key, to enable runtime selection
between the binary GCD (using ffs) and the odd-even GCD implementation. 
This allows the kernel to default to the faster binary GCD when FFS is
efficient, while retaining the ability to fall back when needed.
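
A minimal sketch of the mechanism; efficient_ffs_key is the key named
above, while binary_gcd() and odd_even_gcd() are illustrative stand-ins
for the two existing code paths:

    unsigned long binary_gcd(unsigned long a, unsigned long b); /* stand-ins */
    unsigned long odd_even_gcd(unsigned long a, unsigned long b);

    DEFINE_STATIC_KEY_TRUE(efficient_ffs_key);

    unsigned long gcd(unsigned long a, unsigned long b)
    {
            /* Default to the binary GCD, which relies on a fast ffs(). */
            if (static_branch_likely(&efficient_ffs_key))
                    return binary_gcd(a, b);
            /* Fall back when ffs() is emulated in software. */
            return odd_even_gcd(a, b);
    }

    /* RISC-V boot code would disable the key when Zbb is absent:
     *      static_branch_disable(&efficient_ffs_key);
     */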

Link: https://lkml.kernel.org/r/20250606134758.1308400-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20250606134758.1308400-2-visitorckw@gmail.com
Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 19:08:28 -07:00
Feng Tang
d747755917 panic: add 'panic_sys_info' sysctl to take human readable string parameter
The bitmap definition for 'panic_print' is hard to remember and decode.
Add a 'panic_sys_info=' sysctl that takes a human-readable string like
"tasks,mem,timers,locks,ftrace,..." and translates it into the bitmap.

The detailed mapping is:
	SYS_INFO_TASKS		"tasks"
	SYS_INFO_MEM		"mem"
	SYS_INFO_TIMERS		"timers"
	SYS_INFO_LOCKS		"locks"
	SYS_INFO_FTRACE		"ftrace"
	SYS_INFO_ALL_CPU_BT	"all_bt"
	SYS_INFO_BLOCKED_TASKS	"blocked_tasks"
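
A hypothetical sketch of the translation step: split the string on commas
and OR in the matching flag. The table and handler names here are
assumptions; the flag names come from the mapping above:

    static const struct {
            const char *name;
            unsigned long mask;
    } si_names[] = {
            { "tasks",  SYS_INFO_TASKS },
            { "mem",    SYS_INFO_MEM },
            { "timers", SYS_INFO_TIMERS },
            /* remaining entries follow the mapping above */
    };

    static unsigned long sys_info_parse(char *str)
    {
            unsigned long bitmap = 0;
            char *tok;
            size_t i;

            while ((tok = strsep(&str, ",")) != NULL)
                    for (i = 0; i < ARRAY_SIZE(si_names); i++)
                            if (!strcmp(tok, si_names[i].name))
                                    bitmap |= si_names[i].mask;
            return bitmap;
    }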

[nathan@kernel.org: add __maybe_unused to sys_info_avail]
  Link: https://lkml.kernel.org/r/20250708-fix-clang-sys_info_avail-warning-v1-1-60d239eacd64@kernel.org
Link: https://lkml.kernel.org/r/20250703021004.42328-4-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 19:08:24 -07:00
Feng Tang
b76e89e50f panic: generalize panic_print's function to show sys info
'panic_print' was introduced to help debug kernel panics by dumping
different kinds of system information like tasks' call stacks, memory,
ftrace buffer, etc.  This function could also be used to help debug
other cases such as task hangs and soft/hard lockups, where the user
may need a snapshot of the system info at that time.

Extract system info dump function related code from panic.c to separate
file sys_info.[ch], for wider usage by other kernel parts for debugging.

Also adjust the macro names for singular/plural consistency.

Link: https://lkml.kernel.org/r/20250703021004.42328-3-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 19:08:24 -07:00
Thomas Weißschuh
cd3557a761 vdso/gettimeofday: Add support for auxiliary clocks
Expose the auxiliary clocks through the vDSO.

Architectures not using the generic vDSO time framework,
namely SPARC64, are not supported.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-12-df7d9f87b9b8@linutronix.de
2025-07-18 14:09:39 +02:00
Thomas Weißschuh
562f03ed96 vdso/gettimeofday: Introduce vdso_get_timestamp()
This code is duplicated and with the introduction of auxiliary clocks will
be duplicated even more.

Introduce a helper.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-9-df7d9f87b9b8@linutronix.de
2025-07-18 13:45:32 +02:00
Thomas Weißschuh
381d96ccc1 vdso/gettimeofday: Introduce vdso_set_timespec()
This code is duplicated and with the introduction of auxiliary clocks will
be duplicated even more.

Introduce a helper.
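
A plausible shape for the helper, assuming it factors out the common
"fold nanoseconds into a timespec" step that each caller previously did
inline (the exact signature in the generic vDSO code may differ):

    static __always_inline void vdso_set_timespec(struct __kernel_timespec *ts,
                                                  u64 sec, u64 ns)
    {
            /* Carry nanosecond overflow into the seconds field. */
            ts->tv_sec = sec + __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
            ts->tv_nsec = ns;
    }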

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-8-df7d9f87b9b8@linutronix.de
2025-07-18 13:45:32 +02:00
Thomas Weißschuh
1a1cd5fe88 vdso/gettimeofday: Introduce vdso_clockid_valid()
Move the clock ID validation check into a common helper.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-7-df7d9f87b9b8@linutronix.de
2025-07-18 13:45:32 +02:00
Thomas Weißschuh
fb61bdb27f vdso/gettimeofday: Return bool from clock_gettime() helpers
The internal helpers effectively return boolean results,
while pretending to return error numbers.

Switch the return type to bool for more clarity.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-6-df7d9f87b9b8@linutronix.de
2025-07-18 13:45:32 +02:00
Jakub Kicinski
af2d6148d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.16-rc7).

Conflicts:

Documentation/netlink/specs/ovpn.yaml
  880d43ca9a ("netlink: specs: clean up spaces in brackets")
  af52020fc5 ("ovpn: reject unexpected netlink attributes")

drivers/net/phy/phy_device.c
  a44312d58e ("net: phy: Don't register LEDs for genphy")
  f0f2b992d8 ("net: phy: Don't register LEDs for genphy")
https://lore.kernel.org/20250710114926.7ec3a64f@kernel.org

drivers/net/wireless/intel/iwlwifi/fw/regulatory.c
drivers/net/wireless/intel/iwlwifi/mld/regulatory.c
  5fde0fcbd7 ("wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap")
  ea045a0de3 ("wifi: iwlwifi: add support for accepting raw DSM tables by firmware")

net/ipv6/mcast.c
  ae3264a25a ("ipv6: mcast: Delay put pmc->idev in mld_del_delrec()")
  a8594c956c ("ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()")
https://lore.kernel.org/8cc52891-3653-4b03-a45e-05464fe495cf@kernel.org

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17 11:00:33 -07:00
Tiffany Yang
bdfa89c489 kunit: test: Export kunit_attach_mm()
Tests can allocate from virtual memory using kunit_vm_mmap(), which
transparently creates and attaches an mm_struct to the test runner if
one is not already attached. This is suitable for most cases, except for
when the code under test must access a task's mm before performing an
mmap. Expose kunit_attach_mm() as part of the interface for those
cases. This does not change the existing behavior.
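
A hedged usage sketch; the exact signature of kunit_attach_mm() is
assumed here, and the test body is illustrative:

    static void mm_sensitive_test(struct kunit *test)
    {
            /* Attach an mm before the code under test dereferences
             * current->mm, instead of relying on the implicit attach
             * performed by the first kunit_vm_mmap() call. */
            kunit_attach_mm(test);

            /* ... exercise code that needs current->mm, then mmap ... */
    }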

Cc: David Gow <davidgow@google.com>
Signed-off-by: Tiffany Yang <ynaffit@google.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20250714185321.2417234-4-ynaffit@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-16 14:11:58 +02:00
Kees Cook
10299c07c9 kunit/fortify: Add back "volatile" for sizeof() constants
It seems that Clang can see through OPTIMIZER_HIDE_VAR() when the constant
comes from sizeof(). Adding "volatile" back to these variables solves
this false positive without reintroducing the issues that originally led
to switching to OPTIMIZER_HIDE_VAR in the first place[1].
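
A sketch of the two variants in question; the buffer and variable names
are illustrative:

    static void fortify_size_sketch(void)
    {
            char buf[32];

            /* Previous approach: hide the value from the optimizer.
             * Clang still constant-folds this when the value comes
             * from sizeof(). */
            size_t hidden = sizeof(buf);
            OPTIMIZER_HIDE_VAR(hidden);

            /* Restored approach: a volatile read cannot be folded away. */
            const volatile size_t len = sizeof(buf);

            strscpy(buf, "fortify", len);
    }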

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2075 [1]
Cc: Jannik Glückert <jannik.glueckert@gmail.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 6ee149f61b ("kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()")
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250628234034.work.800-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-14 22:43:52 -07:00
Eric Biggers
66b1306079 lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1
Add a KUnit test suite for the SHA-1 library functions, including the
corresponding HMAC support.  The core test logic is in the
previously-added hash-test-template.h.  This commit just adds the actual
KUnit suite, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:29:36 -07:00
Eric Biggers
6dd4d9f791 lib/crypto: tests: Add KUnit tests for Poly1305
Add a KUnit test suite for the Poly1305 functions.  Most of its test
cases are instantiated from hash-test-template.h, which is also used by
the SHA-2 tests.  A couple additional test cases are also included to
test edge cases specific to Poly1305.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:29:36 -07:00
Eric Biggers
571eaeddb6 lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512
Add KUnit test suites for the SHA-384 and SHA-512 library functions,
including the corresponding HMAC support.  The core test logic is in the
previously-added hash-test-template.h.  This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:29:36 -07:00
Eric Biggers
4dcf6cadda lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256
Add KUnit test suites for the SHA-224 and SHA-256 library functions,
including the corresponding HMAC support.  The core test logic is in the
previously-added hash-test-template.h.  This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:29:36 -07:00
Eric Biggers
950a81224e lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py
Add hash-test-template.h which generates the following KUnit test cases
for hash functions:

    test_hash_test_vectors
    test_hash_all_lens_up_to_4096
    test_hash_incremental_updates
    test_hash_buffer_overruns
    test_hash_overlaps
    test_hash_alignment_consistency
    test_hash_ctx_zeroization
    test_hash_interrupt_context_1
    test_hash_interrupt_context_2
    test_hmac  (when HMAC is supported)
    benchmark_hash  (when CONFIG_CRYPTO_LIB_BENCHMARK=y)

The initial use cases for this will be sha224_kunit, sha256_kunit,
sha384_kunit, sha512_kunit, and poly1305_kunit.

Add a Python script gen-hash-testvecs.py which generates the test
vectors required by test_hash_test_vectors,
test_hash_all_lens_up_to_4096, and test_hmac.
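
As a flavor of the shared logic, here is a sketch of the invariant behind
test_hash_incremental_updates, written against the SHA-256 library API
(the template is generic over the hash under test, and the context type
may differ after the library reorganization):

    static void incremental_matches_oneshot(const u8 *data, size_t len)
    {
            u8 one_shot[SHA256_DIGEST_SIZE];
            u8 incremental[SHA256_DIGEST_SIZE];
            struct sha256_state state;
            size_t half = len / 2;

            sha256(data, len, one_shot);            /* one-shot */

            sha256_init(&state);                    /* two updates */
            sha256_update(&state, data, half);
            sha256_update(&state, data + half, len - half);
            sha256_final(&state, incremental);

            /* Both paths must produce the same digest. */
            WARN_ON(memcmp(one_shot, incremental, SHA256_DIGEST_SIZE));
    }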

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:29:36 -07:00
Eric Biggers
f3d6cb3dc0 lib/crypto: x86/sha1: Migrate optimized code into library
Instead of exposing the x86-optimized SHA-1 code via x86-specific
crypto_shash algorithms, just implement the sha1_blocks()
library function.  This is much simpler, it makes the SHA-1 library
functions be x86-optimized, and it fixes the longstanding issue where
the x86-optimized SHA-1 code was disabled by default.  SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.

To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t.  The assembly functions actually
already treated it as size_t.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:28:35 -07:00
Eric Biggers
c751059985 lib/crypto: sparc/sha1: Migrate optimized code into library
Instead of exposing the sparc-optimized SHA-1 code via sparc-specific
crypto_shash algorithms, just implement the sha1_blocks()
library function.  This is much simpler, it makes the SHA-1 library
functions be sparc-optimized, and it fixes the longstanding issue where
the sparc-optimized SHA-1 code was disabled by default.  SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.

Note: to see the diff from arch/sparc/crypto/sha1_glue.c to
lib/crypto/sparc/sha1.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
377982d561 lib/crypto: s390/sha1: Migrate optimized code into library
Instead of exposing the s390-optimized SHA-1 code via s390-specific
crypto_shash algorithms, just implement the sha1_blocks()
library function.  This is much simpler, it makes the SHA-1 library
functions be s390-optimized, and it fixes the longstanding issue where
the s390-optimized SHA-1 code was disabled by default.  SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
6b9ae8cfaa lib/crypto: powerpc/sha1: Migrate optimized code into library
Instead of exposing the powerpc-optimized SHA-1 code via
powerpc-specific crypto_shash algorithms, just implement the
sha1_blocks() library function.  This is much simpler, it makes the
SHA-1 library functions be powerpc-optimized, and it fixes the
longstanding issue where the powerpc-optimized SHA-1 code was disabled
by default.  SHA-1 still remains available through crypto_shash, but
individual architectures no longer need to handle it.

Note: to see the diff from arch/powerpc/crypto/sha1-spe-glue.c to
lib/crypto/powerpc/sha1.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
b6ac1dac2f lib/crypto: mips/sha1: Migrate optimized code into library
Instead of exposing the mips-optimized SHA-1 code via mips-specific
crypto_shash algorithms, just implement the sha1_blocks()
library function.  This is much simpler, it makes the SHA-1 library
functions be mips-optimized, and it fixes the longstanding issue where
the mips-optimized SHA-1 code was disabled by default.  SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.

Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-sha1.c
to lib/crypto/mips/sha1.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
00d549bb89 lib/crypto: arm64/sha1: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-1 code via arm64-specific
crypto_shash algorithms, just implement the sha1_blocks()
library function.  This is much simpler, it makes the SHA-1 library
functions be arm64-optimized, and it fixes the longstanding issue where
the arm64-optimized SHA-1 code was disabled by default.  SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.

Remove support for SHA-1 finalization from assembly code, since the
library does not yet support architecture-specific overrides of the
finalization.  (Support for that has been omitted for now, for
simplicity and because usually it isn't performance-critical.)

To match sha1_blocks(), change the type of the nblocks parameter and the
return value of __sha1_ce_transform() from int to size_t.  Update the
assembly code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:48 -07:00
Eric Biggers
70cb6ca58f lib/crypto: arm/sha1: Migrate optimized code into library
Instead of exposing the arm-optimized SHA-1 code via arm-specific
crypto_shash algorithms, just implement the sha1_blocks()
library function.  This is much simpler, it makes the SHA-1 library
functions be arm-optimized, and it fixes the longstanding issue where
the arm-optimized SHA-1 code was disabled by default.  SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.

To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t.  The assembly functions actually
already treated it as size_t.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:29 -07:00
Eric Biggers
4cbc84471b lib/crypto: sha1: Add HMAC support
Add HMAC support to the SHA-1 library, again following what was done for
SHA-2.  Besides providing the basis for a more streamlined "hmac(sha1)"
shash, this will also be useful for multiple in-kernel users such as
net/sctp/auth.c, net/ipv6/seg6_hmac.c, and
security/keys/trusted-keys/trusted_tpm1.c.  Those are currently using
crypto_shash, but using the library functions would be much simpler.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 08:59:20 -07:00
Eric Biggers
90860aef63 lib/crypto: sha1: Add SHA-1 library functions
Add a library interface for SHA-1, following the SHA-2 one.  As was the
case with SHA-2, this will be useful for various in-kernel users.  The
crypto_shash interface will be reimplemented on top of it as well.
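
A hedged usage sketch, assuming the SHA-1 API mirrors the existing SHA-2
one (context type and function names follow that convention):

    static void sha1_lib_usage(const u8 *data, size_t len)
    {
            u8 digest[SHA1_DIGEST_SIZE];
            struct sha1_ctx ctx;

            sha1(data, len, digest);        /* one-shot */

            sha1_init(&ctx);                /* incremental */
            sha1_update(&ctx, data, len);
            sha1_final(&ctx, digest);
    }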

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 08:58:53 -07:00
Eric Biggers
9503ca2cca lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
Rename the existing sha1_init() to sha1_init_raw(), since it conflicts
with the upcoming library function.  This will later be removed, but
this keeps the kernel building for the introduction of the library.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 08:22:31 -07:00
Eric Biggers
7941ad6965 lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
While the HMAC library functions support both incremental and one-shot
computation and both prepared and raw keys, the combination of raw key
+ incremental was missing.  It turns out that several potential users of
the HMAC library functions (tpm2-sessions.c, smb2transport.c,
trusted_tpm1.c) want exactly that.

Therefore, add the missing functions hmac_sha*_init_usingrawkey().

Implement them in an optimized way that directly initializes the HMAC
context without a separate key preparation step.

Reimplement the one-shot raw key functions hmac_sha*_usingrawkey() on
top of the new functions, which makes them a bit more efficient.
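
A sketch of the newly possible raw-key + incremental combination, shown
for the SHA-256 variant (the function names follow the commit's
hmac_sha*_ pattern; exact signatures are assumed):

    static void mac_two_part_message(const u8 *key, size_t key_len,
                                     const u8 *hdr, size_t hdr_len,
                                     const u8 *body, size_t body_len,
                                     u8 mac[SHA256_DIGEST_SIZE])
    {
            struct hmac_sha256_ctx ctx;

            /* Initialize straight from the raw key; no separate
             * key-preparation step is needed. */
            hmac_sha256_init_usingrawkey(&ctx, key, key_len);
            hmac_sha256_update(&ctx, hdr, hdr_len);
            hmac_sha256_update(&ctx, body, body_len);
            hmac_sha256_final(&ctx, mac);
    }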

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711215844.41715-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 08:20:37 -07:00
Eric Biggers
6e07c5e166 lib/crypto: arm/poly1305: Remove unneeded empty weak function
Fix poly1305-armv4.pl to not do '.globl poly1305_blocks_neon' when
poly1305_blocks_neon() is not defined.  Then, remove the empty __weak
definition of poly1305_blocks_neon(), which was still needed only
because of that unnecessary globl statement.  (It also used to be needed
because the compiler could generate calls to it when
CONFIG_KERNEL_MODE_NEON=n, but that has been fixed.)

Thanks to Arnd Bergmann for reporting that the globl statement in the
asm file was still depending on the weak symbol.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711212822.6372-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 08:20:00 -07:00
Peter Zijlstra
8f2146159b Merge branch 'tip/sched/urgent'
Avoid merge conflicts

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2025-07-14 17:16:28 +02:00
Raghavendra K T
ee58e38489 lib/test_vmalloc.c: introduce xfail for failing tests
The test align_shift_alloc_test is expected to fail.  Reporting the test
as failed is confusing, since it looks like a genuine failure.  Introduce
the widely used xfail semantics to address the issue.

Note: a warn_alloc dump similar to below is still expected:

 Call Trace:
  <TASK>
  dump_stack_lvl+0x64/0x80
  warn_alloc+0x137/0x1b0
  ? __get_vm_area_node+0x134/0x140

Snippet of dmesg after change:

Summary: random_size_align_alloc_test passed: 1 failed: 0 xfailed: 0 ..
Summary: align_shift_alloc_test passed: 0 failed: 0 xfailed: 1 ..
Summary: pcpu_alloc_test passed: 1 failed: 0 xfailed: 0 ..

Link: https://lkml.kernel.org/r/20250702064319.885-1-raghavendra.kt@amd.com
Signed-off-by: Raghavendra K T <raghavendra.kt@amd.com>
Reviewed-by: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Dev Jain <dev.jain@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-13 16:38:32 -07:00
Dev Jain
526f36f3f4 maple tree: add some comments
Add comments explaining the fields of maple_metadata, since "end" is
ambiguous and "gap" can be confused with the size of the largest gap,
whereas it is actually the offset of the largest gap.

Add a comment for mas_ascend() to explain whose min and max we are trying
to find.  Explain that, for example, if we are already on offset zero,
then the parent min is mas->min; otherwise we need to walk up to find the
implied pivot min.
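
For reference, the structure whose fields gain comments is tiny, and the
intended meanings per this patch are:

    struct maple_metadata {
            unsigned char end;      /* offset of the last used slot */
            unsigned char gap;      /* offset of the largest gap, not its size */
    };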

Link: https://lkml.kernel.org/r/20250703063338.51509-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-13 16:38:24 -07:00
Andrew Morton
cac3d177c0 Merge branch 'mm-hotfixes-stable' into mm-stable to pick up changes which
are required for a merge of the series "mm: folio_pte_batch()
improvements".
2025-07-12 14:48:26 -07:00
Eric Biggers
9f65592b7e lib/crypto: x86/poly1305: Fix performance regression on short messages
Restore the len >= 288 condition on using the AVX implementation, which
was inadvertently removed by commit 318c53ae02 ("crypto: x86/poly1305 -
Add block-only interface").  This check took into account the overhead
in key power computation, kernel-mode "FPU", and tail handling
associated with the AVX code.  Indeed, restoring this check slightly
improves performance for len < 256 as measured using poly1305_kunit on
an "AMD Ryzen AI 9 365" (Zen 5) CPU:

    Length      Before       After
    ======  ==========  ==========
         1     30 MB/s     36 MB/s
        16    516 MB/s    598 MB/s
        64   1700 MB/s   1882 MB/s
       127   2265 MB/s   2651 MB/s
       128   2457 MB/s   2827 MB/s
       200   2702 MB/s   3238 MB/s
       256   3841 MB/s   3768 MB/s
       511   4580 MB/s   4585 MB/s
       512   5430 MB/s   5398 MB/s
      1024   7268 MB/s   7305 MB/s
      3173   8999 MB/s   8948 MB/s
      4096   9942 MB/s   9921 MB/s
     16384  10557 MB/s  10545 MB/s

While the optimal threshold for this CPU might be slightly lower than
288 (see the len == 256 case), other CPUs would need to be tested too,
and these sorts of benchmarks can underestimate the true cost of
kernel-mode "FPU".  Therefore, for now just restore the 288 threshold.

Fixes: 318c53ae02 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-11 14:29:42 -07:00