Commit Graph

1233580 Commits

Author SHA1 Message Date
Alexandre Ghiti
66f962d893 riscv: Fix build error on rv32 + XIP
commit 66f1e68093 ("riscv: Make XIP bootable again") restricted page
offset to the sv39 page offset instead of the default sv57, which makes
sense since probably the platforms that target XIP kernels do not
support anything else than sv39 and we do not try to find out the
largest address space supported on XIP kernels (ie set_satp_mode()).

But PAGE_OFFSET_L3 is not defined for rv32, so fix the build error by
restoring the previous behaviour which picks CONFIG_PAGE_OFFSET for rv32.

Fixes: 66f1e68093 ("riscv: Make XIP bootable again")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://lore.kernel.org/linux-riscv/344dca85-5c48-44e1-bc64-4fa7973edd12@infradead.org/T/#u
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20240118212120.2087803-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-18 13:51:37 -08:00
Maxim Kochetkov
080c4324fa riscv: optimize ELF relocation function in riscv
The patch can optimize the running times of insmod command by modify ELF
relocation function.
In the 5.10 and latest kernel, when install the riscv ELF drivers which
contains multiple symbol table items to be relocated, kernel takes a lot
of time to execute the relocation. For example, we install a 3+MB driver
need 180+s.
We focus on the riscv architecture handle R_RISCV_HI20 and R_RISCV_LO20
type items relocation function in the arch\riscv\kernel\module.c and
find that there are two-loops in the function. If we modify the begin
number in the second for-loops iteration, we could save significant time
for installation. We install the same 3+MB driver could just need 2s.

Signed-off-by: Amma Lee <lixiaoyun@binary-semi.com>
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231214063906.13612-1-fido_max@inbox.ru
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:21:10 -08:00
Samuel Ortiz
1024340105 RISC-V: Implement archrandom when Zkr is available
The Zkr extension is ratified and provides 16 bits of entropy seed when
reading the SEED CSR.

We can implement arch_get_random_seed_longs() by doing multiple csrrw to
that CSR and filling an unsigned long with valid entropy bits.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Samuel Ortiz <sameo@rivosinc.com>
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20231130111704.1319081-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:20:49 -08:00
Xiao Wang
55ca8d7aa2 riscv: Optimize hweight API with Zbb extension
The Hamming Weight of a number is the total number of bits set in it, so
the cpop/cpopw instruction from Zbb extension can be used to accelerate
hweight() API.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231112095244.4015351-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:18:40 -08:00
Masahiro Yamada
c4db7ff7a9 riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
A common issue in Makefile is a race in parallel building.

You need to be careful to prevent multiple threads from writing to the
same file simultaneously.

Commit 3939f33450 ("ARM: 8418/1: add boot image dependencies to not
generate invalid images") addressed such a bad scenario.

A similar symptom occurs with the following command:

  $ make -j$(nproc) ARCH=riscv Image Image.gz loader loader.bin vmlinuz.efi
    [ snip ]
    SORTTAB vmlinux
    OBJCOPY arch/riscv/boot/Image
    OBJCOPY arch/riscv/boot/Image
    OBJCOPY arch/riscv/boot/Image
    OBJCOPY arch/riscv/boot/Image
    OBJCOPY arch/riscv/boot/Image
    GZIP    arch/riscv/boot/Image.gz
    AS      arch/riscv/boot/loader.o
    AS      arch/riscv/boot/loader.o
    Kernel: arch/riscv/boot/Image is ready
    PAD     arch/riscv/boot/vmlinux.bin
    GZIP    arch/riscv/boot/vmlinuz
    Kernel: arch/riscv/boot/loader is ready
    OBJCOPY arch/riscv/boot/loader.bin
    Kernel: arch/riscv/boot/loader.bin is ready
    Kernel: arch/riscv/boot/Image.gz is ready
    OBJCOPY arch/riscv/boot/vmlinuz.o
    LD      arch/riscv/boot/vmlinuz.efi.elf
    OBJCOPY arch/riscv/boot/vmlinuz.efi
    Kernel: arch/riscv/boot/vmlinuz.efi is ready

The log "OBJCOPY arch/riscv/boot/Image" is displayed 5 times.
(also "AS      arch/riscv/boot/loader.o" twice.)

It indicates that 5 threads simultaneously enter arch/riscv/boot/
and write to arch/riscv/boot/Image.

It occasionally leads to a build failure:

  $ make -j$(nproc) ARCH=riscv Image Image.gz loader loader.bin vmlinuz.efi
    [ snip ]
    SORTTAB vmlinux
    OBJCOPY arch/riscv/boot/Image
    OBJCOPY arch/riscv/boot/Image
    OBJCOPY arch/riscv/boot/Image
    OBJCOPY arch/riscv/boot/Image
    PAD     arch/riscv/boot/vmlinux.bin
  truncate: Invalid number: 'arch/riscv/boot/vmlinux.bin'
  make[2]: *** [drivers/firmware/efi/libstub/Makefile.zboot:13: arch/riscv/boot/vmlinux.bin] Error 1
  make[2]: *** Deleting file 'arch/riscv/boot/vmlinux.bin'
  make[1]: *** [arch/riscv/Makefile:167: vmlinuz.efi] Error 2
  make[1]: *** Waiting for unfinished jobs....
    Kernel: arch/riscv/boot/Image is ready
    GZIP    arch/riscv/boot/Image.gz
    AS      arch/riscv/boot/loader.o
    AS      arch/riscv/boot/loader.o
    Kernel: arch/riscv/boot/loader is ready
    OBJCOPY arch/riscv/boot/loader.bin
    Kernel: arch/riscv/boot/loader.bin is ready
    Kernel: arch/riscv/boot/Image.gz is ready
  make: *** [Makefile:234: __sub-make] Error 2

Image.gz, loader, vmlinuz.efi depend on Image. loader.bin depends
on loader. Such dependencies are not specified in arch/riscv/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231119100024.2370992-1-masahiroy@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:18:20 -08:00
Palmer Dabbelt
3074e8b175 Merge patch series "riscv: ftrace: Miscellaneous ftrace improvements"
Björn Töpel <bjorn@kernel.org> says:

This series includes a three ftrace improvements for RISC-V:

1. Do not require to run recordmcount at build time (patch 1)
2. Simplification of the function graph functionality (patch 2)
3. Enable DYNAMIC_FTRACE_WITH_DIRECT_CALLS (patch 3 and 4)

The series has been tested on Qemu/rv64 virt/Debian sid with the
following test configs:
  CONFIG_FTRACE_SELFTEST=y
  CONFIG_FTRACE_STARTUP_TEST=y
  CONFIG_SAMPLE_FTRACE_DIRECT=m
  CONFIG_SAMPLE_FTRACE_DIRECT_MULTI=m
  CONFIG_SAMPLE_FTRACE_OPS=m

All tests pass.

* b4-shazam-merge:
  samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
  riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  riscv: ftrace: Make function graph use ftrace directly
  riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY

Link: https://lore.kernel.org/r/20231130121531.1178502-1-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:17:29 -08:00
Song Shuai
629291dd84 samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
Add RISC-V variants of the ftrace-direct* samples.

Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20231130121531.1178502-5-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:17:10 -08:00
Song Shuai
196c79f19a riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
Select the DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide the
register_ftrace_direct[_multi] interfaces allowing users to register
the customed trampoline (direct_caller) as the mcount for one or more
target functions. And modify_ftrace_direct[_multi] are also provided
for modifying direct_caller.

To make the direct_caller and the other ftrace hooks (e.g.
function/fgraph tracer, k[ret]probes) co-exist, a temporary register
is nominated to store the address of direct_caller in
ftrace_regs_caller. After the setting of the address direct_caller by
direct_ops->func and the RESTORE_REGS in ftrace_regs_caller,
direct_caller will be jumped to by the `jr` inst.

Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support for RISC-V.

Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20231130121531.1178502-4-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:17:09 -08:00
Song Shuai
35e61e8827 riscv: ftrace: Make function graph use ftrace directly
Similar to commit 0c0593b45c ("x86/ftrace: Make function graph use
ftrace directly") and commit c4a0ebf87c ("arm64/ftrace: Make
function graph use ftrace directly"), RISC-V has no need for a special
graph tracer hook. The graph_ops::func function can be used to install
the return_hooker.

This cleanup only changes the FTRACE_WITH_REGS implementation, leaving
the mcount-based implementation is unaffected.

Perform the simplification, and also cleanup the register save/restore
macros.

Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20231130121531.1178502-3-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:17:08 -08:00
Song Shuai
b546d6363a riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
In commit afc76b8b80 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead
of MCOUNT") RISC-V added support for -fpatchable-function-entry, which
removes the need for recordmcount.

Select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY to tell the build
system not to run recordmcount.

Link: https://lore.kernel.org/linux-riscv/CAAYs2=j3Eak9vU6xbAw0zPuoh00rh8v5C2U3fePkokZFibWs2g@mail.gmail.com/T/#t
Link: https://lore.kernel.org/linux-riscv/Y4jtfrJt+%2FQ5nMOz@spud/
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20231130121531.1178502-2-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:17:07 -08:00
Palmer Dabbelt
448857ec53 Merge patch series "RISC-V: Disable DWARF5 with known broken LLVM versions"
Nathan Chancellor <nathan@kernel.org> says:

This series disables DWARF5 for LLVM versions where it is known to be
broken due to linker relaxation.

* b4-shazam-merge:
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig

Link: bbc0f99f3b
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-0-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:30 -08:00
Nathan Chancellor
a4426641f0 lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
Fangrui noted that the comment around CONFIG_AS_HAS_NON_CONST_LEB128
could be made more accurate because explicit .sleb128 directives are not
emitted, only .uleb128 directives are. Rename the symbol to
CONFIG_AS_HAS_NON_CONST_ULEB128 as a result.

Further clarifications include replacing "symbol deltas" with the more
accurate "label differences", noting that this issue has been resolved
in newer binutils (2.41+), and it only occurs when a port uses RISC-V
style linker relaxation.

Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-3-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:28 -08:00
Nathan Chancellor
ae84ff9a14 riscv: Restrict DWARF5 when building with LLVM to known working versions
LLVM prior to 18.0.0 would generate incorrect debug info for DWARF5 due
to linker relaxation, which was worked around in clang by defaulting
RISC-V to DWARF4 [1]. Unfortunately, this workaround does not work for
the kernel because the DWARF version can be independently changed from
the default in Kconfig.

Do not allow DWARF5 to be selected for RISC-V when using linker
relaxation (ld.lld >= 15.0.0) and a version of LLVM that does not have
the fixes (the integrated assembler [2] and ld.lld [3] < 18.0.0)
necessary to generate the correct debug info.

Link: bbc0f99f3b [1]
Link: 1df5ea29b4 [2]
Link: 7ffabb61a5 [3]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-2-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:27 -08:00
Nathan Chancellor
55b71d2ce1 riscv: Hoist linker relaxation disabling logic into Kconfig
Certain configurations may need to be disabled if linker relaxation is
in use, such as DWARF5 with ld.lld < 18. Hoist the logic of whether or
not linker relaxation is in use into Kconfig so decisions can be made at
configuration time.

Reviewed-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-1-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:26 -08:00
Palmer Dabbelt
c640868491 Merge patch series "riscv: Add fine-tuned checksum functions"
Charlie Jenkins <charlie@rivosinc.com> says:

Each architecture generally implements fine-tuned checksum functions to
leverage the instruction set. This patch adds the main checksum
functions that are used in networking. Tested on QEMU, this series
allows the CHECKSUM_KUNIT tests to complete an average of 50.9% faster.

This patch takes heavy use of the Zbb extension using alternatives
patching.

To test this patch, enable the configs for KUNIT, then CHECKSUM_KUNIT.

I have attempted to make these functions as optimal as possible, but I
have not ran anything on actual riscv hardware. My performance testing
has been limited to inspecting the assembly, running the algorithms on
x86 hardware, and running in QEMU.

ip_fast_csum is a relatively small function so even though it is
possible to read 64 bits at a time on compatible hardware, the
bottleneck becomes the clean up and setup code so loading 32 bits at a
time is actually faster.

* b4-shazam-merge:
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold

Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-0-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:07:11 -08:00
Charlie Jenkins
6f4c45cbcb kunit: Add tests for csum_ipv6_magic and ip_fast_csum
Supplement existing checksum tests with tests for csum_ipv6_magic and
ip_fast_csum.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-5-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:52:33 -08:00
Charlie Jenkins
a04c192eab riscv: Add checksum library
Provide a 32 and 64 bit version of do_csum. When compiled for 32-bit
will load from the buffer in groups of 32 bits, and when compiled for
64-bit will load in groups of 64 bits.

Additionally provide riscv optimized implementation of csum_ipv6_magic.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-4-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:52:32 -08:00
Charlie Jenkins
e11e367e9f riscv: Add checksum header
Provide checksum algorithms that have been designed to leverage riscv
instructions such as rotate. In 64-bit, can take advantage of the larger
register to avoid some overflow checking.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-3-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:52:31 -08:00
Charlie Jenkins
2ce5729fce riscv: Add static key for misaligned accesses
Support static branches depending on the value of misaligned accesses.
This will be used by a later patch in the series. At any point in time,
this static branch will only be enabled if all online CPUs are
considered "fast".

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-2-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:52:30 -08:00
Charlie Jenkins
1e7196fa5b asm-generic: Improve csum_fold
This csum_fold implementation introduced into arch/arc by Vineet Gupta
is better than the default implementation on at least arc, x86, and
riscv. Using GCC trunk and compiling non-inlined version, this
implementation has 41.6667%, 25% fewer instructions on riscv64, x86-64
respectively with -O3 optimization. Most implmentations override this
default in asm, but this should be more performant than all of those
other implementations except for arm which has barrel shifting and
sparc32 which has a carry flag.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: David Laight <david.laight@aculab.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-1-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:52:29 -08:00
Andrew Jones
0de65288d7 RISC-V: selftests: cbo: Ensure asm operands match constraints
The 'i' constraint expects a constant operand, which fn and its
constant derivative MK_CBO(fn) are, but passing fn through a function
as a parameter and using a local variable for MK_CBO(fn) allow the
compiler to lose sight of that when no optimization is done. Use
a macro instead of a function and skip the local variable to ensure
the compiler uses constants, matching the asm constraints.

Reported-by: Yunhui Cui <cuiyunhui@bytedance.com>
Closes: https://lore.kernel.org/all/20240117082514.42967-1-cuiyunhui@bytedance.com
Fixes: a29e2a48af ("RISC-V: selftests: Add CBO tests")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240117130933.57514-2-ajones@ventanamicro.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:46:55 -08:00
Palmer Dabbelt
a894e8ed09 Merge patch series "riscv: support kernel-mode Vector"
Andy Chiu <andy.chiu@sifive.com> says:

This series provides support running Vector in kernel mode.
Additionally, kernel-mode Vector can be configured to run without
turnning off preemption on a CONFIG_PREEMPT kernel. Along with the
suport, we add Vector optimized copy_{to,from}_user. And provide a
simple threshold to decide when to run the vectorized functions.

We decided to drop vectorized memcpy/memset/memmove for the moment due
to the concern of memory side-effect in kernel_vector_begin(). The
detailed description can be found at v9[0]

This series is composed by 4 parts:
 patch 1-4: adds basic support for kernel-mode Vector
 patch 5: includes vectorized copy_{to,from}_user into the kernel
 patch 6: refactor context switch code in fpu [1]
 patch 7-10: provides some code refactors and support for preemptible
             kernel-mode Vector.

This series can be merged if we feel any part of {1~4, 5, 6, 7~10} is
mature enough.

This patch is tested on a QEMU with V and verified that booting, normal
userspace operations all work as usual with thresholds set to 0. Also,
we test by launching multiple kernel threads which continuously executes
and verifies Vector operations in the background. The module that tests
these operation is expected to be upstream later.

* b4-shazam-merge:
  riscv: vector: allow kernel-mode Vector with preemption
  riscv: vector: use kmem_cache to manage vector context
  riscv: vector: use a mask to write vstate_ctrl
  riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}()
  riscv: fpu: drop SR_SD bit checking
  riscv: lib: vectorize copy_to_user/copy_from_user
  riscv: sched: defer restoring Vector context for user
  riscv: Add vector extension XOR implementation
  riscv: vector: make Vector always available for softirq context
  riscv: Add support for kernel mode vector

Link: https://lore.kernel.org/r/20240115055929.4736-1-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:14:04 -08:00
Andy Chiu
2080ff9493 riscv: vector: allow kernel-mode Vector with preemption
Add kernel_vstate to keep track of kernel-mode Vector registers when
trap introduced context switch happens. Also, provide riscv_v_flags to
let context save/restore routine track context status. Context tracking
happens whenever the core starts its in-kernel Vector executions. An
active (dirty) kernel task's V contexts will be saved to memory whenever
a trap-introduced context switch happens. Or, when a softirq, which
happens to nest on top of it, uses Vector. Context retoring happens when
the execution transfer back to the original Kernel context where it
first enable preempt_v.

Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
option to disable preemptible kernel-mode Vector at build time. Users
with constraint memory may want to disable this config as preemptible
kernel-mode Vector needs extra space for tracking of per thread's
kernel-mode V context. Or, users might as well want to disable it if all
kernel-mode Vector code is time sensitive and cannot tolerate context
switch overhead.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-11-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:14:02 -08:00
Andy Chiu
bd446f5df5 riscv: vector: use kmem_cache to manage vector context
The allocation size of thread.vstate.datap is always riscv_v_vsize. So
it is possbile to use kmem_cache_* to manage the allocation. This gives
users more information regarding allocation of vector context via
/proc/slabinfo. And it potentially reduces the latency of the first-use
trap because of the allocation caches.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-10-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:14:01 -08:00
Andy Chiu
5b6048f2ff riscv: vector: use a mask to write vstate_ctrl
riscv_v_ctrl_set() should only touch bits within
PR_RISCV_V_VSTATE_CTRL_MASK. So, use the mask when we really set task's
vstate_ctrl.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-9-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:14:00 -08:00
Andy Chiu
d6c78f1ca3 riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}()
riscv_v_vstate_{save,restore}() can operate only on the knowlege of
struct __riscv_v_ext_state, and struct pt_regs. Let the caller decides
which should be passed into the function. Meanwhile, the kernel-mode
Vector is going to introduce another vstate, so this also makes functions
potentially able to be reused.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-8-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:13:59 -08:00
Andy Chiu
a93fdaf183 riscv: fpu: drop SR_SD bit checking
SR_SD summarizes the dirty status of FS/VS/XS. However, the current code
structure does not fully utilize it because each extension specific code
is divided into an individual segment. So remove the SR_SD check for
now.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-7-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:13:58 -08:00
Andy Chiu
c2a658d419 riscv: lib: vectorize copy_to_user/copy_from_user
This patch utilizes Vector to perform copy_to_user/copy_from_user. If
Vector is available and the size of copy is large enough for Vector to
perform better than scalar, then direct the kernel to do Vector copies
for userspace. Though the best programming practice for users is to
reduce the copy, this provides a faster variant when copies are
inevitable.

The optimal size for using Vector, copy_to_user_thres, is only a
heuristic for now. We can add DT parsing if people feel the need of
customizing it.

The exception fixup code of the __asm_vector_usercopy must fallback to
the scalar one because accessing user pages might fault, and must be
sleepable. Current kernel-mode Vector does not allow tasks to be
preemptible, so we must disactivate Vector and perform a scalar fallback
in such case.

The original implementation of Vector operations comes from
https://github.com/sifive/sifive-libc, which we agree to contribute to
Linux kernel.

Co-developed-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Co-developed-by: Nick Knight <nick.knight@sifive.com>
Signed-off-by: Nick Knight <nick.knight@sifive.com>
Suggested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-6-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:13:57 -08:00
Andy Chiu
7df56cbc27 riscv: sched: defer restoring Vector context for user
User will use its Vector registers only after the kernel really returns
to the userspace. So we can delay restoring Vector registers as long as
we are still running in kernel mode. So, add a thread flag to indicates
the need of restoring Vector and do the restore at the last
arch-specific exit-to-user hook. This save the context restoring cost
when we switch over multiple processes that run V in kernel mode. For
example, if the kernel performs a context swicth from A->B->C, and
returns to C's userspace, then there is no need to restore B's
V-register.

Besides, this also prevents us from repeatedly restoring V context when
executing kernel-mode Vector multiple times.

The cost of this is that we must disable preemption and mark vector as
busy during vstate_{save,restore}. Because then the V context will not
get restored back immediately when a trap-causing context switch happens
in the middle of vstate_{save,restore}.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-5-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:13:56 -08:00
Greentime Hu
c5674d00ca riscv: Add vector extension XOR implementation
This patch adds support for vector optimized XOR and it is tested in
qemu.

Co-developed-by: Han-Kuan Chen <hankuan.chen@sifive.com>
Signed-off-by: Han-Kuan Chen <hankuan.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-4-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:13:55 -08:00
Andy Chiu
956895b9d8 riscv: vector: make Vector always available for softirq context
The goal of this patch is to provide full support of Vector in kernel
softirq context. So that some of the crypto alogrithms won't need scalar
fallbacks.

By disabling bottom halves in active kernel-mode Vector, softirq will
not be able to nest on top of any kernel-mode Vector. So, softirq
context is able to use Vector whenever it runs.

After this patch, Vector context cannot start with irqs disabled.
Otherwise local_bh_enable() may run in a wrong context.

Disabling bh is not enough for RT-kernel to prevent preeemption. So
we must disable preemption, which also implies disabling bh on RT.

Related-to: commit 696207d425 ("arm64/sve: Make kernel FPU protection RT friendly")
Related-to: commit 66c3ec5a71 ("arm64: neon: Forbid when irqs are disabled")
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-3-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:13:54 -08:00
Greentime Hu
ecd2ada8a5 riscv: Add support for kernel mode vector
Add kernel_vector_begin() and kernel_vector_end() function declarations
and corresponding definitions in kernel_mode_vector.c

These are needed to wrap uses of vector in kernel mode.

Co-developed-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-2-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-16 07:13:53 -08:00
Palmer Dabbelt
d4abde52b4 Merge patch series "riscv: mm: Fixup & Optimize COMPAT code"
guoren@kernel.org <guoren@kernel.org> says:

From: Guo Ren <guoren@linux.alibaba.com>

When the task is in COMPAT mode, the TASK_SIZE should be 2GB, so
STACK_TOP_MAX and arch_get_mmap_end must be limited to 2 GB. This series
fixes the problem made by commit: add2cc6b65 ("RISC-V: mm: Restrict
address space for sv39,sv48,sv57") and optimizes the related coding
convention of TASK_SIZE.

* b4-shazam-merge:
  riscv: mm: Fixup compat arch_get_mmap_end
  riscv: mm: Fixup compat mode boot failure

Link: https://lore.kernel.org/r/20231222115703.2404036-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:04:38 -08:00
Guo Ren
97b7ac69be riscv: mm: Fixup compat arch_get_mmap_end
When the task is in COMPAT mode, the arch_get_mmap_end should be 2GB,
not TASK_SIZE_64. The TASK_SIZE has contained is_compat_mode()
detection, so change the definition of STACK_TOP_MAX to TASK_SIZE
directly.

Cc: stable@vger.kernel.org
Fixes: add2cc6b65 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231222115703.2404036-3-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:04:36 -08:00
Guo Ren
5f449e245e riscv: mm: Fixup compat mode boot failure
In COMPAT mode, the STACK_TOP is DEFAULT_MAP_WINDOW (0x80000000), but
the TASK_SIZE is 0x7fff000. When the user stack is upon 0x7fff000, it
will cause a user segment fault. Sometimes, it would cause boot
failure when the whole rootfs is rv32.

Freeing unused kernel image (initmem) memory: 2236K
Run /sbin/init as init process
Starting init: /sbin/init exists but couldn't execute it (error -14)
Run /etc/init as init process
...

Increase the TASK_SIZE to cover STACK_TOP.

Cc: stable@vger.kernel.org
Fixes: add2cc6b65 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231222115703.2404036-2-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:04:35 -08:00
Christophe JAILLET
adb1f95d38 riscv: Fix an off-by-one in get_early_cmdline()
The ending NULL is not taken into account by strncat(), so switch to
strlcat() to correctly compute the size of the available memory when
appending CONFIG_CMDLINE to 'early_cmdline'.

Fixes: 26e7aacb83 ("riscv: Allow to downgrade paging mode from the command line")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/9f66d2b58c8052d4055e90b8477ee55d9a0914f9.1698564026.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:04:18 -08:00
Palmer Dabbelt
9b1d9abe24 Merge patch series "tools: selftests: riscv: Fix compiler warnings"
Christoph Muellner <christoph.muellner@vrull.eu> says:

From: Christoph Müllner <christoph.muellner@vrull.eu>

When building the RISC-V selftests with a riscv32 compiler I ran into
a couple of compiler warnings. While riscv32 support for these tests is
questionable, the fixes are so trivial that it is probably best to simply
apply them.

Note that the missing-include patch and some format string warnings
are also relevant for riscv64.

* b4-shazam-merge:
  tools: selftests: riscv: Fix compile warnings in mm tests
  tools: selftests: riscv: Fix compile warnings in vector tests
  tools: selftests: riscv: Add missing include for vector test
  tools: selftests: riscv: Fix compile warnings in cbo
  tools: selftests: riscv: Fix compile warnings in hwprobe

Link: https://lore.kernel.org/r/20231123185821.2272504-1-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:02:55 -08:00
Christoph Müllner
12c1691965 tools: selftests: riscv: Fix compile warnings in mm tests
When building the mm tests with a riscv32 compiler, we see a range
of shift-count-overflow errors from shifting 1UL by more than 32 bits
in do_mmaps(). Since, the relevant code is only called from code that
is gated by `__riscv_xlen == 64`, we can just apply the same gating
to do_mmaps().

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-6-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:02:51 -08:00
Christoph Müllner
e1baf5e68e tools: selftests: riscv: Fix compile warnings in vector tests
GCC prints a couple of format string warnings when compiling
the vector tests. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-5-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:02:50 -08:00
Christoph Müllner
b250c90898 tools: selftests: riscv: Add missing include for vector test
GCC raises the following warning:
  warning: 'status' may be used uninitialized
The warning comes from the fact, that the signature of waitpid() is
unknown and therefore the initialization of GCC cannot be guessed.
Let's add the relevant header to address this warning.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-4-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:02:49 -08:00
Christoph Müllner
ac7b2a02d6 tools: selftests: riscv: Fix compile warnings in cbo
GCC prints a couple of format string warnings when compiling
the cbo test. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-3-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:02:48 -08:00
Christoph Müllner
b91c26fdb0 tools: selftests: riscv: Fix compile warnings in hwprobe
GCC prints a couple of format string warnings when compiling
the hwprobe test. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-2-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:02:47 -08:00
Alexandre Ghiti
54d7431af7 riscv: Add support for BATCHED_UNMAP_TLB_FLUSH
Allow to defer the flushing of the TLB when unmapping pages, which allows
to reduce the numbers of IPI and the number of sfence.vma.

The ubenchmarch used in commit 43b3dfdd04 ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration") that
was multithreaded to force the usage of IPI shows good performance
improvement on all platforms:

* Unmatched: ~34%
* TH1520   : ~78%
* Qemu     : ~81%

In addition, perf on qemu reports an important decrease in time spent
dealing with IPIs:

Before:  68.17%  main     [kernel.kallsyms]            [k] __sbi_rfence_v02_call
After :   8.64%  main     [kernel.kallsyms]            [k] __sbi_rfence_v02_call

* Benchmark:

int stick_this_thread_to_core(int core_id) {
        int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
        if (core_id < 0 || core_id >= num_cores)
           return EINVAL;

        cpu_set_t cpuset;
        CPU_ZERO(&cpuset);
        CPU_SET(core_id, &cpuset);

        pthread_t current_thread = pthread_self();
        return pthread_setaffinity_np(current_thread,
sizeof(cpu_set_t), &cpuset);
}

static void *fn_thread (void *p_data)
{
        int ret;
        pthread_t thread;

        stick_this_thread_to_core((int)p_data);

        while (1) {
                sleep(1);
        }

        return NULL;
}

int main()
{
        volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        pthread_t threads[4];
        int ret;

        for (int i = 0; i < 4; ++i) {
                ret = pthread_create(&threads[i], NULL, fn_thread, (void *)i);
                if (ret)
                {
                        printf("%s", strerror (ret));
                }
        }

        memset(p, 0x88, SIZE);

        for (int k = 0; k < 10000; k++) {
                /* swap in */
                for (int i = 0; i < SIZE; i += 4096) {
                        (void)p[i];
                }

                /* swap out */
                madvise(p, SIZE, MADV_PAGEOUT);
        }

        for (int i = 0; i < 4; i++)
        {
                pthread_cancel(threads[i]);
        }

        for (int i = 0; i < 4; i++)
        {
                pthread_join(threads[i], NULL);
        }

        return 0;
}

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Jisheng Zhang <jszhang@kernel.org> # Tested on TH1520
Tested-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20240108193640.344929-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:01:53 -08:00
Alexandre Ghiti
ff172d4818 riscv: Use hugepage mappings for vmemmap
This will allow better TLB utilization and then should be more performant.

Before:

---[ vmemmap start ]---
0xffff8d8002000000-0xffff8d8012000000    0x000000046ec00000       256M PTE .   ..     ..   D A G . . W R V
---[ vmemmap end ]---

After:

---[ vmemmap start ]---
0xffff8d8002000000-0xffff8d8012000000    0x000000046ec00000       256M PMD .   ..     ..   D A G . . W R V
---[ vmemmap end ]---

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20231214132935.212864-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 08:01:47 -08:00
Daniel Henrique Barboza
d3e591a38c dt-bindings: riscv: Document cbop-block-size
Following the examples of cbom-block-size and cboz-block-size,
cbop-block-size is the cache size of Zicbop (cbo.prefetch) operations.
The most common case is to have all cache block sizes to be the same
size (e.g. profiles such as rva22u64 mandates a 64 bytes size for all
cache operations), but there's no specification requirement for that,
and an implementation can have different cache sizes for each operation.

Cc: Rob Herring <robh@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231029123500.739409-1-dbarboza@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 07:36:30 -08:00
Palmer Dabbelt
2e605741e9 Merge patch series "riscv: errata: thead: use riscv_nonstd_cache_ops for CMO"
Jisheng Zhang <jszhang@kernel.org> says:

Previously, we use alternative mechanism to dynamically patch
the CMO operations for THEAD C906/C910 during boot for performance
reason. But as pointed out by Arnd, "there is already a significant
cost in accessing the invalidated cache lines afterwards, which is
likely going to be much higher than the cost of an indirect branch".
And indeed, there's no performance difference with GMAC and EMMC per
my test on Sipeed Lichee Pi 4A board.

Use riscv_nonstd_cache_ops for THEAD C906/C910 CMO to simplify
the alternative code, and to acchieve Arnd's goal -- "I think
moving the THEAD ops at the same level as all nonstandard operations
makes sense, but I'd still leave CMO as an explicit fast path that
avoids the indirect branch. This seems like the right thing to do both
for readability and for platforms on which the indirect branch has a
noticeable overhead."

To make bisect easy, I use two patches here: patch1 does the conversion
which just mimics current CMO behavior via. riscv_nonstd_cache_ops, I
assume no functionalities changes. patch2 uses T-HEAD PA based CMO
instructions so that we don't need to covert PA to VA.

* b4-shazam-merge:
  riscv: errata: thead: use pa based instructions for CMO
  riscv: errata: thead: use riscv_nonstd_cache_ops for CMO

Link: https://lore.kernel.org/r/20231114143338.2406-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 07:36:30 -08:00
Conor Dooley
07df87c0f8 dt-bindings: riscv: permit numbers in "riscv,isa"
There are some extensions that contain numbers, such as Zve32f, which
are enabled by the "max" cpu type in QEMU.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231208-uncolored-oxidant-5ab37dd3ab84@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 07:36:29 -08:00
Samuel Holland
a452816132 dt-bindings: riscv: cpus: Clarify mmu-type interpretation
The current description implies that only a single address translation
mode is available to the operating system. However, some implementations
support multiple address translation modes, and the operating system is
free to choose between them.

Per the RISC-V privileged specification, Sv48 implementations must also
implement Sv39, and likewise Sv57 implies support for Sv48. This means
it is possible to describe all supported address translation modes using
a single value, by naming the largest supported mode. This appears to
have been the intended usage of the property, so note it explicitly.

Fixes: 4fd669a8c4 ("dt-bindings: riscv: convert cpu binding to json-schema")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231227175739.1453782-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 07:36:28 -08:00
Palmer Dabbelt
951df4eb81 Merge patch series "RISC-V SBI debug console extension support"
Anup Patel <apatel@ventanamicro.com> says:

The SBI v2.0 specification is now frozen. The SBI v2.0 specification defines
SBI debug console (DBCN) extension which replaces the legacy SBI v0.1
functions sbi_console_putchar() and sbi_console_getchar().
(Refer v2.0-rc5 at https://github.com/riscv-non-isa/riscv-sbi-doc/releases)

This series adds support for SBI debug console (DBCN) extension in
Linux RISC-V.

To try these patches with KVM RISC-V, use KVMTOOL from the
riscv_zbx_zicntr_smstateen_condops_v1 branch at:
https://github.com/avpatel/kvmtool.git

* b4-shazam-merge:
  RISC-V: Enable SBI based earlycon support
  tty: Add SBI debug console support to HVC SBI driver
  tty/serial: Add RISC-V SBI debug console based earlycon
  RISC-V: Add SBI debug console helper routines
  RISC-V: Add stubs for sbi_console_putchar/getchar()

Link: https://lore.kernel.org/r/20231124070905.1043092-1-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 07:36:27 -08:00
Andrew Jones
4dc4af9ce3 riscv: sbi: Introduce system suspend support
When the SUSP SBI extension is present it implies that the standard
"suspend to RAM" type is available. Wire it up to the generic
platform suspend support, also applying the already present support
for non-retentive CPU suspend. When the kernel is built with
CONFIG_SUSPEND, one can do 'echo mem > /sys/power/state' to suspend.
Resumption will occur when a platform-specific wake-up event arrives.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231206110807.35882-4-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-11 07:36:26 -08:00