linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-09 07:51:16 -04:00

Author	SHA1	Message	Date
Jakub Kicinski	ea937f7720	net: netdevsim: don't try to destroy PHC on VFs PHC gets initialized in nsim_init_netdevsim(), which is only called if (nsim_dev_port_is_pf()). Create a counterpart of nsim_init_netdevsim() and move the mock_phc_destroy() there. This fixes a crash trying to destroy netdevsim with VFs instantiated, as caught by running the devlink.sh test: BUG: kernel NULL pointer dereference, address: 00000000000000b8 RIP: 0010:mock_phc_destroy+0xd/0x30 Call Trace: <TASK> nsim_destroy+0x4a/0x70 [netdevsim] __nsim_dev_port_del+0x47/0x70 [netdevsim] nsim_dev_reload_destroy+0x105/0x120 [netdevsim] nsim_drv_remove+0x2f/0xb0 [netdevsim] device_release_driver_internal+0x1a1/0x210 bus_remove_device+0xd5/0x120 device_del+0x159/0x490 device_unregister+0x12/0x30 del_device_store+0x11a/0x1a0 [netdevsim] kernfs_fop_write_iter+0x130/0x1d0 vfs_write+0x30b/0x4b0 ksys_write+0x69/0xf0 do_syscall_64+0xcc/0x1e0 entry_SYSCALL_64_after_hwframe+0x6f/0x77 Fixes: `b63e78fca8` ("net: netdevsim: use mock PHC driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-17 10:56:44 +00:00
Paolo Abeni	c0f5aec28e	mptcp: relax check on MPC passive fallback While testing the blamed commit below, I was able to miss (!) packetdrill failures in the fastopen test-cases. On passive fastopen the child socket is created by incoming TCP MPC syn, allow for both MPC_SYN and MPC_ACK header. Fixes: `724b00c129` ("mptcp: refine opt_mp_capable determination") Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-17 10:55:54 +00:00
Romain Gantois	c2945c435c	net: stmmac: Prevent DSA tags from breaking COE Some DSA tagging protocols change the EtherType field in the MAC header e.g. DSA_TAG_PROTO_(DSA/EDSA/BRCM/MTK/RTL4C_A/SJA1105). On TX these tagged frames are ignored by the checksum offload engine and IP header checker of some stmmac cores. On RX, the stmmac driver wrongly assumes that checksums have been computed for these tagged packets, and sets CHECKSUM_UNNECESSARY. Add an additional check in the stmmac TX and RX hotpaths so that COE is deactivated for packets with ethertypes that will not trigger the COE and IP header checks. Fixes: `6b2c6e4a93` ("net: stmmac: propagate feature flags to vlan") Cc: <stable@vger.kernel.org> Reported-by: Richard Tresidder <rtresidd@electromag.com.au> Link: https://lore.kernel.org/netdev/e5c6c75f-2dfa-4e50-a1fb-6bf4cdb617c2@electromag.com.au/ Reported-by: Romain Gantois <romain.gantois@bootlin.com> Link: https://lore.kernel.org/netdev/c57283ed-6b9b-b0e6-ee12-5655c1c54495@bootlin.com/ Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-17 10:03:29 +00:00
Bartosz Golaszewski	efb8235bfd	gpiolib: revert the attempt to protect the GPIO device list with an rwsem This reverts commits `1979a28075` ("gpiolib: replace the GPIO device mutex with a read-write semaphore") and `65a828bab1` ("gpiolib: use a mutex to protect the list of GPIO devices"). Unfortunately the legacy GPIO API that's still used in older code has to translate numbers from the global GPIO numberspace to descriptors. This results in a GPIO device lookup in every call to legacy functions. Some of those functions - like gpio_set/get_value() - can be called from atomic context so taking a sleeping lock that is an RW semaphore results in an error. We'll probably have to protect this list with SRCU. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-wireless/f7b5ff1e-8f34-4d98-a7be-b826cb897dc8@moroto.mountain/ Fixes: `1979a28075` ("gpiolib: replace the GPIO device mutex with a read-write semaphore") Fixes: `65a828bab1` ("gpiolib: use a mutex to protect the list of GPIO devices") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-01-17 09:52:37 +01:00
Tiezhu Yang	6e441fa3ac	MAINTAINERS: Add BPF JIT for LOONGARCH entry After commit `5dc615520c` ("LoongArch: Add BPF JIT support"), there is no BPF JIT for LOONGARCH entry, in order to maintain the current code and the new features timely, just add it. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:13 +08:00
Huacai Chen	fc562925f5	LoongArch: Update Loongson-3 default config file 1, Increase NR_CPUS to 256. 2, Enable some cgroup options. 3, Enable some PREEMPT_DYNAMIC/SCHED_CORE options. 4, Enable some CMA/DMA_CMA options. 5, Enable some F2FS options. 6, Enable some DMABUF/UDMABUF options. 7, Enable some USB4 and NTB options. 8, Enable some networking options (MPTCP). 9, Enable Loongson-specific drivers: APB DMA, ASoC. 10, Enable PCI_HOST_GENERIC and SND_VIRTIO for virtual machine. 11, Remove obsolete SECURITY_SELINUX_DISABLE. 12, Regenerate the whole file to keep the order of options be the same as the latest source code. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:13 +08:00
Hengqi Chen	36a87385e3	LoongArch: BPF: Prevent out-of-bounds memory access The test_tag test triggers an unhandled page fault: # ./test_tag [ 130.640218] CPU 0 Unable to handle kernel paging request at virtual address ffff80001b898004, era == 9000000003137f7c, ra == 9000000003139e70 [ 130.640501] Oops[#3]: [ 130.640553] CPU: 0 PID: 1326 Comm: test_tag Tainted: G D O 6.7.0-rc4-loong-devel-gb62ab1a397cf #47 61985c1d94084daa2432f771daa45b56b10d8d2a [ 130.640764] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 130.640874] pc 9000000003137f7c ra 9000000003139e70 tp 9000000104cb4000 sp 9000000104cb7a40 [ 130.641001] a0 ffff80001b894000 a1 ffff80001b897ff8 a2 000000006ba210be a3 0000000000000000 [ 130.641128] a4 000000006ba210be a5 00000000000000f1 a6 00000000000000b3 a7 0000000000000000 [ 130.641256] t0 0000000000000000 t1 00000000000007f6 t2 0000000000000000 t3 9000000004091b70 [ 130.641387] t4 000000006ba210be t5 0000000000000004 t6 fffffffffffffff0 t7 90000000040913e0 [ 130.641512] t8 0000000000000005 u0 0000000000000dc0 s9 0000000000000009 s0 9000000104cb7ae0 [ 130.641641] s1 00000000000007f6 s2 0000000000000009 s3 0000000000000095 s4 0000000000000000 [ 130.641771] s5 ffff80001b894000 s6 ffff80001b897fb0 s7 9000000004090c50 s8 0000000000000000 [ 130.641900] ra: 9000000003139e70 build_body+0x1fcc/0x4988 [ 130.642007] ERA: 9000000003137f7c build_body+0xd8/0x4988 [ 130.642112] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 130.642261] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 130.642353] EUEN: 00000003 (+FPE +SXE -ASXE -BTE) [ 130.642458] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) [ 130.642554] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 130.642658] BADV: ffff80001b898004 [ 130.642719] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 130.642815] Modules linked in: [last unloaded: bpf_testmod(O)] [ 130.642924] Process test_tag (pid: 1326, threadinfo=00000000f7f4015f, task=000000006499f9fd) [ 130.643062] Stack : 0000000000000000 9000000003380724 0000000000000000 0000000104cb7be8 [ 130.643213] 0000000000000000 25af8d9b6e600558 9000000106250ea0 9000000104cb7ae0 [ 130.643378] 0000000000000000 0000000000000000 9000000104cb7be8 90000000049f6000 [ 130.643538] 0000000000000090 9000000106250ea0 ffff80001b894000 ffff80001b894000 [ 130.643685] 00007ffffb917790 900000000313ca94 0000000000000000 0000000000000000 [ 130.643831] ffff80001b894000 0000000000000ff7 0000000000000000 9000000100468000 [ 130.643983] 0000000000000000 0000000000000000 0000000000000040 25af8d9b6e600558 [ 130.644131] 0000000000000bb7 ffff80001b894048 0000000000000000 0000000000000000 [ 130.644276] 9000000104cb7be8 90000000049f6000 0000000000000090 9000000104cb7bdc [ 130.644423] ffff80001b894000 0000000000000000 00007ffffb917790 90000000032acfb0 [ 130.644572] ... [ 130.644629] Call Trace: [ 130.644641] [<9000000003137f7c>] build_body+0xd8/0x4988 [ 130.644785] [<900000000313ca94>] bpf_int_jit_compile+0x228/0x4ec [ 130.644891] [<90000000032acfb0>] bpf_prog_select_runtime+0x158/0x1b0 [ 130.645003] [<90000000032b3504>] bpf_prog_load+0x760/0xb44 [ 130.645089] [<90000000032b6744>] __sys_bpf+0xbb8/0x2588 [ 130.645175] [<90000000032b8388>] sys_bpf+0x20/0x2c [ 130.645259] [<9000000003f6ab38>] do_syscall+0x7c/0x94 [ 130.645369] [<9000000003121c5c>] handle_syscall+0xbc/0x158 [ 130.645507] [ 130.645539] Code: 380839f6 380831f9 28412bae <24000ca6> 004081ad 0014cb50 004083e8 02bff34c 58008e91 [ 130.645729] [ 130.646418] ---[ end trace 0000000000000000 ]--- On my machine, which has CONFIG_PAGE_SIZE_16KB=y, the test failed at loading a BPF prog with 2039 instructions: prog = (struct bpf_prog )ffff80001b894000 insn = (struct bpf_insn )(prog->insnsi)ffff80001b894048 insn + 2039 = (struct bpf_insn *)ffff80001b898000 <- end of the page In the build_insn() function, we are trying to access next instruction unconditionally, i.e. `(insn + 1)->imm`. The address lies in the next page and can be not owned by the current process, thus an page fault is inevitable and then segfault. So, let's access next instruction only under `dst = imm64` context. With this fix, we have: # ./test_tag test_tag: OK (40945 tests) Fixes: `bbfddb904d` ("LoongArch: BPF: Avoid declare variables in switch-case") Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:13 +08:00
Hengqi Chen	21c5ae5cc1	LoongArch: BPF: Support 64-bit pointers to kfuncs Like commit `1cf3bfc60f` ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier \| grep # \| grep FAIL #119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL #120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL #121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL #122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL #123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL #124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL #125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL #126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL #127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL #128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL #129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL #130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL #486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:13 +08:00
Tiezhu Yang	91af17cd7d	LoongArch: Fix definition of ftrace_regs_set_instruction_pointer() The current definition of ftrace_regs_set_instruction_pointer() is not correct. Obviously, this function is used to set instruction pointer but not return value, so it should call instruction_pointer_set() instead of regs_set_return_value(). There is no side effect by now because it is only used for kernel live- patching which is not supported, so fix it to avoid failure when testing livepatch in the future. Fixes: `6fbff14a63` ("LoongArch: ftrace: Abstract DYNAMIC_FTRACE_WITH_ARGS accesses") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Youling Tang	78de91b458	LoongArch: Use generic interface to support crashkernel=X,[high,low] LoongArch already supports two crashkernel regions in kexec-tools, so we can directly use the common interface to support crashkernel=X,[high,low] after commit `0ab97169aa` ("crash_core: add generic function to do reservation"). With the help of newly changed function parse_crashkernel() and generic reserve_crashkernel_generic(), crashkernel reservation can be simplified by steps: 1) Add a new header file <asm/crash_core.h>, then define CRASH_ALIGN, CRASH_ADDR_LOW_MAX and CRASH_ADDR_HIGH_MAX and in <asm/crash_core.h>; 2) Add arch_reserve_crashkernel() to call parse_crashkernel() and reserve_crashkernel_generic(); 3) Add ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION Kconfig in arch/loongarch/Kconfig. One can reserve the crash kernel from high memory above DMA zone range by explicitly passing "crashkernel=X,high"; or reserve a memory range below 4G with "crashkernel=X,low". Besides, there are few rules need to take notice: 1) "crashkernel=X,[high,low]" will be ignored if "crashkernel=size" is specified. 2) "crashkernel=X,low" is valid only when "crashkernel=X,high" is passed and there is enough memory to be allocated under 4G. 3) When allocating crashkernel above 4G and no "crashkernel=X,low" is specified, a 128M low memory will be allocated automatically for swiotlb bounce buffer. See Documentation/admin-guide/kernel-parameters.txt for more information. Following test cases have been performed as expected: 1) crashkernel=256M //low=256M 2) crashkernel=1G //low=1G 3) crashkernel=4G //high=4G, low=128M(default) 4) crashkernel=4G crashkernel=256M,high //high=4G, low=128M(default), high is ignored 5) crashkernel=4G crashkernel=256M,low //high=4G, low=128M(default), low is ignored 6) crashkernel=4G,high //high=4G, low=128M(default) 7) crashkernel=256M,low //low=0M, invalid 8) crashkernel=4G,high crashkernel=256M,low //high=4G, low=256M 9) crashkernel=4G,high crashkernel=4G,low //high=0M, low=0M, invalid 10) crashkernel=512M@2560M //low=512M 11) crashkernel=1G,high crashkernel=0M,low //high=1G, low=0M Recommended usage in general: 1) In the case of small memory: crashkernel=512M 2) In the case of large memory: crashkernel=1024M,high crashkernel=128M,low Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Xi Ruoyao	c239665130	LoongArch: Fix and simplify fcsr initialization on execve() There has been a lingering bug in LoongArch Linux systems causing some GCC tests to intermittently fail (see Closes link). I've made a minimal reproducer: zsh% cat measure.s .align 4 .globl _start _start: movfcsr2gr $a0, $fcsr0 bstrpick.w $a0, $a0, 16, 16 beqz $a0, .ok break 0 .ok: li.w $a7, 93 syscall 0 zsh% cc mesaure.s -o measure -nostdlib zsh% echo $((1.0/3)) 0.33333333333333331 zsh% while ./measure; do ; done This while loop should not stop as POSIX is clear that execve must set fenv to the default, where FCSR should be zero. But in fact it will just stop after running for a while (normally less than 30 seconds). Note that "$((1.0/3))" is needed to reproduce this issue because it raises FE_INVALID and makes fcsr0 non-zero. The problem is we are currently relying on SET_PERSONALITY2() to reset current->thread.fpu.fcsr. But SET_PERSONALITY2() is executed before start_thread which calls lose_fpu(0). We can see if kernel preempt is enabled, we may switch to another thread after SET_PERSONALITY2() but before lose_fpu(0). Then bad thing happens: during the thread switch the value of the fcsr0 register is stored into current->thread.fpu.fcsr, making it dirty again. The issue can be fixed by setting current->thread.fpu.fcsr after lose_fpu(0) because lose_fpu() clears TIF_USEDFPU, then the thread switch won't touch current->thread.fpu.fcsr. The only other architecture setting FCSR in SET_PERSONALITY2() is MIPS. I've ran a similar test on MIPS with mainline kernel and it turns out MIPS is buggy, too. Anyway MIPS do this for supporting different FP flavors (NaN encodings, etc.) which do not exist on LoongArch. So for LoongArch, we can simply remove the current->thread.fpu.fcsr setting from SET_PERSONALITY2() and do it in start_thread(), after lose_fpu(0). The while loop failing with the mainline kernel has survived one hour after this change on LoongArch. Fixes: `803b0fc5c3` ("LoongArch: Add process management") Closes: https://github.com/loongson-community/discussions/issues/7 Link: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/ Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Huacai Chen	ce68ff3528	LoongArch: Let cores_io_master cover the largest NR_CPUS Now loongson_system_configuration::cores_io_master only covers 64 cpus, if NR_CPUS > 64 there will be memory corruption. So let cores_io_master cover the largest NR_CPUS (256). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Huacai Chen	d23b77953f	LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE LoongArch has hardware page coloring for L1 Cache, so we don't have cache aliases. But SFB (Store Fill Buffer) still has aliases. So we define SHMLBA to SZ_64K previously. But there are losts of applications use PAGE_SIZE rather than SHMLBA to mmap() file pages and shared pages. Of course we can fix them one by one, but not easy. On the other hand, we can simply disable SFB for 4KB page size to fix cache alias (there will be performance decrease, but acceptable), and in future we will fix SFB in hardware. So we can safely define SHMLBA to PAGE_SIZE (use the generic shmparam.h) to make life easier. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Huacai Chen	9499daeade	LoongArch: Add a missing call to efi_esrt_init() ESRT (EFI System Resource Table) is needed for UEFI's "Capsule Update" feature. But ESRT initialization is missing on LoongArch now, so add a call to efi_esrt_init() at the end of efi_init(). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Binbin Zhou	44a01f1f72	LoongArch: Parsing CPU-related information from DTS Generally, we can get cpu-related information, such as model name, from /proc/cpuinfo. For FDT-based systems, we need to parse the relevant information from DTS. BTW, set loongson_sysconf.cores_per_package to num_processors if SMBIOS doesn't provide a valid number (usually FDT-based systems). Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Hongliang Wang <wanghongliang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Binbin Zhou	2905844f68	LoongArch: dts: DeviceTree for Loongson-2K2000 Add DeviceTree file for Loongson-2K2000 processor, which integrates two 64-bit 3-issue superscalar LA364 processor cores. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Binbin Zhou	30a5532a32	LoongArch: dts: DeviceTree for Loongson-2K1000 Add DeviceTree file for Loongson-2K1000 processor, which integrates two 64-bit 2-issue superscalar LA264 processor cores. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:08 +08:00
Binbin Zhou	0f66569c85	LoongArch: dts: DeviceTree for Loongson-2K0500 Add DeviceTree file for Loongson-2K0500 processor, which integrates one 64-bit 2-issue superscalar LA264 processor core. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:07 +08:00
Binbin Zhou	5f346a6e59	LoongArch: Allow device trees be built into the kernel During the upstream progress of those DT-based drivers, DT properties are changed a lot so very different from those in existing bootloaders. It is inevitably that some existing systems do not provide a standard, canonical device tree to the kernel at boot time. So let's provide a device tree table in the kernel, keyed by the dts filename, containing the relevant DTBs. We can use the built-in dts files as references. Each SoC has only one built-in dts file which describes all possible device information of that SoC, so the dts files are good examples during development. And as a reference, our built-in dts file only enables the most basic bootable combinations (so it is generic enough), acts as an alternative in case the dts in the bootloader is unexpected. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:00 +08:00
Binbin Zhou	db8ce24070	dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for interrupt-names The Loongson-2K0500/2K1000 CPUs have 64 interrupt sources as inputs, and a route-mapped node handles up to 32 interrupt sources, so two liointc nodes are defined in dts{i}. Of course, we have to make sure that the routing outputs ("intx") of the two nodes do not conflict, i.e. "int0" can only be used as a routing output for one of them. Therefore, "interrupt-names" should be defined as "pattern". In addition, since "interrupt-names" and "interrupts" are one-to-one correspondence, we pass it to get the corresponding interrupt number in the driver. Setting it to "required" does not break ABI, because it is already logically represented as "required". This fixes dtbs_check warning: DTC_CHK arch/loongarch/boot/dts/loongson-2k0500-ref.dtb arch/loongarch/boot/dts/loongson-2k0500-ref.dtb: interrupt-controller@1fe11440: interrupt-names:0: 'int0' was expected From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml arch/loongarch/boot/dts/loongson-2k0500-ref.dtb: interrupt-controller@1fe11440: Unevaluated properties are not allowed ('interrupt-names' was unexpected) From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml DTC_CHK arch/loongarch/boot/dts/loongson-2k1000-ref.dtb arch/loongarch/boot/dts/loongson-2k1000-ref.dtb: interrupt-controller@1fe01440: interrupt-names:0: 'int0' was expected From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml arch/loongarch/boot/dts/loongson-2k1000-ref.dtb: interrupt-controller@1fe01440: Unevaluated properties are not allowed ('interrupt-names' was unexpected) From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml Acked-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:00 +08:00
Binbin Zhou	aaeebb3ea4	dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for reg-names As we know, the Loongson-2K0500 is a single-core CPU, and the core1- related register (isr1) does not exist. So "reg" and "reg-names" should be set to "minItems 2"(main nad isr0). This fixes dtbs_check warning: DTC_CHK arch/loongarch/boot/dts/loongson-2k0500-ref.dtb arch/loongarch/boot/dts/loongson-2k0500-ref.dtb: interrupt-controller@1fe11400: reg-names: ['main', 'isr0'] is too short From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml arch/loongarch/boot/dts/loongson-2k0500-ref.dtb: interrupt-controller@1fe11400: Unevaluated properties are not allowed ('reg-names' was unexpected) From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml arch/loongarch/boot/dts/loongson-2k0500-ref.dtb: interrupt-controller@1fe11400: reg: [[0, 534844416, 0, 64], [0, 534843456, 0, 8]] is too short From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml arch/loongarch/boot/dts/loongson-2k0500-ref.dtb: interrupt-controller@1fe11440: reg-names: ['main', 'isr0'] is too short From schema: Documentation/devicetree/bindings/interrupt-controller/loongson,liointc.yaml Acked-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:00 +08:00
Binbin Zhou	ec6b36edf0	dt-bindings: loongarch: Add Loongson SoC boards compatibles Add Loongson SoC boards binding with DT schema format using json-schema. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:00 +08:00
Binbin Zhou	8e07e0e396	dt-bindings: loongarch: Add CPU bindings for LoongArch Add the available CPUs in LoongArch binding with DT schema format using json-schema. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:00 +08:00
WANG Rui	90868ff9ca	LoongArch: Enable initial Rust support Enable initial Rust support for LoongArch. Tested-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:00 +08:00
WANG Rui	f58b0abae8	scripts/min-tool-version.sh: Raise minimum clang version to 18.0.0 for loongarch The existing mainline clang development version encounters difficulties compiling the LoongArch kernel module. It is anticipated that this issue will be resolved in the upcoming 18.0.0 release. To prevent user confusion arising from broken builds, it is advisable to raise the minimum required clang version for LoongArch to 18.0.0. Suggested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1941 Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:43:00 +08:00
WANG Xuerui	2772ae4d66	modpost: Ignore relaxation and alignment marker relocs on LoongArch With recent trunk versions of binutils and gcc, alignment directives are represented with R_LARCH_ALIGN relocs on LoongArch, which is necessary for the linker to maintain alignment requirements during its relaxation passes. And even though the kernel is built with relaxation disabled, so far a small number of R_LARCH_RELAX marker relocs are still emitted as part of la.* pseudo instructions in assembly. These two kinds of relocs do not refer to symbols, which can trip up modpost's section mismatch checks, because the r_offset of said relocs can be zero or any other meaningless value, eventually leading to a `from == NULL` condition in default_mismatch_handler and SIGSEGV. As the two kinds of relocs are not concerned with symbols, just ignore them for section mismatch check purposes. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-01-17 12:42:59 +08:00
Nicolas Dichtel	e9ce7ededf	selftests: rtnetlink: use setup_ns in bonding test This is a follow-up of commit `a159cbe81d` ("selftests: rtnetlink: check enslaving iface in a bond") after the merge of net-next into net. The goal is to follow the new convention, see commit `d3b6b11161` ("selftests/net: convert rtnetlink.sh to run it in unique namespace") for more details. Let's use also the generic dummy name instead of defining a new one. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240115135922.3662648-1-nicolas.dichtel@6wind.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-16 17:49:54 -08:00
Russell King (Oracle)	97eb5d51b4	net: sfp-bus: fix SFP mode detect from bitrate The referenced commit moved the setting of the Autoneg and pause bits early in sfp_parse_support(). However, we check whether the modes are empty before using the bitrate to set some modes. Setting these bits so early causes that test to always be false, preventing this working, and thus some modules that used to work no longer do. Move them just before the call to the quirk. Fixes: `8110633db4` ("net: sfp-bus: allow SFP quirks to override Autoneg and pause bits") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://lore.kernel.org/r/E1rPMJW-001Ahf-L0@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-16 17:49:49 -08:00
Kunwu Chan	776dac5a66	net: dsa: vsc73xx: Add null pointer check to vsc73xx_gpio_probe devm_kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Fixes: `05bd97fc55` ("net: dsa: Add Vitesse VSC73xx DSA router driver") Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Suggested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240111072018.75971-1-chentao@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-16 17:48:27 -08:00
Colin Ian King	8ca5d2641b	cifs: remove redundant variable tcon_exist The variable tcon_exist is being assigned however it is never read, the variable is redundant and can be removed. Cleans up clang scan build warning: warning: Although the value stored to 'tcon_exist' is used in the enclosing expression, the value is never actually readfrom 'tcon_exist' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-01-16 17:19:46 -06:00
Erick Archer	1057066009	eventfs: Use kcalloc() instead of kzalloc() As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. So, use the purpose specific kcalloc() function instead of the argument size * count in the kzalloc() function. [1] https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Link: https://lore.kernel.org/linux-trace-kernel/20240115181658.4562-1-erick.archer@gmx.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://github.com/KSPP/linux/issues/162 Signed-off-by: Erick Archer <erick.archer@gmx.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-01-16 17:52:33 -05:00
Steven Rostedt (Google)	852e46e239	eventfs: Do not create dentries nor inodes in iterate_shared The original eventfs code added a wrapper around the dcache_readdir open callback and created all the dentries and inodes at open, and increment their ref count. A wrapper was added around the dcache_readdir release function to decrement all the ref counts of those created inodes and dentries. But this proved to be buggy[1] for when a kprobe was created during a dir read, it would create a dentry between the open and the release, and because the release would decrement all ref counts of all files and directories, that would include the kprobe directory that was not there to have its ref count incremented in open. This would cause the ref count to go to negative and later crash the kernel. To solve this, the dentries and inodes that were created and had their ref count upped in open needed to be saved. That list needed to be passed from the open to the release, so that the release would only decrement the ref counts of the entries that were incremented in the open. Unfortunately, the dcache_readdir logic was already using the file->private_data, which is the only field that can be used to pass information from the open to the release. What was done was the eventfs created another descriptor that had a void pointer to save the dcache_readdir pointer, and it wrapped all the callbacks, so that it could save the list of entries that had their ref counts incremented in the open, and pass it to the release. The wrapped callbacks would just put back the dcache_readdir pointer and call the functions it used so it could still use its data[2]. But Linus had an issue with the "hijacking" of the file->private_data (unfortunately this discussion was on a security list, so no public link). Which we finally agreed on doing everything within the iterate_shared callback and leave the dcache_readdir out of it[3]. All the information needed for the getents() could be created then. But this ended up being buggy too[4]. The iterate_shared callback was not the right place to create the dentries and inodes. Even Christian Brauner had issues with that[5]. An attempt was to go back to creating the inodes and dentries at the open, create an array to store the information in the file->private_data, and pass that information to the other callbacks.[6] The difference between that and the original method, is that it does not use dcache_readdir. It also does not up the ref counts of the dentries and pass them. Instead, it creates an array of a structure that saves the dentry's name and inode number. That information is used in the iterate_shared callback, and the array is freed in the dir release. The dentries and inodes created in the open are not used for the iterate_share or release callbacks. Just their names and inode numbers. Linus did not like that either[7] and just wanted to remove the dentries being created in iterate_shared and use the hard coded inode numbers. [ All this while Linus enjoyed an unexpected vacation during the merge window due to lack of power. ] [1] https://lore.kernel.org/linux-trace-kernel/20230919211804.230edf1e@gandalf.local.home/ [2] https://lore.kernel.org/linux-trace-kernel/20230922163446.1431d4fa@gandalf.local.home/ [3] https://lore.kernel.org/linux-trace-kernel/20240104015435.682218477@goodmis.org/ [4] https://lore.kernel.org/all/202401152142.bfc28861-oliver.sang@intel.com/ [5] https://lore.kernel.org/all/20240111-unzahl-gefegt-433acb8a841d@brauner/ [6] https://lore.kernel.org/all/20240116114711.7e8637be@gandalf.local.home/ [7] https://lore.kernel.org/all/20240116170154.5bf0a250@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20240116211353.573784051@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Fixes: `493ec81a8f` ("eventfs: Stop using dcache_readdir() for getdents()") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401152142.bfc28861-oliver.sang@intel.com Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-01-16 17:48:19 -05:00
Matthew Wilcox (Oracle)	7bed6f3d08	block: Fix iterating over an empty bio with bio_for_each_folio_all If the bio contains no data, bio_first_folio() calls page_folio() on a NULL pointer and oopses. Move the test that we've reached the end of the bio from bio_next_folio() to bio_first_folio(). Reported-by: syzbot+8b23309d5788a79d3eea@syzkaller.appspotmail.com Reported-by: syzbot+004c1e0fced2b4bc3dcc@syzkaller.appspotmail.com Fixes: `640d1930be` ("block: Add bio_for_each_folio_all()") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20240116212959.3413014-1-willy@infradead.org [axboe: add unlikely() to error case] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-01-16 15:02:25 -07:00
Steven Rostedt (Google)	53c41052ba	eventfs: Have the inodes all for files and directories all be the same The dentries and inodes are created in the readdir for the sole purpose of getting a consistent inode number. Linus stated that is unnecessary, and that all inodes can have the same inode number. For a virtual file system they are pretty meaningless. Instead use a single unique inode number for all files and one for all directories. Link: https://lore.kernel.org/all/20240116133753.2808d45e@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20240116211353.412180363@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-01-16 16:27:47 -05:00
Hans de Goede	58f65f9db7	Input: atkbd - use ab83 as id when skipping the getid command Barnabás reported that the change to skip the getid command when the controller is in translated mode on laptops caused the Version field of his "AT Translated Set 2 keyboard" input device to change from ab83 to abba, breaking a custom hwdb entry for this keyboard. Use the standard ab83 id for keyboards when getid is skipped (rather then that getid fails) to avoid reporting a different Version to userspace then before skipping the getid. Fixes: `936e4d49ec` ("Input: atkbd - skip ATKBD_CMD_GETID in translated mode") Reported-by: Barnabás Pőcze <pobrn@protonmail.com> Closes: https://lore.kernel.org/linux-input/W1ydwoG2fYv85Z3C3yfDOJcVpilEvGge6UGa9kZh8zI2-qkHXp7WLnl2hSkFz63j-c7WupUWI5TLL6n7Lt8DjRuU-yJBwLYWrreb1hbnd6A=@protonmail.com/ Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240116204325.7719-1-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2024-01-16 13:16:10 -08:00
Dmitry Antipov	be50df31c4	block: bio-integrity: fix kcalloc() arguments order When compiling with gcc version 14.0.1 20240116 (experimental) and W=1, I've noticed the following warning: block/bio-integrity.c: In function 'bio_integrity_map_user': block/bio-integrity.c:339:38: warning: 'kcalloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Wcalloc-transposed-args] 339 \| bvec = kcalloc(sizeof(*bvec), nr_vecs, GFP_KERNEL); \| ^ block/bio-integrity.c:339:38: note: earlier argument should specify number of elements, later size of each element Since 'n' and 'size' arguments of 'kcalloc()' are multiplied to calculate the final size, their actual order doesn't affect the result and so this is not a bug. But it's still worth to fix it. Fixes: `492c5d4559` ("block: bio-integrity: directly map user buffers") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20240116143437.89060-1-dmantipov@yandex.ru Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-01-16 09:51:22 -07:00
Takashi Iwai	e069642059	Merge tag 'asoc-fix-v6.8-merge-window' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.8 A bunch of small fixes that come in during the merge window, mainly fixing issues from some core refactoring around dummy components that weren't detected until things reached mainline. The TAS driver changes are a little larger than normal for a device ID addition due to some shuffling around of where things are registered and DT updates but aren't really any more substantial than normal.	2024-01-16 17:37:17 +01:00
Hao Sun	33772ff3b8	selftests/bpf: Add test for alu on PTR_TO_FLOW_KEYS Add a test case for PTR_TO_FLOW_KEYS alu. Testing if alu with variable offset on flow_keys is rejected. For the fixed offset success case, we already have C code coverage to verify (e.g. via bpf_flow.c). Signed-off-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240115082028.9992-2-sunhao.th@gmail.com	2024-01-16 17:12:48 +01:00
Hao Sun	22c7fa171a	bpf: Reject variable offset alu on PTR_TO_FLOW_KEYS For PTR_TO_FLOW_KEYS, check_flow_keys_access() only uses fixed off for validation. However, variable offset ptr alu is not prohibited for this ptr kind. So the variable offset is not checked. The following prog is accepted: func#0 @0 0: R1=ctx() R10=fp0 0: (bf) r6 = r1 ; R1=ctx() R6_w=ctx() 1: (79) r7 = (u64 )(r6 +144) ; R6_w=ctx() R7_w=flow_keys() 2: (b7) r8 = 1024 ; R8_w=1024 3: (37) r8 /= 1 ; R8_w=scalar() 4: (57) r8 &= 1024 ; R8_w=scalar(smin=smin32=0, smax=umax=smax32=umax32=1024,var_off=(0x0; 0x400)) 5: (0f) r7 += r8 mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1 mark_precise: frame0: regs=r8 stack= before 4: (57) r8 &= 1024 mark_precise: frame0: regs=r8 stack= before 3: (37) r8 /= 1 mark_precise: frame0: regs=r8 stack= before 2: (b7) r8 = 1024 6: R7_w=flow_keys(smin=smin32=0,smax=umax=smax32=umax32=1024,var_off =(0x0; 0x400)) R8_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=1024, var_off=(0x0; 0x400)) 6: (79) r0 = (u64 )(r7 +0) ; R0_w=scalar() 7: (95) exit This prog loads flow_keys to r7, and adds the variable offset r8 to r7, and finally causes out-of-bounds access: BUG: unable to handle page fault for address: ffffc90014c80038 [...] Call Trace: <TASK> bpf_dispatcher_nop_func include/linux/bpf.h:1231 [inline] __bpf_prog_run include/linux/filter.h:651 [inline] bpf_prog_run include/linux/filter.h:658 [inline] bpf_prog_run_pin_on_cpu include/linux/filter.h:675 [inline] bpf_flow_dissect+0x15f/0x350 net/core/flow_dissector.c:991 bpf_prog_test_run_flow_dissector+0x39d/0x620 net/bpf/test_run.c:1359 bpf_prog_test_run kernel/bpf/syscall.c:4107 [inline] __sys_bpf+0xf8f/0x4560 kernel/bpf/syscall.c:5475 __do_sys_bpf kernel/bpf/syscall.c:5561 [inline] __se_sys_bpf kernel/bpf/syscall.c:5559 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5559 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b Fix this by rejecting ptr alu with variable offset on flow_keys. Applying the patch rejects the program with "R7 pointer arithmetic on flow_keys prohibited". Fixes: `d58e468b11` ("flow_dissector: implements flow dissector BPF hook") Signed-off-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240115082028.9992-1-sunhao.th@gmail.com	2024-01-16 17:12:29 +01:00
Palmer Dabbelt	a894e8ed09	Merge patch series "riscv: support kernel-mode Vector" Andy Chiu <andy.chiu@sifive.com> says: This series provides support running Vector in kernel mode. Additionally, kernel-mode Vector can be configured to run without turnning off preemption on a CONFIG_PREEMPT kernel. Along with the suport, we add Vector optimized copy_{to,from}_user. And provide a simple threshold to decide when to run the vectorized functions. We decided to drop vectorized memcpy/memset/memmove for the moment due to the concern of memory side-effect in kernel_vector_begin(). The detailed description can be found at v9[0] This series is composed by 4 parts: patch 1-4: adds basic support for kernel-mode Vector patch 5: includes vectorized copy_{to,from}_user into the kernel patch 6: refactor context switch code in fpu [1] patch 7-10: provides some code refactors and support for preemptible kernel-mode Vector. This series can be merged if we feel any part of {1~4, 5, 6, 7~10} is mature enough. This patch is tested on a QEMU with V and verified that booting, normal userspace operations all work as usual with thresholds set to 0. Also, we test by launching multiple kernel threads which continuously executes and verifies Vector operations in the background. The module that tests these operation is expected to be upstream later. * b4-shazam-merge: riscv: vector: allow kernel-mode Vector with preemption riscv: vector: use kmem_cache to manage vector context riscv: vector: use a mask to write vstate_ctrl riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}() riscv: fpu: drop SR_SD bit checking riscv: lib: vectorize copy_to_user/copy_from_user riscv: sched: defer restoring Vector context for user riscv: Add vector extension XOR implementation riscv: vector: make Vector always available for softirq context riscv: Add support for kernel mode vector Link: https://lore.kernel.org/r/20240115055929.4736-1-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:14:04 -08:00
Andy Chiu	2080ff9493	riscv: vector: allow kernel-mode Vector with preemption Add kernel_vstate to keep track of kernel-mode Vector registers when trap introduced context switch happens. Also, provide riscv_v_flags to let context save/restore routine track context status. Context tracking happens whenever the core starts its in-kernel Vector executions. An active (dirty) kernel task's V contexts will be saved to memory whenever a trap-introduced context switch happens. Or, when a softirq, which happens to nest on top of it, uses Vector. Context retoring happens when the execution transfer back to the original Kernel context where it first enable preempt_v. Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an option to disable preemptible kernel-mode Vector at build time. Users with constraint memory may want to disable this config as preemptible kernel-mode Vector needs extra space for tracking of per thread's kernel-mode V context. Or, users might as well want to disable it if all kernel-mode Vector code is time sensitive and cannot tolerate context switch overhead. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-11-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:14:02 -08:00
Andy Chiu	bd446f5df5	riscv: vector: use kmem_cache to manage vector context The allocation size of thread.vstate.datap is always riscv_v_vsize. So it is possbile to use kmem_cache_* to manage the allocation. This gives users more information regarding allocation of vector context via /proc/slabinfo. And it potentially reduces the latency of the first-use trap because of the allocation caches. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-10-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:14:01 -08:00
Andy Chiu	5b6048f2ff	riscv: vector: use a mask to write vstate_ctrl riscv_v_ctrl_set() should only touch bits within PR_RISCV_V_VSTATE_CTRL_MASK. So, use the mask when we really set task's vstate_ctrl. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-9-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:14:00 -08:00
Andy Chiu	d6c78f1ca3	riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}() riscv_v_vstate_{save,restore}() can operate only on the knowlege of struct __riscv_v_ext_state, and struct pt_regs. Let the caller decides which should be passed into the function. Meanwhile, the kernel-mode Vector is going to introduce another vstate, so this also makes functions potentially able to be reused. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-8-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:13:59 -08:00
Andy Chiu	a93fdaf183	riscv: fpu: drop SR_SD bit checking SR_SD summarizes the dirty status of FS/VS/XS. However, the current code structure does not fully utilize it because each extension specific code is divided into an individual segment. So remove the SR_SD check for now. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Song Shuai <songshuaishuai@tinylab.org> Reviewed-by: Guo Ren <guoren@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-7-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:13:58 -08:00
Andy Chiu	c2a658d419	riscv: lib: vectorize copy_to_user/copy_from_user This patch utilizes Vector to perform copy_to_user/copy_from_user. If Vector is available and the size of copy is large enough for Vector to perform better than scalar, then direct the kernel to do Vector copies for userspace. Though the best programming practice for users is to reduce the copy, this provides a faster variant when copies are inevitable. The optimal size for using Vector, copy_to_user_thres, is only a heuristic for now. We can add DT parsing if people feel the need of customizing it. The exception fixup code of the __asm_vector_usercopy must fallback to the scalar one because accessing user pages might fault, and must be sleepable. Current kernel-mode Vector does not allow tasks to be preemptible, so we must disactivate Vector and perform a scalar fallback in such case. The original implementation of Vector operations comes from https://github.com/sifive/sifive-libc, which we agree to contribute to Linux kernel. Co-developed-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Co-developed-by: Nick Knight <nick.knight@sifive.com> Signed-off-by: Nick Knight <nick.knight@sifive.com> Suggested-by: Guo Ren <guoren@kernel.org> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-6-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:13:57 -08:00
Andy Chiu	7df56cbc27	riscv: sched: defer restoring Vector context for user User will use its Vector registers only after the kernel really returns to the userspace. So we can delay restoring Vector registers as long as we are still running in kernel mode. So, add a thread flag to indicates the need of restoring Vector and do the restore at the last arch-specific exit-to-user hook. This save the context restoring cost when we switch over multiple processes that run V in kernel mode. For example, if the kernel performs a context swicth from A->B->C, and returns to C's userspace, then there is no need to restore B's V-register. Besides, this also prevents us from repeatedly restoring V context when executing kernel-mode Vector multiple times. The cost of this is that we must disable preemption and mark vector as busy during vstate_{save,restore}. Because then the V context will not get restored back immediately when a trap-causing context switch happens in the middle of vstate_{save,restore}. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-5-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:13:56 -08:00
Greentime Hu	c5674d00ca	riscv: Add vector extension XOR implementation This patch adds support for vector optimized XOR and it is tested in qemu. Co-developed-by: Han-Kuan Chen <hankuan.chen@sifive.com> Signed-off-by: Han-Kuan Chen <hankuan.chen@sifive.com> Signed-off-by: Greentime Hu <greentime.hu@sifive.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-4-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:13:55 -08:00
Andy Chiu	956895b9d8	riscv: vector: make Vector always available for softirq context The goal of this patch is to provide full support of Vector in kernel softirq context. So that some of the crypto alogrithms won't need scalar fallbacks. By disabling bottom halves in active kernel-mode Vector, softirq will not be able to nest on top of any kernel-mode Vector. So, softirq context is able to use Vector whenever it runs. After this patch, Vector context cannot start with irqs disabled. Otherwise local_bh_enable() may run in a wrong context. Disabling bh is not enough for RT-kernel to prevent preeemption. So we must disable preemption, which also implies disabling bh on RT. Related-to: commit `696207d425` ("arm64/sve: Make kernel FPU protection RT friendly") Related-to: commit `66c3ec5a71` ("arm64: neon: Forbid when irqs are disabled") Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-3-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:13:54 -08:00
Greentime Hu	ecd2ada8a5	riscv: Add support for kernel mode vector Add kernel_vector_begin() and kernel_vector_end() function declarations and corresponding definitions in kernel_mode_vector.c These are needed to wrap uses of vector in kernel mode. Co-developed-by: Vincent Chen <vincent.chen@sifive.com> Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Signed-off-by: Greentime Hu <greentime.hu@sifive.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240115055929.4736-2-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-16 07:13:53 -08:00

... 6 7 8 9 10 ...

1248730 Commits