linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-02-27 08:10:36 -05:00

Author	SHA1	Message	Date
Tony Ambardar	c7ad907367	selftests/bpf: Add missing system defines for mips Update get_sys_includes in Makefile with missing MIPS-related definitions to fix many, many compilation errors building selftests/bpf. The following added defines drive conditional logic in system headers for word-size and endianness selection: MIPSEL, MIPSEB _MIPS_SZPTR _MIPS_SZLONG _MIPS_SIM, _ABIO32, _ABIN32, _ABI64 Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/f3cfceaf5299cdd2ac0e0a36072d6ca7be23e603.1721692479.git.tony.ambardar@gmail.com	2024-07-29 15:05:04 -07:00
Ihor Solodrai	a0ef659d03	selftests/bpf: Don't include .d files on make clean Ignore generated %.test.o dependencies when make goal is clean or docs-clean. Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/all/oNTIdax7aWGJdEgabzTqHzF4r-WTERrV1e1cNaPQMp-UhYUQpozXqkbuAlLBulczr6I99-jM5x3dxv56JJowaYBkm765R9Aa9kyrVuCl_kA=@pm.me Link: https://lore.kernel.org/bpf/K69Y8OKMLXBWR0dtOfsC4J46-HxeQfvqoFx1CysCm7u19HRx4MB6yAKOFkM6X-KAx2EFuCcCh_9vYWpsgQXnAer8oQ8PMeDEuiRMYECuGH4=@pm.me	2024-07-29 15:05:04 -07:00
Song Liu	c7db4873fb	selftests/bpf: Add a test for mmap-able map in map Regular BPF hash map is not mmap-able from user space. However, map-in-map with outer map of type BPF_MAP_TYPE_HASH_OF_MAPS and mmap-able array as inner map can perform similar operations as a mmap-able hash map. This can be used by applications that benefit from fast accesses to some local data. Add a selftest to show this use case. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240723051455.1589192-1-song@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 15:05:04 -07:00
Martin KaFai Lau	1edf364a8a	Merge branch 'use network helpers, part 10' Geliang Tang says: ==================== This set is part 10 of series "use network helpers" all BPF selftests wide. Patches 1-3 drop local functions make_client(), make_socket() and inetaddr_len() in sk_lookup.c. Patch 4 drops a useless function __start_server() in network_helpers.c. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 15:05:03 -07:00
Geliang Tang	71a2fbaf9c	selftests/bpf: Drop __start_server in network_helpers The helper start_server_addr() is a wrapper of __start_server(), the only difference between them is __start_server() accepts a sockaddr type address parameter, but start_server_addr() accepts a sockaddr_storage one. This patch drops __start_server(), and updates the callers to invoke start_server_addr() instead. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/31399df7cb957b7c233e79963b0aa0dc4278d273.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 15:05:03 -07:00
Geliang Tang	c3c41e016c	selftests/bpf: Drop inetaddr_len in sk_lookup No need to use a dedicated helper inetaddr_len() to get the length of the IPv4 or IPv6 address, it can be got by make_sockaddr(), this patch drops it. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/32e2a4122921051da38a6e4fbb2ebee5f0af5a4e.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 15:05:03 -07:00
Geliang Tang	01c2f776ed	selftests/bpf: Drop make_socket in sk_lookup This patch uses the public network helers client_socket() + make_sockaddr() in sk_lookup.c to create the client socket, set the timeout sockopts, and make the connecting address. The local defined function make_socket() can be dropped then. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/588771977ac48c27f73526d8421a84b91d7cf218.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 15:05:03 -07:00
Geliang Tang	af994e31b7	selftests/bpf: Drop make_client in sk_lookup This patch uses the new helper connect_to_addr_str() in sk_lookup.c to create the client socket and connect to the server, instead of using local defined function make_client(). This local function can be dropped then. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/058199d7ab46802249dae066ca22c98f6be508ee.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 15:05:03 -07:00
Alexei Starovoitov	aa8ebb270c	selftests/bpf: Workaround strict bpf_lsm return value check. test_progs-no_alu32 -t libbpf_get_fd_by_id_opts is being rejected by the verifier with the following error due to compiler optimization: 6: (67) r0 <<= 62 ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000)) 7: (c7) r0 s>>= 63 ; R0_w=scalar(smin=smin32=-1,smax=smax32=0) ; @ test_libbpf_get_fd_by_id_opts.c:0 8: (57) r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3)) ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 9: (95) exit At program exit the register R0 has smax=9223372036854775795 should have been in [-4095, 0] Workaround by adding barrier(). Eventually the verifier will be able to recognize it. Fixes: `5d99e198be` ("bpf, lsm: Add check for BPF LSM return value") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 15:04:55 -07:00
Alexei Starovoitov	02d9fe1c4d	Merge branch 'add-bpf-lsm-return-value-range-check-bpf-part' Xu Kuohai says: ==================== Add BPF LSM return value range check, BPF part From: Xu Kuohai <xukuohai@huawei.com> LSM BPF prog may make kernel panic when returning an unexpected value, such as returning positive value on hook file_alloc_security. To fix it, series [1] refactored LSM hook return values and added BPF return value check on top of that. Since the refactoring of LSM hooks and checking BPF prog return value patches is not closely related, this series separates BPF-related patches from [1]. v2: - Update Shung-Hsi's patch with [3] v1: https://lore.kernel.org/bpf/20240719081749.769748-1-xukuohai@huaweicloud.com/ Changes to [1]: 1. Extend LSM disabled list to include hooks refactored in [1] to avoid dependency on the hooks return value refactoring patches. 2. Replace the special case patch for bitwise AND on [-1, 0] with Shung-Hsi's general bitwise AND improvement patch [2]. 3. Remove unused patches. [1] https://lore.kernel.org/bpf/20240711111908.3817636-1-xukuohai@huaweicloud.com https://lore.kernel.org/bpf/20240711113828.3818398-1-xukuohai@huaweicloud.com [2] https://lore.kernel.org/bpf/ykuhustu7vt2ilwhl32kj655xfdgdlm2xkl5rff6tw2ycksovp@ss2n4gpjysnw [3] https://lore.kernel.org/bpf/20240719081702.137173-1-shung-hsi.yu@suse.com/ Shung-Hsi Yu (1): bpf, verifier: improve signed ranges inference for BPF_AND ==================== Link: https://lore.kernel.org/r/20240719110059.797546-1-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:48 -07:00
Xu Kuohai	04d8243b1f	selftests/bpf: Add verifier tests for bpf lsm Add verifier tests to check bpf lsm return values and disabled hooks. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-10-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:45 -07:00
Xu Kuohai	d463dd9c9a	selftests/bpf: Add test for lsm tail call Add test for lsm tail call to ensure tail call can only be used between bpf lsm progs attached to the same hook. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-9-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:41 -07:00
Xu Kuohai	2b23b6c0f0	selftests/bpf: Add return value checks for failed tests The return ranges of some bpf lsm test progs can not be deduced by the verifier accurately. To avoid erroneous rejections, add explicit return value checks for these progs. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-8-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:37 -07:00
Xu Kuohai	4dc7556490	selftests/bpf: Avoid load failure for token_lsm.c The compiler optimized the two bpf progs in token_lsm.c to make return value from the bool variable in the "return -1" path, causing an unexpected rejection: 0: R1=ctx() R10=fp0 ; int BPF_PROG(bpf_token_capable, struct bpf_token token, int cap) @ bpf_lsm.c:17 0: (b7) r6 = 0 ; R6_w=0 ; if (my_pid == 0 \|\| my_pid != (bpf_get_current_pid_tgid() >> 32)) @ bpf_lsm.c:19 1: (18) r1 = 0xffffc9000102a000 ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5) 3: (61) r7 = (u32 )(r1 +0) ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5) R7_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 4: (15) if r7 == 0x0 goto pc+11 ; R7_w=scalar(smin=umin=umin32=1,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 5: (67) r7 <<= 32 ; R7_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) 6: (c7) r7 s>>= 32 ; R7_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) 7: (85) call bpf_get_current_pid_tgid#14 ; R0=scalar() 8: (77) r0 >>= 32 ; R0_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 9: (5d) if r0 != r7 goto pc+6 ; R0_w=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff)) R7=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff)) ; if (reject_capable) @ bpf_lsm.c:21 10: (18) r1 = 0xffffc9000102a004 ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5,off=4) 12: (71) r6 = (u8 )(r1 +0) ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5,off=4) R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) ; @ bpf_lsm.c:0 13: (87) r6 = -r6 ; R6_w=scalar() 14: (67) r6 <<= 56 ; R6_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) 15: (c7) r6 s>>= 56 ; R6_w=scalar(smin=smin32=-128,smax=smax32=127) ; int BPF_PROG(bpf_token_capable, struct bpf_token token, int cap) @ bpf_lsm.c:17 16: (bf) r0 = r6 ; R0_w=scalar(id=1,smin=smin32=-128,smax=smax32=127) R6_w=scalar(id=1,smin=smin32=-128,smax=smax32=127) 17: (95) exit At program exit the register R0 has smin=-128 smax=127 should have been in [-4095, 0] To avoid this failure, change the variable type from bool to int. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-7-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:34 -07:00
Xu Kuohai	763aa759d3	bpf: Fix compare error in function retval_range_within After checking lsm hook return range in verifier, the test case "test_progs -t test_lsm" failed, and the failure log says: libbpf: prog 'test_int_hook': BPF program load failed: Invalid argument libbpf: prog 'test_int_hook': -- BEGIN PROG LOAD LOG -- 0: R1=ctx() R10=fp0 ; int BPF_PROG(test_int_hook, struct vm_area_struct vma, @ lsm.c:89 0: (79) r0 = (u64 )(r1 +24) ; R0_w=scalar(smin=smin32=-4095,smax=smax32=0) R1=ctx() [...] 24: (b4) w0 = -1 ; R0_w=0xffffffff ; int BPF_PROG(test_int_hook, struct vm_area_struct vma, @ lsm.c:89 25: (95) exit At program exit the register R0 has smin=4294967295 smax=4294967295 should have been in [-4095, 0] It can be seen that instruction "w0 = -1" zero extended -1 to 64-bit register r0, setting both smin and smax values of r0 to 4294967295. This resulted in a false reject when r0 was checked with range [-4095, 0]. Given bpf lsm does not return 64-bit values, this patch fixes it by changing the compare between r0 and return range from 64-bit operation to 32-bit operation for bpf lsm. Fixes: `8fa4ecd49b` ("bpf: enforce exact retval range on subprog/callback exit") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20240719110059.797546-5-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:29 -07:00
Xu Kuohai	28ead3eaab	bpf: Prevent tail call between progs attached to different hooks bpf progs can be attached to kernel functions, and the attached functions can take different parameters or return different return values. If prog attached to one kernel function tail calls prog attached to another kernel function, the ctx access or return value verification could be bypassed. For example, if prog1 is attached to func1 which takes only 1 parameter and prog2 is attached to func2 which takes two parameters. Since verifier assumes the bpf ctx passed to prog2 is constructed based on func2's prototype, verifier allows prog2 to access the second parameter from the bpf ctx passed to it. The problem is that verifier does not prevent prog1 from passing its bpf ctx to prog2 via tail call. In this case, the bpf ctx passed to prog2 is constructed from func1 instead of func2, that is, the assumption for ctx access verification is bypassed. Another example, if BPF LSM prog1 is attached to hook file_alloc_security, and BPF LSM prog2 is attached to hook bpf_lsm_audit_rule_known. Verifier knows the return value rules for these two hooks, e.g. it is legal for bpf_lsm_audit_rule_known to return positive number 1, and it is illegal for file_alloc_security to return positive number. So verifier allows prog2 to return positive number 1, but does not allow prog1 to return positive number. The problem is that verifier does not prevent prog1 from calling prog2 via tail call. In this case, prog2's return value 1 will be used as the return value for prog1's hook file_alloc_security. That is, the return value rule is bypassed. This patch adds restriction for tail call to prevent such bypasses. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-4-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:26 -07:00
Xu Kuohai	5d99e198be	bpf, lsm: Add check for BPF LSM return value A bpf prog returning a positive number attached to file_alloc_security hook makes kernel panic. This happens because file system can not filter out the positive number returned by the LSM prog using IS_ERR, and misinterprets this positive number as a file pointer. Given that hook file_alloc_security never returned positive number before the introduction of BPF LSM, and other BPF LSM hooks may encounter similar issues, this patch adds LSM return value check in verifier, to ensure no unexpected value is returned. Fixes: `520b7aa00d` ("bpf: lsm: Initialize the BPF LSM hooks") Reported-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240719110059.797546-3-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:22 -07:00
Xu Kuohai	21c7063f6d	bpf, lsm: Add disabled BPF LSM hook list Add a disabled hooks list for BPF LSM. progs being attached to the listed hooks will be rejected by the verifier. Suggested-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-2-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:18 -07:00
Alexei Starovoitov	e2854bc373	Merge branch 'bpf-retire-the-unsupported_ops-usage-in-struct_ops' Martin KaFai Lau says: ==================== bpf: Retire the unsupported_ops usage in struct_ops From: Martin KaFai Lau <martin.lau@kernel.org> This series retires the unsupported_ops usage and depends on the null-ness check on the cfi_stubs instead. Please see individual patches for details. v2: - Fixed a gcc compiler warning on Patch 1. ==================== Link: https://lore.kernel.org/r/20240722183049.2254692-1-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:14 -07:00
Martin KaFai Lau	4009c95fed	selftests/bpf: Ensure the unsupported struct_ops prog cannot be loaded There is an existing "bpf_tcp_ca/unsupp_cong_op" test to ensure the unsupported tcp-cc "get_info" struct_ops prog cannot be loaded. This patch adds a new test in the bpf_testmod such that the unsupported ops test does not depend on other kernel subsystem where its supporting ops may be changed in the future. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240722183049.2254692-4-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:09:10 -07:00
Martin KaFai Lau	e44b4fc40c	selftests/bpf: Fix the missing tramp_1 to tramp_40 ops in cfi_stubs The tramp_1 to tramp_40 ops is not set in the cfi_stubs in the bpf_testmod_ops. It fails the struct_ops_multi_pages test after retiring the unsupported_ops in the earlier patch. This patch initializes them in a loop during the bpf_testmod_init(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240722183049.2254692-3-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 13:08:56 -07:00
Martin KaFai Lau	e42ac14180	bpf: Check unsupported ops from the bpf_struct_ops's cfi_stubs The bpf_tcp_ca struct_ops currently uses a "u32 unsupported_ops[]" array to track which ops is not supported. After cfi_stubs had been added, the function pointer in cfi_stubs is also NULL for the unsupported ops. Thus, the "u32 unsupported_ops[]" becomes redundant. This observation was originally brought up in the bpf/cfi discussion: https://lore.kernel.org/bpf/CAADnVQJoEkdjyCEJRPASjBw1QGsKYrF33QdMGc1RZa9b88bAEA@mail.gmail.com/ The recent bpf qdisc patch (https://lore.kernel.org/bpf/20240714175130.4051012-6-amery.hung@bytedance.com/) also needs to specify quite many unsupported ops. It is a good time to clean it up. This patch removes the need of "u32 unsupported_ops[]" and tests for null-ness in the cfi_stubs instead. Testing the cfi_stubs is done in a new function bpf_struct_ops_supported(). The verifier will call bpf_struct_ops_supported() when loading the struct_ops program. The ".check_member" is removed from the bpf_tcp_ca in this patch. ".check_member" could still be useful for other subsytems to enforce other restrictions (e.g. sched_ext checks for prog->sleepable). To keep the same error return, ENOTSUPP is used. Cc: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240722183049.2254692-2-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:54:13 -07:00
Tao Chen	0d7c06125c	bpftool: Add document for net attach/detach on tcx subcommand This commit adds sample output for net attach/detach on tcx subcommand. Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721144252.96264-1-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:54:08 -07:00
Tao Chen	4f88dde0e1	bpftool: Add bash-completion for tcx subcommand This commit adds bash-completion for attaching tcx program on interface. Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721144238.96246-1-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:54:04 -07:00
Tao Chen	3b9d4fee8a	bpftool: Add net attach/detach command to tcx prog Now, attach/detach tcx prog supported in libbpf, so we can add new command 'bpftool attach/detach tcx' to attach tcx prog with bpftool for user. # bpftool prog load tc_prog.bpf.o /sys/fs/bpf/tc_prog # bpftool prog show ... 192: sched_cls name tc_prog tag 187aeb611ad00cfc gpl loaded_at 2024-07-11T15:58:16+0800 uid 0 xlated 152B jited 97B memlock 4096B map_ids 100,99,97 btf_id 260 # bpftool net attach tcx_ingress name tc_prog dev lo # bpftool net ... tc: lo(1) tcx/ingress tc_prog prog_id 29 # bpftool net detach tcx_ingress dev lo # bpftool net ... tc: # bpftool net attach tcx_ingress name tc_prog dev lo # bpftool net tc: lo(1) tcx/ingress tc_prog prog_id 29 Test environment: ubuntu_22_04, 6.7.0-060700-generic Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721144221.96228-1-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:54:00 -07:00
Tao Chen	b7264f87f7	bpftool: Refactor xdp attach/detach type judgment This commit no logical changed, just increases code readability and facilitates TCX prog expansion, which will be implemented in the next patch. Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721143353.95980-2-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:53:50 -07:00
Alexei Starovoitov	81a0b95432	Merge branch 'bpf-fix-tailcall-hierarchy' Leon Hwang says: ==================== bpf: Fix tailcall hierarchy This patchset fixes a tailcall hierarchy issue. The issue is confirmed in the discussions of "bpf, x64: Fix tailcall infinite loop" [0]. The issue has been resolved on both x86_64 and arm64 [1]. I provide a long commit message in the "bpf, x64: Fix tailcall hierarchy" patch to describe how the issue happens and how this patchset resolves the issue in details. How does this patchset resolve the issue? In short, it stores tail_call_cnt on the stack of main prog, and propagates tail_call_cnt_ptr to its subprogs. First, at the prologue of main prog, it initializes tail_call_cnt and prepares tail_call_cnt_ptr. And at the prologue of subprog, it reuses the tail_call_cnt_ptr from caller. Then, when a tailcall happens, it increments tail_call_cnt by its pointer. v5 -> v6: * Address comments from Eduard: * Add JITed dumping along annotating comments * Rewrite two selftests with RUN_TESTS macro. v4 -> v5: * Solution changes from tailcall run ctx to tail_call_cnt and its pointer. It's because v4 solution is unable to handle the case that there is no tailcall in subprog but there is tailcall in EXT prog which attaches to the subprog. v3 -> v4: * Solution changes from per-task tail_call_cnt to tailcall run ctx. As for per-cpu/per-task solution, there is a case it is unable to handle [2]. v2 -> v3: * Solution changes from percpu tail_call_cnt to tail_call_cnt at task_struct. v1 -> v2: * Solution changes from extra run-time call insn to percpu tail_call_cnt. * Address comments from Alexei: * Use percpu tail_call_cnt. * Use asm to make sure no callee saved registers are touched. RFC v2 -> v1: * Solution changes from propagating tail_call_cnt with its pointer to extra run-time call insn. * Address comments from Maciej: * Replace all memcpy(prog, x86_nops[5], X86_PATCH_SIZE) with emit_nops(&prog, X86_PATCH_SIZE) RFC v1 -> RFC v2: * Address comments from Stanislav: * Separate moving emit_nops() as first patch. Links: [0] https://lore.kernel.org/bpf/6203dd01-789d-f02c-5293-def4c1b18aef@gmail.com/ [1] https://github.com/kernel-patches/bpf/pull/7350/checks [2] https://lore.kernel.org/bpf/CAADnVQK1qF+uBjwom2s2W-yEmgd_3rGi5Nr+KiV3cW0T+UPPfA@mail.gmail.com/ ==================== Link: https://lore.kernel.org/r/20240714123902.32305-1-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:53:46 -07:00
Leon Hwang	b83b936f3e	selftests/bpf: Add testcases for tailcall hierarchy fixing Add some test cases to confirm the tailcall hierarchy issue has been fixed. On x64, the selftests result is: cd tools/testing/selftests/bpf && ./test_progs -t tailcalls 327/18 tailcalls/tailcall_bpf2bpf_hierarchy_1:OK 327/19 tailcalls/tailcall_bpf2bpf_hierarchy_fentry:OK 327/20 tailcalls/tailcall_bpf2bpf_hierarchy_fexit:OK 327/21 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_fexit:OK 327/22 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_entry:OK 327/23 tailcalls/tailcall_bpf2bpf_hierarchy_2:OK 327/24 tailcalls/tailcall_bpf2bpf_hierarchy_3:OK 327 tailcalls:OK Summary: 1/24 PASSED, 0 SKIPPED, 0 FAILED On arm64, the selftests result is: cd tools/testing/selftests/bpf && ./test_progs -t tailcalls 327/18 tailcalls/tailcall_bpf2bpf_hierarchy_1:OK 327/19 tailcalls/tailcall_bpf2bpf_hierarchy_fentry:OK 327/20 tailcalls/tailcall_bpf2bpf_hierarchy_fexit:OK 327/21 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_fexit:OK 327/22 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_entry:OK 327/23 tailcalls/tailcall_bpf2bpf_hierarchy_2:OK 327/24 tailcalls/tailcall_bpf2bpf_hierarchy_3:OK 327 tailcalls:OK Summary: 1/24 PASSED, 0 SKIPPED, 0 FAILED Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20240714123902.32305-4-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:53:42 -07:00
Leon Hwang	66ff4d61dc	bpf, arm64: Fix tailcall hierarchy This patch fixes a tailcall issue caused by abusing the tailcall in bpf2bpf feature on arm64 like the way of "bpf, x64: Fix tailcall hierarchy". On arm64, when a tail call happens, it uses tail_call_cnt_ptr to increment tail_call_cnt, too. At the prologue of main prog, it has to initialize tail_call_cnt and prepare tail_call_cnt_ptr. At the prologue of subprog, it pushes x26 register twice, and does not initialize tail_call_cnt. At the epilogue, it pops x26 twice, no matter whether it is main prog or subprog. Fixes: `d4609a5d8c` ("bpf, arm64: Keep tail call count across bpf2bpf calls") Acked-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20240714123902.32305-3-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:53:38 -07:00
Leon Hwang	116e04ba14	bpf, x64: Fix tailcall hierarchy This patch fixes a tailcall issue caused by abusing the tailcall in bpf2bpf feature. As we know, tail_call_cnt propagates by rax from caller to callee when to call subprog in tailcall context. But, like the following example, MAX_TAIL_CALL_CNT won't work because of missing tail_call_cnt back-propagation from callee to caller. \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> \#include "bpf_legacy.h" struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail1(struct __sk_buff skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } static __noinline int subprog_tail2(struct __sk_buff skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("tc") int entry(struct __sk_buff skb) { volatile int ret = 1; count++; subprog_tail1(skb); subprog_tail2(skb); return ret; } char __license[] SEC("license") = "GPL"; At run time, the tail_call_cnt in entry() will be propagated to subprog_tail1() and subprog_tail2(). But, when the tail_call_cnt in subprog_tail1() updates when bpf_tail_call_static(), the tail_call_cnt in entry() won't be updated at the same time. As a result, in entry(), when tail_call_cnt in entry() is less than MAX_TAIL_CALL_CNT and subprog_tail1() returns because of MAX_TAIL_CALL_CNT limit, bpf_tail_call_static() in suprog_tail2() is able to run because the tail_call_cnt in subprog_tail2() propagated from entry() is less than MAX_TAIL_CALL_CNT. So, how many tailcalls are there for this case if no error happens? From top-down view, does it look like hierarchy layer and layer? With this view, there will be 2+4+8+...+2^33 = 2^34 - 2 = 17,179,869,182 tailcalls for this case. How about there are N subprog_tail() in entry()? There will be almost N^34 tailcalls. Then, in this patch, it resolves this case on x86_64. In stead of propagating tail_call_cnt from caller to callee, it propagates its pointer, tail_call_cnt_ptr, tcc_ptr for short. However, where does it store tail_call_cnt? It stores tail_call_cnt on the stack of main prog. When tail call happens in subprog, it increments tail_call_cnt by tcc_ptr. Meanwhile, it stores tail_call_cnt_ptr on the stack of main prog, too. And, before jump to tail callee, it has to pop tail_call_cnt and tail_call_cnt_ptr. Then, at the prologue of subprog, it must not make rax as tail_call_cnt_ptr again. It has to reuse tail_call_cnt_ptr from caller. As a result, at run time, it has to recognize rax is tail_call_cnt or tail_call_cnt_ptr at prologue by: 1. rax is tail_call_cnt if rax is <= MAX_TAIL_CALL_CNT. 2. rax is tail_call_cnt_ptr if rax is > MAX_TAIL_CALL_CNT, because a pointer won't be <= MAX_TAIL_CALL_CNT. Here's an example to dump JITed. struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("tc") int entry(struct __sk_buff skb) { int ret = 1; count++; subprog_tail(skb); subprog_tail(skb); return ret; } When bpftool p d j id 42: int entry(struct __sk_buff skb): bpf_prog_0c0f4c2413ef19b1_entry: ; int entry(struct __sk_buff skb) 0: endbr64 4: nopl (%rax,%rax) 9: xorq %rax, %rax ;; rax = 0 (tail_call_cnt) c: pushq %rbp d: movq %rsp, %rbp 10: endbr64 14: cmpq $33, %rax ;; if rax > 33, rax = tcc_ptr 18: ja 0x20 ;; if rax > 33 goto 0x20 ---+ 1a: pushq %rax ;; [rbp - 8] = rax = 0 \| 1b: movq %rsp, %rax ;; rax = rbp - 8 \| 1e: jmp 0x21 ;; ---------+ \| 20: pushq %rax ;; <--------\|---------------+ 21: pushq %rax ;; <--------+ [rbp - 16] = rax 22: pushq %rbx ;; callee saved 23: movq %rdi, %rbx ;; rbx = skb (callee saved) ; count++; 26: movabsq $-82417199407104, %rdi 30: movl (%rdi), %esi 33: addl $1, %esi 36: movl %esi, (%rdi) ; subprog_tail(skb); 39: movq %rbx, %rdi ;; rdi = skb 3c: movq -16(%rbp), %rax ;; rax = tcc_ptr 43: callq 0x80 ;; call subprog_tail() ; subprog_tail(skb); 48: movq %rbx, %rdi ;; rdi = skb 4b: movq -16(%rbp), %rax ;; rax = tcc_ptr 52: callq 0x80 ;; call subprog_tail() ; return ret; 57: movl $1, %eax 5c: popq %rbx 5d: leave 5e: retq int subprog_tail(struct __sk_buff skb): bpf_prog_3a140cef239a4b4f_subprog_tail: ; int subprog_tail(struct __sk_buff skb) 0: endbr64 4: nopl (%rax,%rax) 9: nopl (%rax) ;; do not touch tail_call_cnt c: pushq %rbp d: movq %rsp, %rbp 10: endbr64 14: pushq %rax ;; [rbp - 8] = rax (tcc_ptr) 15: pushq %rax ;; [rbp - 16] = rax (tcc_ptr) 16: pushq %rbx ;; callee saved 17: pushq %r13 ;; callee saved 19: movq %rdi, %rbx ;; rbx = skb ; asm volatile("r1 = %[ctx]\n\t" 1c: movabsq $-105487587488768, %r13 ;; r13 = jmp_table 26: movq %rbx, %rdi ;; 1st arg, skb 29: movq %r13, %rsi ;; 2nd arg, jmp_table 2c: xorl %edx, %edx ;; 3rd arg, index = 0 2e: movq -16(%rbp), %rax ;; rax = [rbp - 16] (tcc_ptr) 35: cmpq $33, (%rax) 39: jae 0x4e ;; if tcc_ptr >= 33 goto 0x4e --------+ 3b: jmp 0x4e ;; jmp bypass, toggled by poking \| 40: addq $1, (%rax) ;; (*tcc_ptr)++ \| 44: popq %r13 ;; callee saved \| 46: popq %rbx ;; callee saved \| 47: popq %rax ;; undo rbp-16 push \| 48: popq %rax ;; undo rbp-8 push \| 49: nopl (%rax,%rax) ;; tail call target, toggled by poking \| ; return 0; ;; \| 4e: popq %r13 ;; restore callee saved <--------------+ 50: popq %rbx ;; restore callee saved 51: leave 52: retq Furthermore, when trampoline is the caller of bpf prog, which is tail_call_reachable, it is required to propagate rax through trampoline. Fixes: `ebf7d1f508` ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Fixes: `e411901c0b` ("bpf: allow for tailcalls in BPF subprograms for x64 JIT") Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20240714123902.32305-2-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:53:31 -07:00
Andrii Nakryiko	bde0c5a737	Merge branch 'bpf-track-find_equal_scalars-history-on-per-instruction-level' Eduard Zingerman says: ==================== bpf: track find_equal_scalars history on per-instruction level This is a fix for precision tracking bug reported in [0]. It supersedes my previous attempt to fix similar issue in commit [1]. Here is a minimized test case from [0]: 0: call bpf_get_prandom_u32; 1: r7 = r0; 2: r8 = r0; 3: call bpf_get_prandom_u32; 4: if r0 > 1 goto +0; /* --- checkpoint #1: r7.id=1, r8.id=1 --- / 5: if r8 >= r0 goto 9f; 6: r8 += r8; / --- checkpoint #2: r7.id=1, r8.id=0 --- / 7: if r7 == 0 goto 9f; 8: r0 /= 0; / --- checkpoint #3 --- / 9: r0 = 42; 10: exit; W/o this fix verifier incorrectly assumes that instruction at label (8) is unreachable. The issue is caused by failure to infer precision mark for r0 at checkpoint #1: - first verification path is: - (0-4): r0 range [0,1]; - (5): r8 range [0,0], propagated to r7; - (6): r8.id is reset; - (7): jump is predicted to happen; - (9-10): safe exit. - when jump at (7) is predicted mark_chain_precision() for r7 is called and backtrack_insn() proceeds as follows: - at (7) r7 is marked as precise; - at (5) r8 is not currently tracked and thus r0 is not marked; - at (4-5) boundary logic from [1] is triggered and r7,r8 are marked as precise; - => r0 precision mark is missed. - when second branch of (4) is considered, verifier prunes the state because r0 is not marked as precise in the visited state. Basically, backtracking logic fails to notice that at (5) range information is gained for both r7 and r8, and thus both r8 and r0 have to be marked as precise. This happens because [1] can only account for such range transfers at parent/child state boundaries. The solution suggested by Andrii Nakryiko in [0] is to use jump history to remember which registers gained range as a result of find_equal_scalars() [renamed to sync_linked_regs()] and use this information in backtrack_insn(). Which is what this patch-set does. The patch-set uses u64 value as a vector of 10-bit values that identify registers gaining range in find_equal_scalars(). This amounts to maximum of 6 possible values. To check if such capacity is sufficient I've instrumented kernel to track a histogram for maximal amount of registers that gain range in find_equal_scalars per program verification [2]. Measurements done for verifier selftests and Cilium bpf object files from [3] show that number of such registers is always* <= 4 and in 98% of cases it is <= 2. When tested on a subset of selftests identified by selftests/bpf/veristat.cfg and Cilium bpf object files from [3] this patch-set has minimal verification performance impact: File Program Insns (DIFF) States (DIFF) ------------------------ ------------------------ -------------- ------------- bpf_host.o tail_handle_nat_fwd_ipv4 -75 (-0.61%) -3 (-0.39%) pyperf600_nounroll.bpf.o on_event +1673 (+0.33%) +3 (+0.01%) [0] https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/ [1] commit `904e6ddf41` ("bpf: Use scalar ids in mark_chain_precision()") [2] https://github.com/eddyz87/bpf/tree/find-equal-scalars-in-jump-history-with-stats [3] https://github.com/anakryiko/cilium Changes: - v2 -> v3: A number of stylistic changes suggested by Andrii: - renamings: - struct reg_or_spill -> linked_reg; - find_equal_scalars() -> collect_linked_regs; - copy_known_reg() -> sync_linked_regs; - collect_linked_regs() now returns linked regs set of size 2 or larger; - dropped usage of bit fields in struct linked_reg; - added a patch changing references to find_equal_scalars() in selftests comments. - v1 -> v2: - patch "bpf: replace env->cur_hist_ent with a getter function" is dropped (Andrii); - added structure linked_regs and helper functions to [de]serialize u64 value as such structure (Andrii); - bt_set_equal_scalars() renamed to bt_sync_linked_regs(), moved to start and end of backtrack_insn() in order to untie linked register logic from conditional jumps backtracking. Andrii requested a more radical change of moving linked registers processing to bt_set_xxx() functions, I did an experiment in this direction: https://github.com/eddyz87/bpf/tree/find-equal-scalars-in-jump-history--linked-regs-in-bt-set-reg the end result of the experiment seems much uglier than version presented in v2. Revisions: - v1: https://lore.kernel.org/bpf/20240222005005.31784-1-eddyz87@gmail.com/ - v2: https://lore.kernel.org/bpf/20240705205851.2635794-1-eddyz87@gmail.com/ ==================== Link: https://lore.kernel.org/r/20240718202357.1746514-1-eddyz87@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:53:27 -07:00
Eduard Zingerman	cfbf25481d	selftests/bpf: Update comments find_equal_scalars->sync_linked_regs find_equal_scalars() is renamed to sync_linked_regs(), this commit updates existing references in the selftests comments. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-5-eddyz87@gmail.com	2024-07-29 12:53:24 -07:00
Eduard Zingerman	bebc17b1c0	selftests/bpf: Tests for per-insn sync_linked_regs() precision tracking Add a few test cases to verify precision tracking for scalars gaining range because of sync_linked_regs(): - check what happens when more than 6 registers might gain range in sync_linked_regs(); - check if precision is propagated correctly when operand of conditional jump gained range in sync_linked_regs() and one of linked registers is marked precise; - check if precision is propagated correctly when operand of conditional jump gained range in sync_linked_regs() and a other-linked operand of the conditional jump is marked precise; - add a minimized reproducer for precision tracking bug reported in [0]; - Check that mark_chain_precision() for one of the conditional jump operands does not trigger equal scalars precision propagation. [0] https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-4-eddyz87@gmail.com	2024-07-29 12:53:17 -07:00
Eduard Zingerman	842edb5507	bpf: Remove mark_precise_scalar_ids() Function mark_precise_scalar_ids() is superseded by bt_sync_linked_regs() and equal scalars tracking in jump history. mark_precise_scalar_ids() propagates precision over registers sharing same ID on parent/child state boundaries, while jump history records allow bt_sync_linked_regs() to propagate same information with instruction level granularity, which is strictly more precise. This commit removes mark_precise_scalar_ids() and updates test cases in progs/verifier_scalar_ids to reflect new verifier behavior. The tests are updated in the following manner: - mark_precise_scalar_ids() propagated precision regardless of presence of conditional jumps, while new jump history based logic only kicks in when conditional jumps are present. Hence test cases are augmented with conditional jumps to still trigger precision propagation. - As equal scalars tracking no longer relies on parent/child state boundaries some test cases are no longer interesting, such test cases are removed, namely: - precision_same_state and precision_cross_state are superseded by linked_regs_bpf_k; - precision_same_state_broken_link and equal_scalars_broken_link are superseded by linked_regs_broken_link. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-3-eddyz87@gmail.com	2024-07-29 12:53:14 -07:00
Eduard Zingerman	4bf79f9be4	bpf: Track equal scalars history on per-instruction level Use bpf_verifier_state->jmp_history to track which registers were updated by find_equal_scalars() (renamed to collect_linked_regs()) when conditional jump was verified. Use recorded information in backtrack_insn() to propagate precision. E.g. for the following program: while verifying instructions 1: r1 = r0 \| 2: if r1 < 8 goto ... \| push r0,r1 as linked registers in jmp_history 3: if r0 > 16 goto ... \| push r0,r1 as linked registers in jmp_history 4: r2 = r10 \| 5: r2 += r0 v mark_chain_precision(r0) while doing mark_chain_precision(r0) 5: r2 += r0 \| mark r0 precise 4: r2 = r10 \| 3: if r0 > 16 goto ... \| mark r0,r1 as precise 2: if r1 < 8 goto ... \| mark r0,r1 as precise 1: r1 = r0 v Technically, do this as follows: - Use 10 bits to identify each register that gains range because of sync_linked_regs(): - 3 bits for frame number; - 6 bits for register or stack slot number; - 1 bit to indicate if register is spilled. - Use u64 as a vector of 6 such records + 4 bits for vector length. - Augment struct bpf_jmp_history_entry with a field 'linked_regs' representing such vector. - When doing check_cond_jmp_op() remember up to 6 registers that gain range because of sync_linked_regs() in such a vector. - Don't propagate range information and reset IDs for registers that don't fit in 6-value vector. - Push a pair {instruction index, linked registers vector} to bpf_verifier_state->jmp_history. - When doing backtrack_insn() check if any of recorded linked registers is currently marked precise, if so mark all linked registers as precise. This also requires fixes for two test_verifier tests: - precise: test 1 - precise: test 2 Both tests contain the following instruction sequence: 19: (bf) r2 = r9 ; R2=scalar(id=3) R9=scalar(id=3) 20: (a5) if r2 < 0x8 goto pc+1 ; R2=scalar(id=3,umin=8) 21: (95) exit 22: (07) r2 += 1 ; R2_w=scalar(id=3+1,...) 23: (bf) r1 = r10 ; R1_w=fp0 R10=fp0 24: (07) r1 += -8 ; R1_w=fp-8 25: (b7) r3 = 0 ; R3_w=0 26: (85) call bpf_probe_read_kernel#113 The call to bpf_probe_read_kernel() at (26) forces r2 to be precise. Previously, this forced all registers with same id to become precise immediately when mark_chain_precision() is called. After this change, the precision is propagated to registers sharing same id only when 'if' instruction is backtracked. Hence verification log for both tests is changed: regs=r2,r9 -> regs=r2 for instructions 25..20. Fixes: `904e6ddf41` ("bpf: Use scalar ids in mark_chain_precision()") Reported-by: Hao Sun <sunhao.th@gmail.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-2-eddyz87@gmail.com Closes: https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/	2024-07-29 12:53:10 -07:00
Ihor Solodrai	844f7315e7	selftests/bpf: Use auto-dependencies for test objects Make use of -M compiler options when building .test.o objects to generate .d files and avoid re-building all tests every time. Previously, if a single test bpf program under selftests/bpf/progs/.c has changed, make would rebuild all the .bpf.o, .skel.h and .test.o objects, which is a lot of unnecessary work. A typical dependency chain is: progs/x.c -> x.bpf.o -> x.skel.h -> x.test.o -> trunner_binary However for many tests it's not a 1:1 mapping by name, and so far %.test.o have been simply dependent on all %.skel.h files, and %.skel.h files on all %.bpf.o objects. Avoid full rebuilds by instructing the compiler (via -MMD) to produce *.d files with real dependencies, and appropriately including them. Exploit make feature that rebuilds included makefiles if they were changed by setting %.test.d as prerequisite for %.test.o files. A couple of examples of compilation time speedup (after the first clean build): $ touch progs/verifier_and.c && time make -j8 Before: real 0m16.651s After: real 0m2.245s $ touch progs/read_vsyscall.c && time make -j8 Before: real 0m15.743s After: real 0m1.575s A drawback of this change is that now there is an overhead due to make processing lots of .d files, which potentially may slow down unrelated targets. However a time to make all from scratch hasn't changed significantly: $ make clean && time make -j8 Before: real 1m31.148s After: real 1m30.309s Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/VJihUTnvtwEgv_mOnpfy7EgD9D2MPNoHO-MlANeLIzLJPGhDeyOuGKIYyKgk0O6KPjfM-MuhtvPwZcngN8WFqbTnTRyCSMc2aMZ1ODm1T_g=@pm.me	2024-07-29 12:53:07 -07:00
Markus Elfring	f157f9cb85	bpf: Simplify character output in seq_print_delegate_opts() Single characters should be put into a sequence. Thus use the corresponding function “seq_putc” for two selected calls. This issue was transformed by using the Coccinelle software. Suggested-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/abde0992-3d71-44d2-ab27-75b382933a22@web.de	2024-07-29 12:53:04 -07:00
Markus Elfring	df862de41f	bpf: Replace 8 seq_puts() calls by seq_putc() calls Single line breaks should occasionally be put into a sequence. Thus use the corresponding function “seq_putc”. This issue was transformed by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/e26b7df9-cd63-491f-85e8-8cabe60a85e5@web.de	2024-07-29 12:53:00 -07:00
Martin KaFai Lau	2291247296	Merge branch 'use network helpers, part 9' Geliang Tang says: ==================== v3: - patch 2: - clear errno before connect_to_fd_opts. - print err logs in run_test. - set err to -1 when fd >= 0. - patch 3: - drop "int err". v2: - update patch 2 as Martin suggested. This is the 9th part of series "use network helpers" all BPF selftests wide. Patches 1-2 update network helpers interfaces suggested by Martin. Patch 3 adds a new helper connect_to_addr_str() as Martin suggested instead of adding connect_fd_to_addr_str(). Patch 4 uses this newly added helper in make_client(). Patch 5 uses make_client() in sk_lookup and drop make_socket(). ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:52:54 -07:00
Geliang Tang	c70b2d9027	selftests/bpf: Add connect_to_addr_str helper Similar to connect_to_addr() helper for connecting to a server with the given sockaddr_storage type address, this patch adds a new helper named connect_to_addr_str() for connecting to a server with the given string type address "addr_str", together with its "family" and "port" as other parameters of connect_to_addr_str(). In connect_to_addr_str(), the parameters "family", "addr_str" and "port" are used to create a sockaddr_storage type address "addr" by invoking make_sockaddr(). Then pass this "addr" together with "addrlen", "type" and "opts" to connect_to_addr(). Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/647e82170831558dbde132a7a3d86df660dba2c4.1721282219.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:52:51 -07:00
Geliang Tang	e1ee5a48b5	selftests/bpf: Drop must_fail from network_helper_opts The struct member "must_fail" of network_helper_opts() is only used in cgroup_v1v2 tests, it makes sense to drop it from network_helper_opts. Return value (fd) of connect_to_fd_opts() and the expect errno (EPERM) can be checked in cgroup_v1v2.c directly, no need to check them in connect_fd_to_addr() in network_helpers.c. This also makes connect_fd_to_addr() function useless. It can be replaced by connect(). Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/3faf336019a9a48e2e8951f4cdebf19e3ac6e441.1721282219.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:52:45 -07:00
Geliang Tang	a63507f3b1	selftests/bpf: Drop type of connect_to_fd_opts The "type" parameter of connect_to_fd_opts() is redundant of "server_fd". Since the "type" can be obtained inside by invoking getsockopt(SO_TYPE), without passing it in as a parameter. This patch drops the "type" parameter of connect_to_fd_opts() and updates its callers. Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/50d8ce7ab7ab0c0f4d211fc7cc4ebe3d3f63424c.1721282219.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-29 12:52:24 -07:00
Linus Torvalds	1722389b0d	Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash" * tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) tun: add missing verification for short frame tap: add missing verification for short frame mISDN: Fix a use after free in hfcmulti_tx() gve: Fix an edge case for TSO skb validity check bnxt_en: update xdp_rxq_info in queue restart logic tcp: process the 3rd ACK with sk_socket for TFO/MPTCP selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling MAINTAINERS: make Breno the netconsole maintainer MAINTAINERS: Update bonding entry net: nexthop: Initialize all fields in dumped nexthops net: stmmac: Correct byte order of perfect_match selftests: forwarding: skip if kernel not support setting bridge fdb learning limit tipc: Return non-zero value from tipc_udp_addr2str() on error netfilter: nft_set_pipapo_avx2: disable softinterrupts ice: Fix recipe read procedure ice: Add a per-VF limit on number of FDIR filters net: bonding: correctly annotate RCU in bond_should_notify_peers() ...	2024-07-25 13:32:25 -07:00
Linus Torvalds	8bf100092d	Merge tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - trivial printk changes The bigger "real" printk work is still being discussed. * tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: vsprintf: add missing MODULE_DESCRIPTION() macro printk: Rename console_replay_all() and update context	2024-07-25 13:18:41 -07:00
Linus Torvalds	b485625078	Merge tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl constification from Joel Granados: "Treewide constification of the ctl_table argument of proc_handlers using a coccinelle script and some manual code formatting fixups. This is a prerequisite to moving the static ctl_table structs into read-only data section which will ensure that proc_handler function pointers cannot be modified" * tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-25 12:58:36 -07:00
Linus Torvalds	bba959f477	Merge tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Wipe screen_info after allocating it from the heap - used by arm32 and EFI zboot, other EFI architectures allocate it statically - Revert to allocating boot_params from the heap on x86 when entering via the native PE entrypoint, to work around a regression on older Dell hardware * tag 'efi-fixes-for-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efistub: Revert to heap allocated boot_params for PE entrypoint efi/libstub: Zero initialize heap allocated struct screen_info	2024-07-25 12:55:21 -07:00
Linus Torvalds	9b21993654	Merge tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux Pull kgdb updates from Daniel Thompson: "Three small changes this cycle: - Clean up an architecture abstraction that is no longer needed because all the architectures have converged. - Actually use the prompt argument to kdb_position_cursor() instead of ignoring it (functionally this fix is a nop but that was due to luck rather than good judgement) - Fix a -Wformat-security warning" * tag 'kgdb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux: kdb: Get rid of redundant kdb_curr_task() kdb: Use the passed prompt in kdb_position_cursor() kdb: address -Wformat-security warnings	2024-07-25 12:48:42 -07:00
Linus Torvalds	28e7241cb8	Merge tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - Use improved timer sync for Loongson64 - Fix address of GCR_ACCESS register - Add missing MODULE_DESCRIPTION * tag 'mips_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: sibyte: add missing MODULE_DESCRIPTION() macro MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later MIPS: Loongson64: Switch to SYNC_R4K	2024-07-25 12:41:53 -07:00
Linus Torvalds	f646429524	Merge tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: "The gettimeofday() and clock_gettime() syscalls are now available as vDSO functions, and Dave added a patch which allows to use NVMe cards in the PCI slots as fast and easy alternative to SCSI discs. Summary: - add gettimeofday() and clock_gettime() vDSO functions - enable PCI_MSI_ARCH_FALLBACKS to allow PCI to PCIe bridge adaptor with PCIe NVME card to function in parisc machines - allow users to reduce kernel unaligned runtime warnings - minor code cleanups" * tag 'parisc-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Add support for CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN parisc: Use max() to calculate parisc_tlb_flush_threshold parisc: Fix warning at drivers/pci/msi/msi.h:121 parisc: Add 64-bit gettimeofday() and clock_gettime() vDSO functions parisc: Add 32-bit gettimeofday() and clock_gettime() vDSO functions parisc: Clean up unistd.h file	2024-07-25 12:37:42 -07:00
Linus Torvalds	f9bcc61ad1	Merge tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Support for preemption - i386 Rust support - Huge cleanup by Benjamin Berg - UBSAN support - Removal of dead code * tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (41 commits) um: vector: always reset vp->opened um: vector: remove vp->lock um: register power-off handler um: line: always fill *error_out in setup_one_line() um: remove pcap driver from documentation um: Enable preemption in UML um: refactor TLB update handling um: simplify and consolidate TLB updates um: remove force_flush_all from fork_handler um: Do not flush MM in flush_thread um: Delay flushing syscalls until the thread is restarted um: remove copy_context_skas0 um: remove LDT support um: compress memory related stub syscalls while adding them um: Rework syscall handling um: Add generic stub_syscall6 function um: Create signal stack memory assignment in stub_data um: Remove stub-data.h include from common-offsets.h um: time-travel: fix signal blocking race/hang um: time-travel: remove time_exit() ...	2024-07-25 12:33:08 -07:00

1 2 3 4 5 ...

1294131 Commits