linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-19 04:42:35 -04:00

Author	SHA1	Message	Date
Donglin Peng	230e7d7de5	tools/resolve_btfids: Support BTF sorting feature This introduces a new BTF sorting phase that specifically sorts BTF types by name in ascending order, so that the binary search can be used to look up types. Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20260109130003.3313716-4-dolinux.peng@gmail.com	2026-01-13 16:10:39 -08:00
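The point of sorting by name is that a lookup can use binary search instead of a linear scan. A minimal standalone sketch of that idea follows; `struct named_type`, `sort_types_by_name()` and `find_type_id()` are hypothetical names for illustration and are not part of resolve_btfids or libbpf. The ID is carried as a plain payload here, whereas real BTF type IDs are positional.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a BTF type record: only a name and an ID. */
struct named_type {
	const char *name;
	unsigned int id;
};

/* Order two records by name, ascending. */
static int cmp_named_type(const void *a, const void *b)
{
	const struct named_type *x = a, *y = b;

	return strcmp(x->name, y->name);
}

/* Sorting phase: arrange the table by name in ascending order. */
static void sort_types_by_name(struct named_type *types, size_t cnt)
{
	qsort(types, cnt, sizeof(*types), cmp_named_type);
}

/* Binary search over the sorted table; returns the ID, or 0 if absent. */
static unsigned int find_type_id(const struct named_type *types, size_t cnt,
				 const char *name)
{
	struct named_type key = { .name = name, .id = 0 };
	const struct named_type *hit;

	hit = bsearch(&key, types, cnt, sizeof(*types), cmp_named_type);
	return hit ? hit->id : 0;
}
```

The search is only valid after `sort_types_by_name()` has run, which is why the sort is a separate, earlier phase.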
Donglin Peng	a3acd7d434	selftests/bpf: Add test cases for btf__permute functionality This patch introduces test cases for the btf__permute function to ensure it works correctly with both base BTF and split BTF scenarios. The test suite includes: - test_permute_base: Validates permutation on base BTF - test_permute_split: Tests permutation on split BTF Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20260109130003.3313716-3-dolinux.peng@gmail.com	2026-01-13 16:10:39 -08:00
Donglin Peng	6fbf129c49	libbpf: Add BTF permutation support for type reordering Introduce btf__permute() API to allow in-place rearrangement of BTF types. This function reorganizes BTF type order according to a provided array of type IDs, updating all type references to maintain consistency. Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20260109130003.3313716-2-dolinux.peng@gmail.com	2026-01-13 16:10:39 -08:00
Song Chen	c9c9f6bf7f	bpf: Remove an unused parameter in check_func_proto The func_id parameter is not needed in check_func_proto. This patch removes it. Signed-off-by: Song Chen <chensong_2000@189.cn> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260105155009.4581-1-chensong_2000@189.cn	2026-01-13 10:00:15 -08:00
Alexei Starovoitov	da4ab5dcc9	Merge branch 'bpf-recognize-special-arithmetic-shift-in-the-verifier' Puranjay Mohan says: ==================== bpf: Recognize special arithmetic shift in the verifier v3: https://lore.kernel.org/all/20260103022310.935686-1-puranjay@kernel.org/ Changes in v3->v4: - Fork verifier state while processing BPF_OR when src_reg has [-1,0] range and 2nd operand is a constant. This is to detect the following pattern: i32 X > -1 ? C1 : -1 --> (X >>s 31) \| C1 - Add selftests for above. - Remove __description("s>>=63") (Eduard in another patchset) v2: https://lore.kernel.org/bpf/20251115022611.64898-1-alexei.starovoitov@gmail.com/ Changes in v2->v3: - fork verifier state while processing BPF_AND when src_reg has [-1,0] range and 2nd operand is a constant. v1->v2: Use __mark_reg32_known() or __mark_reg_known() for zero too. Add comment to selftest. v1: https://lore.kernel.org/bpf/20251114031039.63852-1-alexei.starovoitov@gmail.com/ ==================== Link: https://patch.msgid.link/20260112201424.816836-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:33:39 -08:00
Alexei Starovoitov	9160335317	selftests/bpf: Add tests for s>>=31 and s>>=63 Add tests for special arithmetic shift right. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260112201424.816836-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:33:38 -08:00
Alexei Starovoitov	bffacdb80b	bpf: Recognize special arithmetic shift in the verifier cilium bpf_wiregard.bpf.c when compiled with -O1 fails to load with the following verifier log: 192: (79) r2 = *(u64 *)(r10 -304) ; R2=pkt(r=40) R10=fp0 fp-304=pkt(r=40) ... 227: (85) call bpf_skb_store_bytes#9 ; R0=scalar() 228: (bc) w2 = w0 ; R0=scalar() R2=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 229: (c4) w2 s>>= 31 ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff)) 230: (54) w2 &= -134 ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a)) ... 232: (66) if w2 s> 0xffffffff goto pc+125 ; R2=scalar(smin=umin=umin32=0x80000000,smax=umax=umax32=0xffffff7a,smax32=-134,var_off=(0x80000000; 0x7fffff7a)) ... 238: (79) r4 = *(u64 *)(r10 -304) ; R4=scalar() R10=fp0 fp-304=scalar() 239: (56) if w2 != 0xffffff78 goto pc+210 ; R2=0xffffff78 // -136 ... 258: (71) r1 = *(u8 *)(r4 +0) R4 invalid mem access 'scalar' The error might confuse most bpf authors, since the fp-304 slot held a 'pkt' pointer at insn 192 and became 'scalar' at 238. That happened because bpf_skb_store_bytes() clears all packet pointers including those in the stack. At first glance it might look like a bug in the source code, since the ctx->data pointer should have been reloaded after the call to bpf_skb_store_bytes(). The relevant part of cilium source code looks like this: // bpf/lib/nodeport.h int dsr_set_ipip6() { if (ctx_adjust_hroom(...)) return DROP_INVALID; // -134 if (ctx_store_bytes(...)) return DROP_WRITE_ERROR; // -141 return 0; } bool dsr_fail_needs_reply(int code) { if (code == DROP_FRAG_NEEDED) // -136 return true; return false; } tail_nodeport_ipv6_dsr() { ret = dsr_set_ipip6(...); if (!IS_ERR(ret)) { ... } else { if (dsr_fail_needs_reply(ret)) return dsr_reply_icmp6(...); } } The code doesn't have arithmetic shift by 31 and it reloads ctx->data every time it needs to access it. So it's not a bug in the source code.
The reason is DAGCombiner::foldSelectCCToShiftAnd() LLVM transformation: // If this is a select where the false operand is zero and the compare is a // check of the sign bit, see if we can perform the "gzip trick": // select_cc setlt X, 0, A, 0 -> and (sra X, size(X)-1), A // select_cc setgt X, 0, A, 0 -> and (not (sra X, size(X)-1)), A The conditional branch in dsr_set_ipip6() and its return values are optimized into BPF_ARSH plus BPF_AND: 227: (85) call bpf_skb_store_bytes#9 228: (bc) w2 = w0 229: (c4) w2 s>>= 31 ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff)) 230: (54) w2 &= -134 ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a)) After insn 230 the register w2 can only be 0 or -134, but the verifier approximates it, since there is no way to represent two scalars in bpf_reg_state. After the fall-through at insn 232, w2 can only be -134, hence the branch at insn 239: (56) if w2 != -136 goto pc+210 should always be taken, and the trapping insn 258 should never execute. LLVM generated correct code, but the verifier follows an impossible path and rejects a valid program. To fix this issue, recognize this special LLVM optimization and fork the verifier state. So after insn 229: (c4) w2 s>>= 31 the verifier has two states to explore: one with w2 = 0 and another with w2 = 0xffffffff, which makes the verifier accept bpf_wiregard.c. A similar pattern exists where an OR operation is used in place of the AND operation; the verifier detects that pattern as well by forking the state before the OR operation with a scalar in range [-1,0]. Note there are 20+ such patterns in bpf_wiregard.o compiled with -O1 and -O2, but they're rarely seen in other production bpf programs, so the push_stack() approach is not a concern.
Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260112201424.816836-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:33:38 -08:00
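The equivalence behind the transformation described in this commit can be checked in plain C. The sketch below is illustrative only: the function names are hypothetical, and it assumes that `>>` on a negative signed value is an arithmetic shift, which is how GCC and Clang behave. It covers both the AND form from the LLVM comment and the OR form from the cover letter.

```c
#include <stdint.h>

/* Branchy form, as written in source: c if x is negative, else 0. */
static int32_t select_branchy(int32_t x, int32_t c)
{
	return x < 0 ? c : 0;
}

/*
 * Branch-free form: x >> 31 is either 0 or -1 (all ones), so the AND
 * yields either 0 or c. Assumes an arithmetic right shift on signed
 * values.
 */
static int32_t select_shift_and(int32_t x, int32_t c)
{
	return (x >> 31) & c;
}

/* OR variant: x > -1 ? c : -1 becomes (x >>s 31) | c. */
static int32_t select_shift_or(int32_t x, int32_t c)
{
	return (x >> 31) | c;
}
```

After the shift the value is exactly one of two constants, which is why forking the verifier state into those two cases is precise rather than an approximation.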
Alexei Starovoitov	1fffe1f4b9	Merge branch 'fix-a-few-selftest-failure-due-to-64k-page' Yonghong Song says: ==================== Fix a few selftest failure due to 64K page Fix a few arm64 selftest failures due to 64K page. Please see each individual patch for why the test failed and how the test gets fixed. ==================== Link: https://patch.msgid.link/20260113061018.3797051-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:17 -08:00
Yonghong Song	951d79017e	selftests/bpf: Fix verifier_arena_globals1 failure with 64K page With 64K page on arm64, verifier_arena_globals1 failed like below: ... libbpf: map 'arena': failed to create: -E2BIG ... #509/1 verifier_arena_globals1/check_reserve1:FAIL ... For 64K page, if the number of arena pages is (1UL << 20), the total memory will exceed 4G and this will cause map creation failure. Adjusting ARENA_PAGES based on the actual page size fixed the problem. Cc: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260113061033.3798549-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
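The arithmetic behind this failure is easy to reproduce. The sketch below is a rough illustration, assuming a 4 GiB ceiling as implied by the commit message; `SKETCH_ARENA_MAX_BYTES`, `arena_bytes()` and `arena_max_pages()` are hypothetical names, not the selftest's actual macros.

```c
#include <stdint.h>

/* Assumption for this sketch: an arena may span at most 4 GiB. */
#define SKETCH_ARENA_MAX_BYTES (1ULL << 32)

/* Total arena size in bytes for a given page count and page size. */
static uint64_t arena_bytes(uint64_t pages, uint64_t page_size)
{
	return pages * page_size;
}

/* Largest page count that stays within the limit for a page size. */
static uint64_t arena_max_pages(uint64_t page_size)
{
	return SKETCH_ARENA_MAX_BYTES / page_size;
}
```

With 4K pages, `1UL << 20` pages is exactly 4 GiB; with 64K pages the same count is 64 GiB, so the page count has to shrink to `1UL << 16` to stay within the limit.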
Yonghong Song	d2f7cd20a7	selftests/bpf: Fix sk_bypass_prot_mem failure with 64K page The current selftest sk_bypass_prot_mem only supports 4K page. When running with 64K page on arm64, the following failure happens: ... check_bypass:FAIL:no bypass unexpected no bypass: actual 3 <= expected 32 ... #385/1 sk_bypass_prot_mem/TCP :FAIL ... check_bypass:FAIL:no bypass unexpected no bypass: actual 4 <= expected 32 ... #385/2 sk_bypass_prot_mem/UDP :FAIL ... Adding support to 64K page as well fixed the failure. Cc: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061028.3798326-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Yonghong Song	2465a08d43	selftests/bpf: Fix dmabuf_iter/lots_of_buffers failure with 64K page On arm64 with 64K page, I observed the following test failure: ... subtest_dmabuf_iter_check_lots_of_buffers:FAIL:total_bytes_read unexpected total_bytes_read: actual 4696 <= expected 65536 #97/3 dmabuf_iter/lots_of_buffers:FAIL With 4K page on x86, the total_bytes_read is 4593. With 64K page on arm64, the total_bytes_read is 4696. In progs/dmabuf_iter.c, for each iteration, the output is BPF_SEQ_PRINTF(seq, "%lu\n%llu\n%s\n%s\n", inode, size, name, exporter); The only difference between 4K and 64K page is 'size' in the above BPF_SEQ_PRINTF. The 4K page will output '4096' and the 64K page will output '65536'. So the total_bytes_read with 64K page is slightly greater than with 4K page. Adjusting the expected total_bytes_read from 65536 to 4096 fixed the issue. Cc: T.J. Mercier <tjmercier@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061023.3798085-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Mykyta Yatsenko	7af3339948	bpf: Consistently use reg_state() for register access in the verifier Replace the pattern of declaring a local regs array from cur_regs() and then indexing into it with the more concise reg_state() helper. This simplifies the code by eliminating intermediate variables and makes register access more consistent throughout the verifier. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20260113134826.2214860-1-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:31:17 -08:00
Alexei Starovoitov	9c1a3525fd	Merge branch 'use-correct-destructor-kfunc-types' Sami Tolvanen says: ==================== While running BPF self-tests with CONFIG_CFI (Control Flow Integrity) enabled, I ran into a couple of failures in bpf_obj_free_fields() caused by type mismatches between the btf_dtor_kfunc_t function pointer type and the registered destructor functions. It looks like we can't change the argument type for these functions to match btf_dtor_kfunc_t because the verifier doesn't like void pointer arguments for functions used in BPF programs, so this series fixes the issue by adding stubs with correct types to use as destructors for each instance of this I found in the kernel tree. The last patch changes btf_check_dtor_kfuncs() to enforce the function type when CFI is enabled, so we don't end up registering destructors that panic the kernel. v5: - Rebased on bpf-next/master again. v4: https://lore.kernel.org/bpf/20251126221724.897221-6-samitolvanen@google.com/ - Rebased on bpf-next/master. - Renamed CONFIG_CFI_CLANG to CONFIG_CFI. - Picked up Acked/Tested-by tags. v3: https://lore.kernel.org/bpf/20250728202656.559071-6-samitolvanen@google.com/ - Renamed the functions and went back to __bpf_kfunc based on review feedback. v2: https://lore.kernel.org/bpf/20250725214401.1475224-6-samitolvanen@google.com/ - Annotated the stubs with CFI_NOSEAL to fix issues with IBT sealing on x86. - Changed __bpf_kfunc to explicit __used __retain. v1: https://lore.kernel.org/bpf/20250724223225.1481960-6-samitolvanen@google.com/ ==================== Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260110082548.113748-6-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 18:54:11 -08:00
Sami Tolvanen	99fde4d062	bpf, btf: Enforce destructor kfunc type with CFI Ensure that registered destructor kfuncs have the same type as btf_dtor_kfunc_t to avoid a kernel panic on systems with CONFIG_CFI enabled. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260110082548.113748-10-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 18:53:57 -08:00
Sami Tolvanen	ba7f1024a1	selftests/bpf: Use the correct destructor kfunc type With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. As bpf_testmod_ctx_release() signature differs from the btf_dtor_kfunc_t pointer type used for the destructor calls in bpf_obj_free_fields(), add a stub function with the correct type to fix the type mismatch. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260110082548.113748-9-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 18:53:57 -08:00
Sami Tolvanen	c99d97b466	bpf: net_sched: Use the correct destructor kfunc type With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. As bpf_kfree_skb() signature differs from the btf_dtor_kfunc_t pointer type used for the destructor calls in bpf_obj_free_fields(), add a stub function with the correct type to fix the type mismatch. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260110082548.113748-8-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 18:53:57 -08:00
Sami Tolvanen	b40a5d724f	bpf: crypto: Use the correct destructor kfunc type With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. I ran into the following type mismatch when running BPF self-tests: CFI failure at bpf_obj_free_fields+0x190/0x238 (target: bpf_crypto_ctx_release+0x0/0x94; expected type: 0xa488ebfc) Internal error: Oops - CFI: 00000000f2008228 [#1] SMP ... As bpf_crypto_ctx_release() is also used in BPF programs and using a void pointer as the argument would make the verifier unhappy, add a simple stub function with the correct type and register it as the destructor kfunc instead. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Tested-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/20260110082548.113748-7-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 18:53:57 -08:00
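The stub pattern used across this series can be shown outside the kernel. In the sketch below every name is hypothetical: `dtor_fn_t` stands in for btf_dtor_kfunc_t, and `sketch_ctx_release()` stands in for a typed release function such as bpf_crypto_ctx_release(). It only illustrates why a thin wrapper with the exact pointer type is needed at an indirect call site.

```c
/* Stand-in for btf_dtor_kfunc_t: the type used at the indirect call. */
typedef void (*dtor_fn_t)(void *);

/* Hypothetical object with a typed release function. */
struct sketch_ctx {
	int released;
};

/* Typed release function; its signature does not match dtor_fn_t. */
static void sketch_ctx_release(struct sketch_ctx *ctx)
{
	ctx->released = 1;
}

/* Stub whose type matches dtor_fn_t exactly; it forwards the call. */
static void sketch_ctx_release_dtor(void *ctx)
{
	sketch_ctx_release(ctx);
}

/* Indirect call through the generic pointer type. */
static void run_dtor(dtor_fn_t dtor, void *obj)
{
	dtor(obj);
}
```

Registering `sketch_ctx_release_dtor` rather than casting `sketch_ctx_release` keeps the callee's type identical to the pointer type, which is the property a CFI check verifies.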
Varun R Mallya	5714ca8cba	libbpf: Fix OOB read in btf_dump_get_bitfield_value When dumping bitfield data, btf_dump_get_bitfield_value() reads data based on the underlying type's size (t->size). However, it does not verify that the provided data buffer (data_sz) is large enough to contain these bytes. If btf_dump__dump_type_data() is called with a buffer smaller than the type's size, this leads to an out-of-bounds read. This was confirmed by AddressSanitizer in the linked issue. Fix this by ensuring we do not read past the provided data_sz limit. Fixes: `a1d3cc3c5e` ("libbpf: Avoid use of __int128 in typed dump display") Reported-by: Harrison Green <harrisonmichaelgreen@gmail.com> Suggested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Varun R Mallya <varunrmallya@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260106233527.163487-1-varunrmallya@gmail.com Closes: https://github.com/libbpf/libbpf/issues/928	2026-01-09 15:54:31 -08:00
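The class of bug described here is a read sized by the type rather than by the buffer. A minimal sketch of one possible guard follows; `read_value_checked()` is a hypothetical helper, not the libbpf function, and rejecting a short buffer is an assumed policy for illustration, since the commit message does not say whether the actual fix rejects or clamps.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assemble a little-endian value of type_size bytes from data, but
 * refuse to read past data_sz. Returns 0 on success, -1 if the buffer
 * is too small or the size is unsupported.
 */
static int read_value_checked(const void *data, size_t data_sz,
			      size_t type_size, uint64_t *out)
{
	const uint8_t *bytes = data;
	uint64_t val = 0;
	size_t i;

	if (type_size > sizeof(val) || type_size > data_sz)
		return -1;

	/* Byte-wise assembly keeps the result independent of host endianness. */
	for (i = 0; i < type_size; i++)
		val |= (uint64_t)bytes[i] << (8 * i);

	*out = val;
	return 0;
}
```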
WanLi Niu	4effccde0a	bpftool: Make skeleton C++ compatible with explicit casts Fix C++ compilation errors in generated skeleton by adding explicit pointer casts and use char * subtraction for offset calculation. error: invalid conversion from 'void*' to '<obj_name>*' [-fpermissive] \| skel = skel_alloc(sizeof(*skel)); \| ~~~~~~~~~~^~~~~~~~~~~~~~~ \| \| \| void* error: arithmetic on pointers to void \| skel->ctx.sz = (void *)&skel->links - (void *)skel; \| ~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~ error: assigning to 'struct <obj_name>__<ident> *' from incompatible type 'void *' \| skel-><ident> = skel_prep_map_data((void *)data, 4096, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \| sizeof(data) - 1); \| ~~~~~~~~~~~~~~~~~ error: assigning to 'struct <obj_name>__<ident> *' from incompatible type 'void *' \| skel-><ident> = skel_finalize_map_data(&skel->maps.<ident>.initial_value, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \| 4096, PROT_READ \| PROT_WRITE, skel->maps.<ident>.map_fd); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Minimum reproducer: $ cat test.bpf.c int val; // placed in .bss section #include "vmlinux.h" #include <bpf/bpf_helpers.h> SEC("raw_tracepoint/sched_wakeup_new") int handle(void *ctx) { return 0; } $ cat test.cpp #include <cerrno> extern "C" { #include "test.bpf.skel.h" } $ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h $ clang -g -O2 -target bpf -c test.bpf.c -o test.bpf.o $ bpftool gen skeleton test.bpf.o -L > test.bpf.skel.h $ g++ -c test.cpp -I. Co-developed-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: WanLi Niu <niuwl1@chinatelecom.cn> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260106023123.2928-1-kiraskyler@163.com	2026-01-09 11:01:54 -08:00
Alexei Starovoitov	2175ccfb93	Merge branch 'bpf-selftests-fixes-for-gcc-bpf-16' Jose E. Marchesi says: ==================== bpf: selftests fixes for GCC-BPF 16 Hello. Just a couple of small fixes to get the BPF selftests build with what will become GCC 16 this spring. One of the regressions is due to a change in the behavior of a warning in GCC 16. The other is due to the fact that GCC 16 actually implements btf_decl_tag and btf_type_tag. Salud! ==================== Link: https://patch.msgid.link/20260106173650.18191-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:04:11 -08:00
Jose E. Marchesi	681600647c	bpf: GCC requires function attributes before the declarator GCC insists on placing attributes before the declarators in function declarations. Now that GCC supports btf_decl_tag and therefore __tag1 and __tag2 expand to actual attributes, the compiler is complaining about it for static __noinline int foo(int x __tag1 __tag2) __tag1 __tag2 progs/test_btf_decl_tag.c:36:1: error: attributes should be specified \ before the declarator in a function definition This patch simply places the tags before the declarator. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260106173650.18191-3-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:04:11 -08:00
Jose E. Marchesi	97fb54d86d	bpf: adapt selftests to GCC 16 -Wunused-but-set-variable GCC 16 has changed the semantics of -Wunused-but-set-variable, as well as introducing new options -Wunused-but-set-variable={0,1,2,3} to adjust the level of support. One of the changes is that GCC now treats 'sum += 1' and 'sum++' as non-usage, whereas clang (and GCC < 16) considers the first as usage and the second as non-usage, which is sort of inconsistent. The GCC 16 -Wunused-but-set-variable=2 option implements the previous semantics of -Wunused-but-set-variable, but since it is a new option, it cannot be used unconditionally for forward-compatibility, just for backwards-compatibility. So this patch adds pragmas to the two self-tests impacted by this, progs/free_timer.c and progs/rcu_read_lock.c, to make GCC ignore -Wunused-but-set-variable warnings when compiling them with GCC > 15. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44677#c25 for details on why this regression got introduced in GCC upstream. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260106173650.18191-2-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:04:11 -08:00
Nathan Chancellor	2421649778	scripts/gen-btf.sh: Ensure initial object in gen_btf_o is ELF with correct endianness After commit `600605853f` ("scripts/gen-btf.sh: Fix .btf.o generation when compiling for RISCV"), there is an error from llvm-objcopy when CONFIG_LTO_CLANG is enabled: llvm-objcopy: error: '.tmp_vmlinux1.btf.o': The file was not recognized as a valid object file Failed to generate BTF for vmlinux KBUILD_CFLAGS includes CC_FLAGS_LTO, which makes clang emit an LLVM IR object, rather than an ELF one as expected by llvm-objcopy. Most areas of the kernel deal with this by filtering out CC_FLAGS_LTO from KBUILD_CFLAGS for the particular object or directory but this is not so easy to do in bash. Just include '-fno-lto' after KBUILD_CFLAGS to ensure an ELF object is consistently created as the initial .o file. Additionally, while there is no reported or discovered bug yet, the absence of KBUILD_CPPFLAGS from this command could result in incorrect endianness because KBUILD_CPPFLAGS typically contains '-mbig-endian' and '-mlittle-endian' so that biendian toolchains can be used. Include it in this ${CC} command to hopefully limit necessary changes to this command for the foreseeable future. Fixes: `600605853f` ("scripts/gen-btf.sh: Fix .btf.o generation when compiling for RISCV") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260106-fix-gen-btf-sh-lto-v2-1-01d3e1c241c4@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:00:38 -08:00
Alexei Starovoitov	f39703b20b	Merge branch 'bpf-introduce-bpf_f_cpu-and-bpf_f_all_cpus-flags-for-percpu-maps' Leon Hwang says: ==================== bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags for percpu maps This patch set introduces the BPF_F_CPU and BPF_F_ALL_CPUS flags for percpu maps, as the requirement of BPF_F_ALL_CPUS flag for percpu_array maps was discussed in the thread of "[PATCH bpf-next v3 0/4] bpf: Introduce global percpu data"[1]. The goal of BPF_F_ALL_CPUS flag is to reduce data caching overhead in light skeletons by allowing a single value to be reused to update values across all CPUs. This avoids the M:N problem where M cached values are used to update a map on N CPUs kernel. The BPF_F_CPU flag is accompanied by flags-embedded cpu info, which specifies the target CPU for the operation: * For lookup operations: the flag field alongside cpu info enable querying a value on the specified CPU. * For update operations: the flag field alongside cpu info enable updating value for specified CPU. Links: [1] https://lore.kernel.org/bpf/20250526162146.24429-1-leon.hwang@linux.dev/ Changes: v12 -> v13: * No changes, rebased on latest tree. v11 -> v12: * Dropped the v11 changes. * Stabilized the lru_percpu_hash map test by keeping an extra spare entry, which can be used temporarily during updates to avoid unintended LRU evictions. v10 -> v11: * Support the combination of BPF_EXIST and BPF_F_CPU/BPF_F_ALL_CPUS for update operations. * Fix unstable lru_percpu_hash map test using the combination of BPF_EXIST and BPF_F_CPU/BPF_F_ALL_CPUS to avoid LRU eviction (reported by Alexei). v9 -> v10: * Add tests to verify array and hash maps do not support BPF_F_CPU and BPF_F_ALL_CPUS flags. * Address comment from Andrii: * Copy map value using copy_map_value_long for percpu_cgroup_storage maps in a separate patch. v8 -> v9: * Change value type from u64 to u32 in selftests. 
* Address comments from Andrii: * Keep value_size unaligned and update everywhere for consistency when cpu flags are specified. * Update value by getting pointer for percpu hash and percpu cgroup_storage maps. v7 -> v8: * Address comments from Andrii: * Check BPF_F_LOCK when update percpu_array, percpu_hash and lru_percpu_hash maps. * Refactor flags check in __htab_map_lookup_and_delete_batch(). * Keep value_size unaligned and copy value using copy_map_value() in __htab_map_lookup_and_delete_batch() when BPF_F_CPU is specified. * Update warn message in libbpf's validate_map_op(). * Update comment of libbpf's bpf_map__lookup_elem(). v6 -> v7: * Get correct value size for percpu_hash and lru_percpu_hash in update_batch API. * Set 'count' as 'max_entries' in test cases for lookup_batch API. * Address comment from Alexei: * Move cpu flags check into bpf_map_check_op_flags(). v5 -> v6: * Move bpf_map_check_op_flags() from 'bpf.h' to 'syscall.c'. * Address comments from Alexei: * Drop the refactoring code of data copying logic for percpu maps. * Drop bpf_map_check_op_flags() wrappers. v4 -> v5: * Address comments from Andrii: * Refactor data copying logic for all percpu maps. * Drop this_cpu_ptr() micro-optimization. * Drop cpu check in libbpf's validate_map_op(). * Enhance bpf_map_check_op_flags() using allowed flags instead of 'extra_flags_mask'. v3 -> v4: * Address comments from Andrii: * Remove unnecessary map_type check in bpf_map_value_size(). * Reduce code churn. * Remove unnecessary do_delete check in __htab_map_lookup_and_delete_batch(). * Introduce bpf_percpu_copy_to_user() and bpf_percpu_copy_from_user(). * Rename check_map_flags() to bpf_map_check_op_flags() with extra_flags_mask. * Add human-readable pr_warn() explanations in validate_map_op(). * Use flags in bpf_map__delete_elem() and bpf_map__lookup_and_delete_elem(). * Drop "for alignment reasons". 
v3 link: https://lore.kernel.org/bpf/20250821160817.70285-1-leon.hwang@linux.dev/ v2 -> v3: * Address comments from Alexei: * Use BPF_F_ALL_CPUS instead of BPF_ALL_CPUS magic. * Introduce these two cpu flags for all percpu maps. * Address comments from Jiri: * Reduce some unnecessary u32 cast. * Refactor more generic map flags check function. * A code style issue. v2 link: https://lore.kernel.org/bpf/20250805163017.17015-1-leon.hwang@linux.dev/ v1 -> v2: * Address comments from Andrii: * Embed cpu info as high 32 bits of flags totally. * Use ERANGE instead of E2BIG. * Few format issues. ==================== Link: https://patch.msgid.link/20260107022022.12843-1-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:33 -08:00
Leon Hwang	07bf7aa58e	selftests/bpf: Add cases to test BPF_F_CPU and BPF_F_ALL_CPUS flags Add test coverage for the new BPF_F_CPU and BPF_F_ALL_CPUS flags support in percpu maps. The following APIs are exercised: * bpf_map_update_batch() * bpf_map_lookup_batch() * bpf_map_update_elem() * bpf_map__update_elem() * bpf_map_lookup_elem_flags() * bpf_map__lookup_elem() For lru_percpu_hash map, set max_entries to 'libbpf_num_possible_cpus() + 1' and only use the first 'libbpf_num_possible_cpus()' entries. This ensures a spare entry is always available in the LRU free list, avoiding eviction. When updating an existing key in lru_percpu_hash map: 1. l_new = prealloc_lru_pop(); /* Borrow from free list */ 2. l_old = lookup_elem_raw(); /* Found, key exists */ 3. pcpu_copy_value(); /* In-place update */ 4. bpf_lru_push_free(); /* Return l_new to free list */ Also add negative tests to verify that non-percpu array and hash maps reject the BPF_F_CPU and BPF_F_ALL_CPUS flags. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-8-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Leon Hwang	2546863b4a	libbpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu maps Add libbpf support for the BPF_F_CPU flag for percpu maps by embedding the cpu info into the high 32 bits of: 1. flags: bpf_map_lookup_elem_flags(), bpf_map__lookup_elem(), bpf_map_update_elem() and bpf_map__update_elem() 2. opts->elem_flags: bpf_map_lookup_batch() and bpf_map_update_batch() And the flag can be BPF_F_ALL_CPUS, but cannot be 'BPF_F_CPU \| BPF_F_ALL_CPUS'. Behavior: * If the flag is BPF_F_ALL_CPUS, the update is applied across all CPUs. * If the flag is BPF_F_CPU, it updates value only to the specified CPU. * If the flag is BPF_F_CPU, lookup value only from the specified CPU. * lookup does not support BPF_F_ALL_CPUS. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-7-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
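The "cpu info in the high 32 bits of flags" encoding can be sketched in a few lines. This is an illustration under stated assumptions: the `SKETCH_F_*` values and the helper names are hypothetical stand-ins, and the authoritative flag definitions live in the kernel UAPI header rather than here.

```c
#include <stdint.h>

/* Assumed flag bit values for this sketch only. */
#define SKETCH_F_CPU      (1ULL << 3)
#define SKETCH_F_ALL_CPUS (1ULL << 4)

/* Build a flags word that targets one CPU: cpu in the high 32 bits. */
static uint64_t flags_for_cpu(uint32_t cpu)
{
	return ((uint64_t)cpu << 32) | SKETCH_F_CPU;
}

/* Recover the target CPU from a flags word. */
static uint32_t cpu_from_flags(uint64_t flags)
{
	return (uint32_t)(flags >> 32);
}

/* The two flags are mutually exclusive, per the commit message. */
static int flags_valid(uint64_t flags)
{
	uint64_t both = SKETCH_F_CPU | SKETCH_F_ALL_CPUS;

	return (flags & both) != both;
}
```

A caller would pass a word like `flags_for_cpu(3)` as the flags argument of an update or lookup to address only CPU 3, while the all-CPUs flag alone applies a single value everywhere on update.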
Leon Hwang	47c79f05aa	bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_cgroup_storage maps Introduce BPF_F_ALL_CPUS flag support for percpu_cgroup_storage maps to allow updating values for all CPUs with a single value for update_elem API. Introduce BPF_F_CPU flag support for percpu_cgroup_storage maps to allow: * update value for specified CPU for update_elem API. * lookup value for specified CPU for lookup_elem API. The BPF_F_CPU flag is passed via map_flags along with embedded cpu info. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-6-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Leon Hwang	8526397c3c	bpf: Copy map value using copy_map_value_long for percpu_cgroup_storage maps Copy map value using 'copy_map_value_long()'. It's to keep consistent style with the way of other percpu maps. No functional change intended. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-5-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Leon Hwang	c6936161fd	bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_hash and lru_percpu_hash maps Introduce BPF_F_ALL_CPUS flag support for percpu_hash and lru_percpu_hash maps to allow updating values for all CPUs with a single value for both update_elem and update_batch APIs. Introduce BPF_F_CPU flag support for percpu_hash and lru_percpu_hash maps to allow: * update value for specified CPU for both update_elem and update_batch APIs. * lookup value for specified CPU for both lookup_elem and lookup_batch APIs. The BPF_F_CPU flag is passed via: * map_flags along with embedded cpu info. * elem_flags along with embedded cpu info. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-4-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Leon Hwang	8eb76cb03f	bpf: Add BPF_F_CPU and BPF_F_ALL_CPUS flags support for percpu_array maps Introduce support for the BPF_F_ALL_CPUS flag in percpu_array maps to allow updating values for all CPUs with a single value for both update_elem and update_batch APIs. Introduce support for the BPF_F_CPU flag in percpu_array maps to allow: * update value for specified CPU for both update_elem and update_batch APIs. * lookup value for specified CPU for both lookup_elem and lookup_batch APIs. The BPF_F_CPU flag is passed via: * map_flags of lookup_elem and update_elem APIs along with embedded cpu info. * elem_flags of lookup_batch and update_batch APIs along with embedded cpu info. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-3-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Leon Hwang	2b421662c7	bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags and check them for following APIs: * 'map_lookup_elem()' * 'map_update_elem()' * 'generic_map_lookup_batch()' * 'generic_map_update_batch()' And, get the correct value size for these APIs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-2-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Alexei Starovoitov	a8d5067592	Merge branch 'bpf-verifier-allow-calling-arena-functions-when-holding-bpf-lock' Emil Tsalapatis says: ==================== bpf/verifier: Allow calling arena functions when holding BPF lock BPF arena-related kfuncs now cannot sleep, so they are safe to call while holding a spinlock. However, the verifier still rejects programs that do so. Update the verifier to allow arena kfunc calls while holding a lock. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Changes v1->v2: (https://lore.kernel.org/r/20260106-arena-under-lock-v1-0-6ca9c121d826@etsalapatis.com) - Added patch to account for active locks in_sleepable_context() (AI) ==================== Link: https://patch.msgid.link/20260106-arena-under-lock-v2-0-378e9eab3066@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 17:44:20 -08:00
Emil Tsalapatis	b81d5e9d96	selftests/bpf: add tests for arena kfuncs under lock Add selftests to ensure the verifier permits calling the arena kfunc API while holding a lock. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260106-arena-under-lock-v2-3-378e9eab3066@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 17:44:13 -08:00
Emil Tsalapatis	39f77533b6	bpf: Allow calls to arena functions while holding spinlocks The bpf_arena_*_pages() kfuncs can be called from sleepable contexts, but the verifier still prevents BPF programs from calling them while holding a spinlock. Amend the verifier to allow for BPF programs calling arena page management functions while holding a lock. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260106-arena-under-lock-v2-2-378e9eab3066@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 17:44:00 -08:00
Emil Tsalapatis	b25b48c7d3	bpf: Check active lock count in in_sleepable_context() The in_sleepable_context() function is used to specialize the BPF code in do_misc_fixups(). With the addition of nonsleepable arena kfuncs, there are kfuncs whose specialization depends on whether we are holding a lock. We should use the nonsleepable version while holding a lock and the sleepable one when not. Add a check for active_locks to account for locking when specializing arena kfuncs. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260106-arena-under-lock-v2-1-378e9eab3066@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 17:43:19 -08:00
Roman Gushchin	ea180ffbd2	mm: drop mem_cgroup_usage() declaration from memcontrol.h mem_cgroup_usage() is not used outside of memcg-v1 code, the declaration was added by a mistake. Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://lore.kernel.org/r/20260106042313.140256-1-roman.gushchin@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-05 21:02:05 -08:00
Puranjay Mohan	a069190b59	bpf: Replace __opt annotation with __nullable for kfuncs The __opt annotation was originally introduced specifically for buffer/size argument pairs in bpf_dynptr_slice() and bpf_dynptr_slice_rdwr(), allowing the buffer pointer to be NULL while still validating the size as a constant. The __nullable annotation serves the same purpose but is more general and is already used throughout the BPF subsystem for raw tracepoints, struct_ops, and other kfuncs. This patch unifies the two annotations by replacing __opt with __nullable. The key change is in the verifier's get_kfunc_ptr_arg_type() function, where mem/size pair detection is now performed before the nullable check. This ensures that buffer/size pairs are correctly classified as KF_ARG_PTR_TO_MEM_SIZE even when the buffer is nullable, while adding an !arg_mem_size condition to the nullable check prevents interference with mem/size pair handling. When processing KF_ARG_PTR_TO_MEM_SIZE arguments, the verifier now uses is_kfunc_arg_nullable() instead of the removed is_kfunc_arg_optional() to determine whether to skip size validation for NULL buffers. This is the first documentation added for the __nullable annotation, which has been in use since it was introduced but was previously undocumented. No functional changes to verifier behavior - nullable buffer/size pairs continue to work exactly as before. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102221513.1961781-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 15:51:34 -08:00
Alexei Starovoitov	7694ff8f6c	Merge branch 'memcg-accounting-for-bpf-arena' Puranjay Mohan says: ==================== memcg accounting for BPF arena v4: https://lore.kernel.org/all/20260102181333.3033679-1-puranjay@kernel.org/ Changes in v4->v5: - Remove unused variables from bpf_map_alloc_pages() (CI) v3: https://lore.kernel.org/all/20260102151852.570285-1-puranjay@kernel.org/ Changes in v3->v4: - Do memcg set/recover in arena_reserve_pages() rather than bpf_arena_reserve_pages() for symmetry with other kfuncs (Alexei) v2: https://lore.kernel.org/all/20251231141434.3416822-1-puranjay@kernel.org/ Changes in v2->v3: - Remove memcg accounting from bpf_map_alloc_pages() as the caller does it already. (Alexei) - Do memcg set/recover in arena_alloc/free_pages() rather than bpf_arena_alloc/free_pages(), it reduces copy pasting in sleepable/non_sleepable functions. v1: https://lore.kernel.org/all/20251230153006.1347742-1-puranjay@kernel.org/ Changes in v1->v2: - Return both pointers through arguments from bpf_map_memcg_enter and make it return void. (Alexei) - Add memcg accounting in arena_free_worker (AI) This set adds memcg accounting logic into arena kfuncs and other places that do allocations in arena.c. ==================== Link: https://patch.msgid.link/20260102200230.25168-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 14:32:00 -08:00
Puranjay Mohan	e66fe1bc6d	bpf: arena: Reintroduce memcg accounting When arena allocations were converted from bpf_map_alloc_pages() to kmalloc_nolock() to support non-sleepable contexts, memcg accounting was inadvertently lost. This commit restores proper memory accounting for all arena-related allocations. All arena related allocations are accounted into memcg of the process that created bpf_arena. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102200230.25168-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 14:31:59 -08:00
Puranjay Mohan	817593af7b	bpf: syscall: Introduce memcg enter/exit helpers Introduce bpf_map_memcg_enter() and bpf_map_memcg_exit() helpers to reduce code duplication in memcg context management. bpf_map_memcg_enter() gets the memcg from the map, sets it as active, and returns both the previous and the now active memcg. bpf_map_memcg_exit() restores the previous active memcg and releases the reference obtained during enter. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102200230.25168-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 14:31:59 -08:00
Alexei Starovoitov	e40030a46a	Merge branch 'bpf-make-kf_trusted_args-default' Puranjay Mohan says: ==================== bpf: Make KF_TRUSTED_ARGS default v2: https://lore.kernel.org/all/20251231171118.1174007-1-puranjay@kernel.org/ Changes in v2->v3: - Fix documentation: add a new section for kfunc parameters (Eduard) - Remove all occurrences of KF_TRUSTED from comments, etc. (Eduard) - Fix the netfilter kfuncs to drop dead NULL checks. - Fix selftest for netfilter kfuncs to check for verification failures and remove the runtime failures that are not possible after this change v1: https://lore.kernel.org/all/20251224192448.3176531-1-puranjay@kernel.org/ Changes in v1->v2: - Update kfunc_dynptr_param selftest to use a real pointer that is not ptr_to_stack and not CONST_PTR_TO_DYNPTR rather than casting 1 (Alexei) - Thoroughly review all kfuncs in the to find regressions or missing annotations. (Eduard) - Fix kfuncs found from the above step. This series makes trusted arguments the default requirement for all BPF kfuncs, inverting the current opt-in model. Instead of requiring explicit KF_TRUSTED_ARGS flags, kfuncs now require trusted arguments by default and must explicitly opt-out using __nullable/__opt annotations or the KF_RCU flag. This improves security and type safety by preventing BPF programs from passing untrusted or NULL pointers to kernel functions at verification time, while maintaining flexibility for the small number of kfuncs that legitimately need to accept NULL or RCU pointers. MOTIVATION The current opt-in model is error-prone and inconsistent. Most kfuncs already require trusted pointers from sources like KF_ACQUIRE, struct_ops callbacks, or tracepoints. 
Making trusted arguments the default: - Prevents NULL pointer dereferences at verification time - Reduces defensive NULL checks in kernel code - Provides better error messages for invalid BPF programs - Aligns with existing patterns (context pointers, struct_ops already trusted) IMPACT ANALYSIS Comprehensive analysis of all 304+ kfuncs across 37 kernel files found: - Most kfuncs (299/304) are already safe and require no changes - Only 4 kfuncs required fixes (all included in this series) - 0 regressions found in independent verification All bpf selftests are passing. The hid_bpf tests are also passing: # PASSED: 20 / 20 tests passed. # Totals: pass:20 fail:0 xfail:0 xpass:0 skip:0 error:0 bpf programs in drivers/hid/bpf/progs/ show no regression as shown by veristat: Done. Processed 24 files, 62 programs. Skipped 0 files, 0 programs. TECHNICAL DETAILS The verifier now validates kfunc arguments in this order: 1. NULL check (runs first): Rejects NULL unless parameter has __nullable/__opt 2. Trusted check: Rejects untrusted pointers unless kfunc has KF_RCU Special cases that bypass trusted checking: - Context pointers (xdp_md, __sk_buff): Handled via KF_ARG_PTR_TO_CTX - Struct_ops callbacks: Pre-marked as PTR_TRUSTED during initialization - KF_RCU kfuncs: Have separate validation path for RCU pointers BACKWARD COMPATIBILITY This affects BPF program verification, not runtime: - Valid programs passing trusted pointers: Continue to work - Programs with bugs: May now fail verification (preventing runtime crashes) This series introduces two intentional breaking changes to the BPF verifier's kfunc handling: 1. NULL pointer rejection timing: Kfuncs that previously accepted NULL pointers without KF_TRUSTED_ARGS will now reject NULL at verification time instead of returning runtime errors. 
This affects netfilter connection tracking functions (bpf_xdp_ct_lookup, bpf_skb_ct_lookup, bpf_xdp_ct_alloc, bpf_skb_ct_alloc), which now enforce their documented "Cannot be NULL" requirements at load time rather than returning -EINVAL at runtime. 2. Fentry/fexit program restrictions: BPF programs using fentry/fexit attachment points can no longer pass their callback arguments directly to kfuncs, as these arguments are not marked as trusted by default. Programs requiring trusted argument semantics should migrate to tp_btf (tracepoint with BTF) attachment points where arguments are guaranteed trusted by the verifier. Both changes strengthen the verifier's safety guarantees by catching errors earlier in the development cycle and are accompanied by comprehensive test updates demonstrating the new expected behaviors. ==================== Link: https://patch.msgid.link/20260102180038.2708325-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:30 -08:00
Puranjay Mohan	cf503eb2c6	selftests: bpf: Fix test_bpf_nf for trusted args becoming default With trusted args now being the default, passing NULL to kfunc parameters that are pointers causes verifier rejection rather than a runtime error. The test_bpf_nf test was failing because it attempted to pass NULL to bpf_xdp_ct_lookup() to verify runtime error handling. Since the NULL check now happens at verification time, remove the runtime test case that passed NULL to the bpf_tuple parameter and instead add verification-time tests to ensure the verifier correctly rejects programs that pass NULL to trusted arguments. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-11-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	cf82580c86	selftests: bpf: fix cgroup_hierarchical_stats The cgroup_hierarchical_stats selftest uses an fentry program attached to cgroup_attach_task and then passes the received &dst_cgrp->self to the css_rstat_updated() kfunc. The verifier now assumes that all kfuncs only take trusted pointer arguments, and pointers received by fentry are not marked trusted by default. Use a tp_btf program in place of fentry for this test; pointers received by tp_btf programs are marked trusted by the verifier. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-10-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	230b0118e4	selftests: bpf: fix test_kfunc_dynptr_param As the verifier now assumes that all kfuncs only take trusted pointer arguments, passing 0 (NULL) to a kfunc that doesn't mark the argument as __nullable or __opt will be rejected with a failure message of: Possibly NULL pointer passed to trusted arg<n> Pass a non-null value to the kfunc to test the expected failure mode. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-9-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	03cc77b10e	selftests: bpf: Update failure message for rbtree_fail The rbtree_api_use_unchecked_remove_retval() selftest passes a pointer received from bpf_rbtree_remove() to bpf_rbtree_add() without checking for NULL, this was earlier caught by __check_ptr_off_reg() in the verifier. Now the verifier assumes every kfunc only takes trusted pointer arguments, so it catches this NULL pointer earlier in the path and provides a more accurate failure message. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-8-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	df5004579b	selftests: bpf: Update kfunc_param_nullable test for new error message With trusted args now being the default, the NULL pointer check runs before type-specific validation. Update test3 to expect the new error message "Possibly NULL pointer passed to trusted arg0" instead of the old dynptr-specific error message. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-7-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	8fe172fa30	HID: bpf: drop dead NULL checks in kfuncs As KF_TRUSTED_ARGS is now considered default for all kfuncs, the verifier will not allow passing NULL pointers to these kfuncs. These checks for NULL pointers can therefore be removed. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-6-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	cd1d609491	bpf: xfrm: drop dead NULL check in bpf_xdp_get_xfrm_state() As KF_TRUSTED_ARGS is now considered the default for all kfuncs, the opts parameter in bpf_xdp_get_xfrm_state() can never be NULL. Verifier will detect this at load time and will not allow passing NULL to this function. This matches the documentation above the kfunc that says this parameter (opts) Cannot be NULL. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	bddaf9adda	bpf: net: netfilter: drop dead NULL checks bpf_xdp_ct_lookup() and bpf_skb_ct_lookup() receive bpf_tuple and opts parameters that are expected to be not NULL for real usages (see doc string above functions). They return an error if NULL is passed for opts or tuple. The verifier will now reject programs that pass NULL to these parameters, so the kfuncs can assume that these are always valid pointers; drop the NULL checks for these parameters. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:28 -08:00
Puranjay Mohan	7646c7afd9	bpf: Remove redundant KF_TRUSTED_ARGS flag from all kfuncs Now that KF_TRUSTED_ARGS is the default for all kfuncs, remove the explicit KF_TRUSTED_ARGS flag from all kfunc definitions and remove the flag itself. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:28 -08:00


1412167 Commits