Introduce SKIP_LLVM, SKIP_LIBBFD, and SKIP_CRYPTO build flags that let
users build bpftool without these optional dependencies.
SKIP_LLVM=1 skips LLVM even when detected. SKIP_LIBBFD=1 prevents the
libbfd JIT disassembly fallback when LLVM is absent. Together, they
produce a bpftool with no disassembly support.
SKIP_CRYPTO=1 excludes sign.c and removes the -lcrypto link dependency.
Inline stubs in main.h return errors with a clear message if signing
functions are called at runtime.
Use BPFTOOL_WITHOUT_CRYPTO (not HAVE_LIBCRYPTO_SUPPORT) as the C
define, following the BPFTOOL_WITHOUT_SKELETONS naming convention for
bpftool-internal build config, leaving HAVE_LIBCRYPTO_SUPPORT free for
proper feature detection in the future.
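The inline-stub pattern described above can be sketched as follows. This is an illustrative standalone model, not the actual bpftool code: the function name `sign_program`, the error message, and the explicit `#define` (which simulates a SKIP_CRYPTO=1 build) are assumptions for the sketch.

```c
#include <stdio.h>
#include <errno.h>

#define BPFTOOL_WITHOUT_CRYPTO 1	/* simulate a SKIP_CRYPTO=1 build for this sketch */

#ifdef BPFTOOL_WITHOUT_CRYPTO
/* Build excluded the crypto code: a static inline stub replaces the
 * real function and fails with a clear message at runtime. */
static inline int sign_program(const char *path)
{
	(void)path;
	fprintf(stderr, "program signing requires bpftool built with libcrypto\n");
	return -EOPNOTSUPP;
}
#else
int sign_program(const char *path);	/* real implementation lives in sign.c */
#endif
```

Callers need no `#ifdef` of their own; they simply get an error return when signing is unavailable.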
All three flags are propagated through the selftests Makefile to bpftool
sub-builds.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260312-b4-bpftool_build-v2-1-4c9d57133644@meta.com
Emil Tsalapatis says:
====================
bpf: Relax 8 frame limitation for global subprogs
The BPF verifier currently limits the maximum runtime call stack to
8 frames. Larger BPF programs like sched-ext schedulers routinely
fail verification because they exceed this limit, even as they use
very little actual stack space for each frame.
Relax the verifier to permit call stacks > 8 frames deep when the
call stacks include global subprogs. The old 8 stack frame limit now
only applies to call stacks composed entirely of static function calls.
This works because global functions are each verified in isolation, so
the verifier does not need to cross-reference verification state across
the function call boundary, which has been the reason for limiting the
call stack size in the first place.
This patch does not change the verification-time limit of 8 stack
frames. Static functions that are inlined for verification purposes
still only go 8 frames deep to avoid changing the verifier's internal
data structures used for verification. These data structures only
support holding information on up to 8 stack frames.
This patch also does not adjust the actual maximum stack size of 512.
CHANGELOG
=========
v5 -> v6 (https://lore.kernel.org/bpf/20260311182831.91219-1-emil@etsalapatis.com/)
- Make bpf_subprog_call_depth_info internal to verifier.c (Alexei)
v4 -> v5 (https://lore.kernel.org/bpf/20260309204430.201219-1-emil@etsalapatis.com/)
- Move depth tracking state to verifier (Eduard) and free it after verification (Alexei)
- Fix selftest patch title and formatting errors (Yonghong)
v3 -> v4 (https://lore.kernel.org/bpf/20260303043106.406099-1-emil@etsalapatis.com/)
- Factor out temp call depth tracking info into its own struct (Eduard)
- Bring depth calculation loop in line with the other instances (Mykyta)
- Add comment on why selftest call stack is 16 bytes/frame (Eduard)
- Rename "cidx" to "caller" for clarity (Mykyta, Eduard)
v2 -> v3 (https://lore.kernel.org/bpf/20260210213606.475415-1-emil@etsalapatis.com/)
- Change logic to remove arbitrary limit on call depth (Eduard)
- Add additional selftests (Eduard)
v1 -> v2 (https://lore.kernel.org/bpf/20260202233716.835638-1-emil@etsalapatis.com)
- Adjust patch to only increase the runtime stack depth, leaving the
verification-time stack depth unchanged (Alexei)
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
====================
Link: https://patch.msgid.link/20260316161225.128011-1-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The BPF verifier currently enforces a call stack depth of 8 frames,
regardless of the actual stack space consumption of those frames. The
limit is necessary for static call stacks, because the bookkeeping data
structures used by the verifier when stepping into static functions
during verification only support 8 stack frames. However, this
limitation only matters for static stack frames: Global subprogs are
verified by themselves and do not require limiting the call depth.
Relax this limitation to only apply to static stack frames. Verification
now only fails when there is a sequence of 8 calls to non-global
subprogs. Calling into a global subprog resets the counter. This allows
deeper call stacks, provided all frames still fit in the stack.
The change does not increase the maximum size of the call stack, only
the maximum number of frames we can place in it.
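The counting rule above can be modeled in a few lines. This is an illustrative sketch of the idea, not the verifier's actual code: walking a call chain, the 8-frame limit applies only to consecutive static frames, and a call into a global subprog (verified in isolation) resets the counter.

```c
#include <stdbool.h>
#include <assert.h>

#define MAX_CALL_FRAMES 8

/* Toy model: return true if a call chain is acceptable under the
 * relaxed rule. frame_is_global[i] marks frames belonging to global
 * subprogs. */
static bool call_chain_ok(const bool *frame_is_global, int nframes)
{
	int static_run = 0;

	for (int i = 0; i < nframes; i++) {
		if (frame_is_global[i]) {
			static_run = 0;	/* global subprog: verified on its own */
			continue;
		}
		if (++static_run > MAX_CALL_FRAMES)
			return false;	/* too many consecutive static frames */
	}
	return true;
}
```

A 12-frame chain of static calls still fails, while the same depth passes once a global subprog appears in the middle.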
Also change the progs/test_global_func3.c selftest to use static
functions, since with the new patch it would otherwise unexpectedly
pass verification.
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260316161225.128011-2-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Paul Chaignon says:
====================
Optimize bounds refinement by reordering deductions
This patchset optimizes the bounds refinement (reg_bounds_sync) by
reordering deductions in __reg_deduce_bounds. This reordering allows us
to improve precision slightly while removing one call to
__reg_deduce_bounds.
The first patch from Eduard refactors the __reg_deduce_bounds
subfunctions, the second patch implements the reordering, and the last
one adds a selftest.
Changes in v3:
- Added first commit from Eduard that significantly helps with
readability of second commit.
- Reshuffled a bit more the functions in the second commit to improve
precision (Eduard).
- Rebased.
Changes in v2:
- Updated description to mention potential precision improvement and
to clarify the sequence of refinements (Shung-Hsi).
- Added the second patch.
- Rebased.
====================
Link: https://patch.msgid.link/cover.1773401138.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This new selftest demonstrates the improvement of bounds refinement from
the previous patch. It is inspired by a set of reg_bounds_sync inputs
generated using CBMC [1] by Shung-Hsi:
reg.smin_value=0x8000000000000002
reg.smax_value=2
reg.umin_value=2
reg.umax_value=19
reg.s32_min_value=2
reg.s32_max_value=3
reg.u32_min_value=2
reg.u32_max_value=3
reg_bounds_sync returns R=[2; 3] without the previous patch, and R=2
with it. __reg64_deduce_bounds is able to derive that u64=2, but before
the previous patch, those bounds are overwritten in
__reg_deduce_mixed_bounds using the 32-bit bounds.
To arrive at these reg_bounds_sync inputs, we bound the 32-bit value
first to [2; 3]. We can then upper-bound s64 without impacting u64. At
that point, the refinement to u64=2 doesn't happen because the ranges
still overlap in two points:
0 umin=2 umax=0xff..ff00..03 U64_MAX
| [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] |
|----------------------------|------------------------------|
|xx] [xxxxxxxxxxxxxxxxxxxxxxxxxxxx|
0 smax=2 smin=0x800..02 -1
With an upper-bound check at value 19, we can reach the above inputs for
reg_bounds_sync. At that point, the refinement to u64=2 happens and
because it isn't overwritten by __reg_deduce_mixed_bounds anymore,
reg_bounds_sync returns with reg=2.
The test validates this result by including an illegal instruction in
the (dead) branch reg != 2.
Link: https://github.com/shunghsiyu/reg_bounds_sync-review/ [1]
Co-developed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/622dc51c581cd4d652fff362188b2a5f73c1fe99.1773401138.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In commit 5dbb19b16a ("bpf: Add third round of bounds deduction"), I
added a new round of bounds deduction because two rounds were not enough
to converge to a fixed point. This commit slightly refactors the bounds
deduction logic such that two rounds are enough.
In [1], Eduard noticed that after we improved the refinement logic, a
third call to the bounds deduction (__reg_deduce_bounds) was needed to
converge to a fixed point. More specifically, we needed this third call
to improve the s64 range using the s32 range. We added the third call
and postponed a more detailed analysis of the refinement logic.
I've been looking into this more recently. The register refinement
consists of the following calls.
__update_reg_bounds();
3 x __reg_deduce_bounds() {
deduce_bounds_32_from_64();
deduce_bounds_32_from_32();
deduce_bounds_64_from_64();
deduce_bounds_64_from_32();
};
__reg_bound_offset();
__update_reg_bounds();
From this, we can observe that we first improve the 32bit ranges from
the 64bit ranges in deduce_bounds_32_from_64, then improve the 64bit
ranges on their own in deduce_bounds_64_from_64. Intuitively, if we
were to improve the 64bit ranges on their own *before* we use them to
improve the 32bit ranges, we may reach a fixed point earlier.
In a similar manner, using CBMC, Eduard found that it's best to improve
the 32bit ranges on their own *after* we've improved them using the 64bit
ranges. That is, running deduce_bounds_32_from_32 after
deduce_bounds_32_from_64.
These changes allow us to drop one call to __reg_deduce_bounds. Without
this reordering, the test "verifier_bounds/bounds deduction cross sign
boundary, negative overlap" fails when removing one call to
__reg_deduce_bounds. In some cases, this change can even improve
precision a little bit, as illustrated in the new selftest in the next
patch.
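The effect of the ordering can be shown with a toy model, much simpler than the verifier's real reg_bounds_sync: one deduction tightens the unsigned 64-bit bound from the signed one, another derives the 32-bit bound from the 64-bit one. Running the 64-from-64 step first reaches the tight result in a single pass. All names and rules here are illustrative, not the kernel's.

```c
#include <stdint.h>
#include <assert.h>

struct toy_bounds {
	int64_t  smin64, smax64;
	uint64_t umax64;
	uint32_t umax32;
};

/* If the signed range is entirely non-negative, it also upper-bounds
 * the unsigned 64-bit range. */
static void deduce_64_from_64(struct toy_bounds *b)
{
	if (b->smin64 >= 0 && (uint64_t)b->smax64 < b->umax64)
		b->umax64 = (uint64_t)b->smax64;
}

/* If the 64-bit value fits in 32 bits, its upper bound carries over
 * to the 32-bit range. */
static void deduce_32_from_64(struct toy_bounds *b)
{
	if (b->umax64 <= UINT32_MAX && (uint32_t)b->umax64 < b->umax32)
		b->umax32 = (uint32_t)b->umax64;
}
```

With smin64=0, smax64=5, umax64=100: deriving the 32-bit bound first leaves umax32=100 after one pass, whereas tightening the 64-bit bounds first yields umax32=5 immediately.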
As expected, this change didn't have any impact on the number of
instructions processed when running it through the Cilium complexity
test suite [2].
Link: https://lore.kernel.org/bpf/aIKtSK9LjQXB8FLY@mail.gmail.com/ [1]
Link: https://pchaigno.github.io/test-verifier-complexity.html [2]
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Co-developed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/1b00d2749ec4c774c3ada84e265ac7fda72cfe56.1773401138.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
On powerpc, immediate load instructions are sign extended. In case
of unsigned types, arguments should be explicitly zero-extended by
the caller. For kfunc call, this needs to be handled in the JIT code.
In bpf_kfunc_call_test4(), that tests for sign-extension of signed
argument types in kfunc calls, add some additional failure checks.
And add bpf_kfunc_call_test5() to test zero-extension of unsigned
argument types in kfunc calls.
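The underlying hazard can be illustrated in plain C, independent of the powerpc JIT itself: for the same 32-bit pattern, a sign-extending load and an explicit zero extension produce different 64-bit values, which is why unsigned arguments must be zero-extended before the kfunc call.

```c
#include <stdint.h>
#include <assert.h>

/* What a sign-extending immediate load yields for a 32-bit value. */
static uint64_t sign_extended(uint32_t v)
{
	return (uint64_t)(int64_t)(int32_t)v;
}

/* What an unsigned argument actually requires. */
static uint64_t zero_extended(uint32_t v)
{
	return (uint64_t)v;
}
```

For values with the top bit set the two differ in all upper 32 bits; for small positives they agree, which is why such bugs are easy to miss without targeted tests.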
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260312080113.843408-1-hbathini@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Building selftests with
clang 23.0.0 (6fae863eba8a72cdd82f37e7111a46a70be525e0) triggers
the following error:
tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c:117:12:
error: assigning to 'char *' from 'const char *' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
The variable `tgt_name` is declared as `char *`, but it stores the
result of strstr(prog_name[i], "/"). Since `prog_name[i]` is a
`const char *`, the returned pointer should also be treated as
const-qualified.
Update `tgt_name` to `const char *` to match the type of the underlying
string and silence the compiler warning.
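The corrected pattern looks like the following sketch; the helper name `prog_target` is hypothetical, standing in for the test's actual code around the strstr() call.

```c
#include <string.h>
#include <assert.h>

/* Return the part of a SEC name after the '/', e.g. the target
 * function in "fexit/bpf_fentry_test1". The result of strstr() on a
 * const string is stored in a const-qualified pointer (was: char *). */
static const char *prog_target(const char *prog_name)
{
	const char *tgt_name = strstr(prog_name, "/");

	return tgt_name ? tgt_name + 1 : prog_name;
}
```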
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Menglong Dong <menglong.dong@linux.dev>
Link: https://lore.kernel.org/bpf/20260305222132.470700-1-varunrmallya@gmail.com
livepatch_trampoline relies on livepatch sysfs and livepatch-sample.ko.
When CONFIG_LIVEPATCH is disabled or the samples module isn't built, the
test fails with ENOENT and causes false failures in minimal CI configs.
Skip the test when livepatch sysfs or the sample module is unavailable.
Also avoid writing to livepatch sysfs when it's not present.
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260309104448.817401-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Patch 1/2 added PID filtering to the probe_user BPF program to avoid
cross-test interference from the global connect() hooks.
With the interference removed, drop the serial_ prefix and remove the
stale TODO comment so the test can run in parallel.
Tested:
./test_progs -t probe_user -v
./test_progs -j$(nproc) -t probe_user
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306083330.518627-2-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The test installs a kprobe on __sys_connect and checks that
bpf_probe_write_user() can modify the syscall argument. However, any
concurrent thread in any other test that calls connect() will also
trigger the kprobe and have its sockaddr silently overwritten, causing
flaky failures in unrelated tests.
Constrain the hook to the current test process by filtering on a PID
stored as a global variable in .bss. Initialize the .bss value from
user space before bpf_object__load() using bpf_map__set_initial_value(),
and validate the bss map value size to catch layout mismatches.
No new map is introduced and the test keeps the existing non-skeleton
flow.
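The filtering idea reduces to an early-exit check, sketched here in plain C (the real change stores the PID in the BPF program's .bss and seeds it from user space with bpf_map__set_initial_value() before load; the names below are illustrative).

```c
#include <stdbool.h>
#include <assert.h>

static int filter_pid;	/* stands in for the global in the program's .bss */

/* The kprobe handler bails out for every process except the test's
 * own, so concurrent connect() calls from other tests are untouched. */
static bool should_handle(int current_pid)
{
	return current_pid == filter_pid;
}
```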
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306083330.518627-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The module_attach test contains subtests which check that unloading a
module while there are BPF programs attached to its functions is not
possible because the module is still referenced.
The problem is that the test calls the generic unload_module() helper
function which is used for module cleanup after test_progs terminate and
tries to wait until all module references are released. This
unnecessarily slows down the module_attach subtests since each
unsuccessful call to unload_module() takes about 1 second.
Introduce try_unload_module() which takes the number of retries as a
parameter. Make unload_module() call it with the currently used amount
of 10000 retries but call it with just 1 retry from module_attach tests
as it is always expected to fail. This speeds up the module_attach()
test significantly.
Before:
# time ./test_progs -t module_attach
[...]
Summary: 1/14 PASSED, 0 SKIPPED, 0 FAILED
real 0m5.011s
user 0m0.293s
sys 0m0.108s
After:
# time ./test_progs -t module_attach
[...]
Summary: 1/14 PASSED, 0 SKIPPED, 0 FAILED
real 0m0.350s
user 0m0.197s
sys 0m0.063s
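The shape of the change can be sketched as follows. The function bodies are illustrative stand-ins (try_once() here simulates an "is the module gone yet" check that always fails, as expected in the module_attach case); only the overall split between the two entry points mirrors the commit.

```c
#include <stdbool.h>
#include <assert.h>

static int attempts_made;

static bool try_once(void)
{
	attempts_made++;
	return false;	/* module still referenced: always fails here */
}

/* New helper: number of retries is a parameter. */
static int try_unload_module(int retries)
{
	for (int i = 0; i < retries; i++)
		if (try_once())
			return 0;
	return -1;	/* still loaded after all retries */
}

/* Old entry point keeps its behavior for generic cleanup. */
static int unload_module(void)
{
	return try_unload_module(10000);
}
```

module_attach calls try_unload_module(1) directly, so the expected failure costs one attempt instead of ten thousand.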
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306101628.3822284-1-vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently BPF selftests will fail to compile if CONFIG_SMC
is not set.
Use BPF CO-RE to work around the case where CONFIG_SMC is
not set; use ___local variants of relevant structures and
utilize bpf_core_field_exists() for net->smc.
The test continues to pass where
CONFIG_SMC=y
CONFIG_SMC_HS_CTRL_BPF=y
but these changes allow the selftests to build in the absence
of CONFIG_SMC=y.
Also ensure that we get a pure skip rather than a skip+fail
by removing the "SMC is unsupported" part from the ASSERT_FALSE()
in get_smc_nl_family(); doing this means we get a skip without
a fail when CONFIG_SMC is not set:
$ sudo ./test_progs -t bpf_smc
Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED
Fixes: beb3c67297 ("bpf/selftests: Add selftest for bpf_smc_hs_ctrl")
Reported-by: Colm Harrington <colm.harrington@oracle.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Tested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://patch.msgid.link/20260310111330.601765-1-alan.maguire@oracle.com
Hui Zhu says:
====================
Fix test_cgroup_iter_memcg issues found during back-porting
While back-porting "mm: bpf kfuncs to access memcg data", I
encountered issues with test_cgroup_iter_memcg, specifically
in test_kmem.
The test_cgroup_iter_memcg test would falsely pass when
bpf_mem_cgroup_page_state() failed due to incompatible enum
values across kernel versions. Additionally, test_kmem would
fail on systems with cgroup.memory=nokmem enabled.
These patches are my fixes for the problems I encountered.
Changelog:
v5:
According to the comments of Emil Tsalapatis and JP Kobryn, dropped
"selftests/bpf: Check bpf_mem_cgroup_page_state return value".
v4:
Fixed wrong git commit log in "bpf: Use bpf_core_enum_value for stats in
cgroup_iter_memcg".
v3:
According to the comments of JP Kobryn, remove kmem subtest from
cgroup_iter_memcg and fix assertion string in test_pgfault.
v2:
According to the comments of JP Kobryn, added bpf_core_enum_value()
usage in the BPF program to handle cross-kernel enum value differences
at load-time instead of compile-time.
Dropped the mm/memcontrol.c patch.
Modified test_kmem handling: instead of skipping when nokmem is set,
verify that kmem value is zero as expected.
According to the comments of bot, fixed assertion message: changed
"bpf_mem_cgroup_page_state" to "bpf_mem_cgroup_vm_events" for PGFAULT
check.
====================
Link: https://patch.msgid.link/cover.1772505399.git.zhuhui@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Replace hardcoded enum values with bpf_core_enum_value() calls in
cgroup_iter_memcg test to improve portability across different
kernel versions.
The change adds runtime enum value resolution for:
- node_stat_item: NR_ANON_MAPPED, NR_SHMEM, NR_FILE_PAGES,
NR_FILE_MAPPED
- vm_event_item: PGFAULT
This ensures the BPF program can adapt to enum value changes
between kernel versions.
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: JP Kobryn <jp.kobryn@linux.dev>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Link: https://lore.kernel.org/r/ca6eb1a1a4fd7a17ffe995acf52c9a4ceb7bac13.1772505399.git.zhuhui@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When cgroup.memory=nokmem is set in the kernel command line, kmem
accounting is disabled. This causes the test_kmem subtest in
cgroup_iter_memcg to fail because it expects non-zero kmem values.
Remove the kmem subtest altogether since the remaining subtests
(shmem, file, pgfault) already provide sufficient coverage for
the cgroup iter memcg functionality.
Reviewed-by: JP Kobryn <jp.kobryn@linux.dev>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Link: https://lore.kernel.org/r/35fa32a019361ec26265c8a789ee31e448d4dbda.1772505399.git.zhuhui@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cupertino Miranda says:
====================
bpf: support for non_null ptr detection with JEQ/JNE with register operand
Changes from v1:
- Corrected typos in commit messages.
- Fixed indentation.
- Replaced text by simpler version suggested by Eduard.
Changes from v2:
- Small fixes after AI patch checker complaints.
Changes from v3:
- Removed log file. No idea how that got added.
====================
Link: https://patch.msgid.link/20260304195018.181396-1-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Viktor Malik says:
====================
Always allow sleepable and fmod_ret programs on syscalls
Both sleepable and fmod_ret programs are only allowed on selected
functions. For convenience, the error injection list was originally
used.
When error injection is disabled, that list is empty and sleepable
tracing programs, as well as fmod_ret programs, are effectively
unavailable.
This patch series addresses the issue by at least enabling sleepable and
fmod_ret programs on syscalls, if error injection is disabled. More
details on why syscalls are used can be found in [1].
[1] https://lore.kernel.org/bpf/CAADnVQK6qP8izg+k9yV0vdcT-+=axtFQ2fKw7D-2Ei-V6WS5Dw@mail.gmail.com/
Changes in v3:
- Handle LoongArch (Leon)
- Add Kumar's and Leon's acks
Changes in v2:
- Check "sys_" prefix instead of "sys" for powerpc syscalls (AI review)
- Add link to the original discussion (Kumar)
- Add explanation why arch syscall prefixes are hard-coded (Leon)
====================
Link: https://patch.msgid.link/cover.1773055375.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that sleepable programs are always enabled on syscalls, let
refcounted_kptr tests use syscalls rather than bpf_testmod_test_read,
which is not sleepable with error injection disabled.
The tests just check that the verifier can handle usage of RCU locks in
sleepable programs and never actually attach. So, the attachment target
doesn't matter (as long as it is sleepable) and with syscalls, the tests
pass on kernels with disabled error injection.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/8b6626eae384559855f7a0e846a16e83f25f06f6.1773055375.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
fmod_ret BPF programs can only be attached to selected functions. For
convenience, the error injection list was originally used (along with
functions prefixed with "security_"), which contains syscalls and
several other functions.
When error injection is disabled (CONFIG_FUNCTION_ERROR_INJECTION=n),
that list is empty and fmod_ret programs are effectively unavailable for
most functions. In such a case, at least enable fmod_ret programs
on syscalls.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/472310f9a5f4944ad03214e4d943a4830fd8eb76.1773055375.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Sleepable BPF programs can only be attached to selected functions. For
convenience, the error injection list was originally used, which
contains syscalls and several other functions.
When error injection is disabled (CONFIG_FUNCTION_ERROR_INJECTION=n),
that list is empty and sleepable tracing programs are effectively
unavailable. In such a case, at least enable sleepable programs on
syscalls. For discussion why syscalls were chosen, see [1].
To detect that a function is a syscall handler, we check for
arch-specific prefixes for the most common architectures. Unfortunately,
the prefixes are hard-coded in arch syscall code so we need to hard-code
them, too.
[1] https://lore.kernel.org/bpf/CAADnVQK6qP8izg+k9yV0vdcT-+=axtFQ2fKw7D-2Ei-V6WS5Dw@mail.gmail.com/
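The detection described above amounts to a prefix comparison, sketched here; the prefix list is illustrative (the kernel's actual, hard-coded list lives in the arch syscall wrapper code and covers more architectures).

```c
#include <string.h>
#include <stdbool.h>
#include <assert.h>

static bool looks_like_syscall(const char *name)
{
	static const char *prefixes[] = {
		"__x64_sys_", "__ia32_sys_",	/* x86 */
		"__arm64_sys_",			/* arm64 */
		"sys_",				/* generic / powerpc-style */
	};

	for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++)
		if (strncmp(name, prefixes[i], strlen(prefixes[i])) == 0)
			return true;
	return false;
}
```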
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/2704a8512746655037e3c02b471b31bd0d76c8db.1773055375.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull EFI fix from Ard Biesheuvel:
"Fix for the x86 EFI workaround keeping boot services code and data
regions reserved until after SetVirtualAddressMap() completes:
deferred struct page initialization may result in some of this memory
being lost permanently"
* tag 'efi-fixes-for-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efi: defer freeing of boot services memory
Pull i2c fix from Wolfram Sang:
"A revert for the i801 driver restoring old locking behaviour"
* tag 'i2c-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i801: Revert "i2c: i801: replace acpi_lock with I2C bus lock"
Pull x86 fixes from Ingo Molnar:
- Fix SEV guest boot failures in certain circumstances, due to
very early code relying on a BSS-zeroed variable that isn't
actually zeroed yet and may contain non-zero bootup values
Move the variable into the .data section to gain even earlier
zeroing
- Expose & allow the IBPB-on-Entry feature on SNP guests, which
was not properly exposed to guests due to initial implementational
caution
- Fix O= build failure when CONFIG_EFI_SBAT_FILE is using relative
file paths
- Fix the various SNC (Sub-NUMA Clustering) topology enumeration
bugs/artifacts (sched-domain build errors mostly).
SNC enumeration data got more complicated with Granite Rapids X
(GNR) and Clearwater Forest X (CWF), which exposed these bugs
and made their effects more serious
- Also use the now sane(r) SNC code to fix resctrl SNC detection bugs
- Work around a historic libgcc unwinder bug in the vdso32 sigreturn
code (again), which regressed during an overly aggressive recent
cleanup of DWARF annotations
* tag 'x86-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/vdso32: Work around libgcc unwinder bug
x86/resctrl: Fix SNC detection
x86/topo: Fix SNC topology mess
x86/topo: Replace x86_has_numa_in_package
x86/topo: Add topology_num_nodes_per_package()
x86/numa: Store extra copy of numa_nodes_parsed
x86/boot: Handle relative CONFIG_EFI_SBAT_FILE file paths
x86/sev: Allow IBPB-on-Entry feature for SNP guests
x86/boot/sev: Move SEV decompressor variables into the .data section
Pull timer fix from Ingo Molnar:
"Make clock_adjtime() syscall timex validation slightly more permissive
for auxiliary clocks, to not reject syscalls based on the status field
that do not try to modify the status field.
This makes the ABI behavior in clock_adjtime() consistent with
CLOCK_REALTIME"
* tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Fix timex status validation for auxiliary clocks
Pull scheduler fix from Ingo Molnar:
"Fix a DL scheduler bug that may corrupt internal metrics during PI and
setscheduler() syscalls, resulting in kernel warnings and misbehavior.
Found during stress-testing"
* tag 'sched-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/deadline: Fix missing ENQUEUE_REPLENISH during PI de-boosting
Pull SCSI fixes from James Bottomley:
"Two core changes and the rest in drivers, one core change to quirk the
behaviour of the Iomega Zip drive and one to fix a hang caused by tag
reallocation problems, which has mostly been seen by the iscsi client.
Note the latter fixes the problem but still has a slight sysfs memory
leak, so will be amended in the next pull request (once we've run the
fix for the fix through our testing)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: Fix recursive locking in __configfs_open_file()
scsi: devinfo: Add BLIST_SKIP_IO_HINTS for Iomega ZIP
scsi: mpi3mr: Clear reset history on ready and recheck state after timeout
scsi: core: Fix refcount leak for tagset_refcnt
Pull fbdev fix from Helge Deller:
"Silence build error in au1100fb driver found by kernel test robot"
* tag 'fbdev-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: au1100fb: Fix build on MIPS64
Pull parisc fixes from Helge Deller:
"While testing Sasha Levin's 'kallsyms: embed source file:line info in
kernel stack traces' patch series, which increases the typical kernel
image size, I found some issues with the parisc initial kernel mapping
which may prevent the kernel from booting.
The three small patches here fix this"
* tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix initial page table creation for boot
parisc: Check kernel mapping earlier at bootup
parisc: Increase initial mapping to 64 MB with KALLSYMS
Pull RCU selftest fixes from Boqun Feng:
"Fix a regression in RCU torture test pre-defined scenarios caused by
commit 7dadeaa6e8 ("sched: Further restrict the preemption modes")
which limits PREEMPT_NONE to architectures that do not support
preemption at all and PREEMPT_VOLUNTARY to those architectures that do
not yet have PREEMPT_LAZY support.
Since major architectures (e.g. x86 and arm64) no longer support
CONFIG_PREEMPT_NONE and CONFIG_PREEMPT_VOLUNTARY, using them in
rcutorture, rcuscale, refscale, and scftorture pre-defined scenarios
causes config checking errors.
Switch these kconfigs to PREEMPT_LAZY"
* tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
scftorture: Update due to x86 not supporting none/voluntary preemption
refscale: Update due to x86 not supporting none/voluntary preemption
rcuscale: Update due to x86 not supporting none/voluntary preemption
rcutorture: Update due to x86 not supporting none/voluntary preemption
Pull tracing fixes from Steven Rostedt:
- Fix possible NULL pointer dereference in trace_data_alloc()
On the trace_data_alloc() error path, it can call trigger_data_free()
with a NULL pointer. This used to be a kfree() but was changed to
trigger_data_free() to clean up any partial initialization. The issue
is that trigger_data_free() does not expect a NULL pointer. Have
trigger_data_free() return safely on NULL pointer.
- Fix multiple events on the command line and bootconfig
If multiple events are enabled on the command line separately and not
grouped, only the last event gets enabled. That is:
trace_event=sched_switch trace_event=sched_waking
will only enable sched_waking whereas:
trace_event=sched_switch,sched_waking
will enable both.
The bootconfig makes it even worse as the second way is the more
common method.
The issue is that a temporary buffer is used to store the events to
enable later in boot. Each time the cmdline callback is called, it
overwrites what was previously there.
Have the callback append the next value (delimited by a comma) if the
temporary buffer already has content.
- Fix command line trace_buffer_size if >= 2G
The logic to allocate the trace buffer uses "int" for the size
parameter in the command line code, causing overflow issues if more
than 2G is specified.
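The append fix for the trace_event= case can be sketched in a few lines; the buffer size and names here are illustrative, not the tracing code's actual identifiers.

```c
#include <string.h>
#include <assert.h>

#define EVT_BUF_SIZE 256
static char evt_buf[EVT_BUF_SIZE];

/* Instead of overwriting the temporary buffer on each trace_event=
 * callback, append the new value after a comma when the buffer
 * already has content. */
static void trace_event_callback(const char *value)
{
	if (evt_buf[0] != '\0')
		strncat(evt_buf, ",", EVT_BUF_SIZE - strlen(evt_buf) - 1);
	strncat(evt_buf, value, EVT_BUF_SIZE - strlen(evt_buf) - 1);
}
```

With this, `trace_event=sched_switch trace_event=sched_waking` accumulates the same list as the comma-separated form.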
* tag 'trace-v7.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix trace_buf_size= cmdline parameter with sizes >= 2G
tracing: Fix enabling multiple events on the kernel command line and bootconfig
tracing: Add NULL pointer check to trigger_data_free()
Pull hwmon fixes from Guenter Roeck:
- Fix initialization commands for AHT20
- Correct a malformed email address (emc1403)
- Check the it87_lock() return value
- Fix inverted polarity (max6639)
- Fix overflows, underflows, sign extension, and other problems in
macsmc
- Fix stack overflow in debugfs read (pmbus/q54sj108a2)
- Drop support for SMARC-sAM67 (discontinued and never released to
market)
* tag 'hwmon-for-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus/q54sj108a2) fix stack overflow in debugfs read
hwmon: (max6639) fix inverted polarity
dt-bindings: hwmon: sl28cpld: Drop sa67mcu compatible
hwmon: (it87) Check the it87_lock() return value
Revert "hwmon: add SMARC-sAM67 support"
hwmon: (aht10) Fix initialization commands for AHT20
hwmon: (emc1403) correct a malformed email address
hwmon: (macsmc) Fix overflows, underflows, and sign extension
hwmon: (macsmc) Fix regressions in Apple Silicon SMC hwmon driver
Pull driver core fix from Danilo Krummrich:
- Revert "driver core: enforce device_lock for driver_match_device()":
When a device is already present in the system and a driver is
registered on the same bus, we iterate over all devices registered on
this bus to see if one of them matches. If we come across an already
bound one where the corresponding driver crashed while holding the
device lock (e.g. in probe()) we can't make any progress anymore.
Thus, revert and clarify that an implementer of struct bus_type must
not expect match() to be called with the device lock held.
* tag 'driver-core-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
Revert "driver core: enforce device_lock for driver_match_device()"
Pull xen fixes from Juergen Gross:
- a cleanup of arch/x86/kernel/head_64.S removing the pre-built page
tables for Xen guests
- a small comment update
- another cleanup for Xen PVH guests mode
- fix an issue with Xen PV-devices backed by driver domains
* tag 'for-linus-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: better handle backend crash
xenbus: add xenbus_device parameter to xenbus_read_driver_state()
x86/PVH: Use boot params to pass RSDP address in start_info page
x86/xen: update outdated comment
xen/acpi-processor: fix _CST detection using undersized evaluation buffer
x86/xen: Build identity mapping page tables dynamically for XENPV
Eduard Zingerman says:
====================
bpf: Fix precision backtracking bug with linked registers
Emil Tsalapatis reported a verifier bug hit by the scx_lavd sched_ext
scheduler. The essential part of the verifier log looks as follows:
436: ...
// checkpoint hit for 438: (1d) if r7 == r8 goto ...
frame 3: propagating r2,r7,r8
frame 2: propagating r6
mark_precise: frame3: last_idx ...
mark_precise: frame3: regs=r2,r7,r8 stack= before 436: ...
mark_precise: frame3: regs=r2,r7 stack= before 435: ...
mark_precise: frame3: regs=r2,r7 stack= before 434: (85) call bpf_trace_vprintk#177
verifier bug: backtracking call unexpected regs 84
The log complains that registers r2 and r7 are tracked as precise
while processing the bpf_trace_vprintk() call in precision backtracking.
This can't be right, as r2 is reset by the call and there is nothing
to backtrack it to. The precision propagation is triggered when
a checkpoint is hit at instruction 438; r2 is dead at that instruction.
This happens because of the following sequence of events:
- Instruction 438 is first reached with registers r2 and r7 having
the same id via a path that does not call bpf_trace_vprintk():
- Checkpoint is created at 438.
- The jump at 438 is predicted, hence r7 and registers linked to it
(r2) are propagated as precise, marking r2 and r7 precise in the
checkpoint.
- Instruction 438 is reached a second time with r2 undefined and via
a path that calls bpf_trace_vprintk():
- Checkpoint is hit.
- propagate_precision() picks registers r2 and r7 and propagates
precision marks for those up to the helper call.
The root cause is the fact that states_equal() and
propagate_precision() assume that the precision flag can't be set for a
dead register (as computed by compute_live_registers()).
However, this is not the case when linked registers are at play.
Fix this by accounting for live register flags in
collect_linked_regs().
---
====================
Link: https://patch.msgid.link/20260306-linked-regs-and-propagate-precision-v1-0-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a test for the scenario described in the previous commit:
an iterator loop with two paths where one ties r2/r7 via
a shared scalar id and skips a call, while the other goes
through the call. Precision marks from the linked registers
get spuriously propagated to the call path via
propagate_precision(), hitting "backtracking call unexpected
regs" in backtrack_insn().
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-2-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fix an inconsistency between func_states_equal() and
collect_linked_regs():
- regsafe() uses check_ids() to verify that cached and current states
have identical register id mapping.
- func_states_equal() calls regsafe() only for registers computed as
live by compute_live_registers().
- clean_live_states() is supposed to remove dead registers from cached
states, but it can skip states belonging to an iterator-based loop.
- collect_linked_regs() collects all registers sharing the same id,
ignoring the marks computed by compute_live_registers().
Linked registers are stored in the state's jump history.
- backtrack_insn() marks all linked registers for an instruction
as precise whenever one of the linked registers is precise.
The above might lead to a scenario:
- There is an instruction I with register rY known to be dead at I.
- Instruction I is reached via two paths: first A, then B.
- On path A:
- There is an id link between registers rX and rY.
- Checkpoint C is created at I.
- Linked register set {rX, rY} is saved to the jump history.
- rX is marked as precise at I, causing both rX and rY
to be marked precise at C.
- On path B:
- There is no id link between registers rX and rY; apart from
that, the register states are sub-states of those in C.
- Because rY is dead at I, check_ids() returns true.
- Current state is considered equal to checkpoint C,
propagate_precision() propagates spurious precision
mark for register rY along the path B.
- Depending on the program, this might hit verifier_bug()
in backtrack_insn(), e.g. if rY ∈ [r1..r5]
and backtrack_insn() spots a function call.
The reproducer program is in the next patch.
This was hit by sched_ext scx_lavd scheduler code.
Changes in tests:
- verifier_scalar_ids.c selftests need modification to preserve
some registers as live for __msg() checks.
- exceptions_assert.c adjusted to match changes in the verifier log:
R0 is dead after the conditional instruction and thus does not get
a range.
- precise.c adjusted to match changes in the verifier log: register r9
is dead after the comparison and its range is not important for the test.
Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Fixes: 0fb3cf6110 ("bpf: use register liveness information for func_states_equal")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-1-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull Kbuild fixes from Nathan Chancellor:
- Split out .modinfo section from ELF_DETAILS macro, as that macro may
be used in other areas that expect to discard .modinfo, breaking
certain image layouts
- Adjust genksyms parser to handle optional attributes in certain
declarations, necessary after commit 07919126ec ("netfilter:
annotate NAT helper hook pointers with __rcu")
- Include resolve_btfids in external module build created by
scripts/package/install-extmod-build when it may be run on external
modules
- Avoid removing objtool binary with 'make clean', as it is required
for external module builds
* tag 'kbuild-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kbuild: Leave objtool binary around with 'make clean'
kbuild: install-extmod-build: Package resolve_btfids if necessary
genksyms: Fix parsing a declarator with a preceding attribute
kbuild: Split .modinfo out from ELF_DETAILS