linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 14:51:51 -04:00

Author	SHA1	Message	Date
Alexei Starovoitov	596bef1d71	bpf: Support 32-bit scalar spills in stacksafe() v1->v2: updated comments v1: https://lore.kernel.org/bpf/20260322225124.14005-1-alexei.starovoitov@gmail.com/ The commit `6efbde200b` ("bpf: Handle scalar spill vs all MISC in stacksafe()") in stacksafe() only recognized full 64-bit scalar spills when comparing stack states for equivalence during state pruning and missed 32-bit scalar spill. When 32-bit scalar is spilled the check_stack_write_fixed_off() -> save_register_state() calls mark_stack_slot_misc() for slot[0-3], which preserves STACK_INVALID and STACK_ZERO (on a fresh stack slot[0-3] remain STACK_INVALID), sets slot[4-7] = STACK_SPILL, and updates spilled_ptr. The im=4 path is only reached when im=0 fails: The loop at im=0 already attempts the 64-bit scalar-spill/all-MISC check. If it matches, i advances by 7, skipping the entire 8-byte slot. So im=4 is only reached when bytes 0-3 are neither a scalar spill nor all-MISC — they must pass individual byte-by-byte comparison first. Then bytes 4-7 get the scalar-unit treatment. is_spilled_scalar_after(stack, 4): slot_type[4] == STACK_SPILL from a 64-bit spill would have been caught at im=0 (unless it's a pointer spill, in which case spilled_ptr.type != SCALAR_VALUE -> returns false at im=4 too). A partial overwrite of a 64-bit spill invalidates the entire slot in check_stack_write_fixed_off(). is_stack_misc_after(stack, 4): Only checks bytes 4-7 are MISC/INVALID, returns &unbound_reg. Comparing two unbound regs via regsafe() is safe. Changes to cilium programs: File Program Insns (A) Insns (B) Insns (DIFF) _______________ _________________________________ _________ _________ ________________ bpf_host.o cil_host_policy 49351 45811 -3540 (-7.17%) bpf_host.o cil_to_host 2384 2270 -114 (-4.78%) bpf_host.o cil_to_netdev 112051 100269 -11782 (-10.51%) bpf_host.o tail_handle_ipv4_cont_from_host 61175 60910 -265 (-0.43%) bpf_host.o tail_handle_ipv4_cont_from_netdev 9381 8873 -508 (-5.42%) bpf_host.o tail_handle_ipv4_from_host 12994 7066 -5928 (-45.62%) bpf_host.o tail_handle_ipv4_from_netdev 85015 59875 -25140 (-29.57%) bpf_host.o tail_handle_ipv6_cont_from_host 24732 23527 -1205 (-4.87%) bpf_host.o tail_handle_ipv6_cont_from_netdev 9463 8953 -510 (-5.39%) bpf_host.o tail_handle_ipv6_from_host 12477 11787 -690 (-5.53%) bpf_host.o tail_handle_ipv6_from_netdev 30814 30017 -797 (-2.59%) bpf_host.o tail_handle_nat_fwd_ipv4 8943 8860 -83 (-0.93%) bpf_host.o tail_handle_snat_fwd_ipv4 64716 61625 -3091 (-4.78%) bpf_host.o tail_handle_snat_fwd_ipv6 48299 30797 -17502 (-36.24%) bpf_host.o tail_ipv4_host_policy_ingress 21591 20017 -1574 (-7.29%) bpf_host.o tail_ipv6_host_policy_ingress 21177 20693 -484 (-2.29%) bpf_host.o tail_nodeport_nat_egress_ipv4 16588 16543 -45 (-0.27%) bpf_host.o tail_nodeport_nat_ingress_ipv4 39200 36116 -3084 (-7.87%) bpf_host.o tail_nodeport_nat_ingress_ipv6 50102 48003 -2099 (-4.19%) bpf_lxc.o tail_handle_ipv4_cont 113092 96891 -16201 (-14.33%) bpf_lxc.o tail_handle_ipv6 6727 6701 -26 (-0.39%) bpf_lxc.o tail_handle_ipv6_cont 25567 21805 -3762 (-14.71%) bpf_lxc.o tail_ipv4_ct_egress 28843 15970 -12873 (-44.63%) bpf_lxc.o tail_ipv4_ct_ingress 16691 10213 -6478 (-38.81%) bpf_lxc.o tail_ipv4_ct_ingress_policy_only 16691 10213 -6478 (-38.81%) bpf_lxc.o tail_ipv4_policy 6776 6622 -154 (-2.27%) bpf_lxc.o tail_ipv4_to_endpoint 7523 7219 -304 (-4.04%) bpf_lxc.o tail_ipv6_ct_egress 10275 9999 -276 (-2.69%) bpf_lxc.o tail_ipv6_ct_ingress 6466 6438 -28 (-0.43%) bpf_lxc.o tail_ipv6_ct_ingress_policy_only 6466 6438 -28 (-0.43%) bpf_lxc.o tail_ipv6_policy 6859 5159 -1700 (-24.78%) bpf_lxc.o tail_ipv6_to_endpoint 7039 4427 -2612 (-37.11%) bpf_lxc.o tail_nodeport_ipv6_dsr 1175 1033 -142 (-12.09%) bpf_lxc.o tail_nodeport_nat_egress_ipv4 16318 16292 -26 (-0.16%) bpf_lxc.o tail_nodeport_nat_ingress_ipv4 18907 18490 -417 (-2.21%) bpf_lxc.o tail_nodeport_nat_ingress_ipv6 14624 14556 -68 (-0.46%) bpf_lxc.o tail_nodeport_rev_dnat_ipv4 4776 4588 -188 (-3.94%) bpf_overlay.o tail_handle_inter_cluster_revsnat 15733 15498 -235 (-1.49%) bpf_overlay.o tail_handle_ipv4 124682 105717 -18965 (-15.21%) bpf_overlay.o tail_handle_ipv6 16201 15801 -400 (-2.47%) bpf_overlay.o tail_handle_snat_fwd_ipv4 21280 19323 -1957 (-9.20%) bpf_overlay.o tail_handle_snat_fwd_ipv6 20824 20822 -2 (-0.01%) bpf_overlay.o tail_nodeport_ipv6_dsr 1175 1033 -142 (-12.09%) bpf_overlay.o tail_nodeport_nat_egress_ipv4 16293 16267 -26 (-0.16%) bpf_overlay.o tail_nodeport_nat_ingress_ipv4 20841 20737 -104 (-0.50%) bpf_overlay.o tail_nodeport_nat_ingress_ipv6 14678 14629 -49 (-0.33%) bpf_sock.o cil_sock4_connect 1678 1623 -55 (-3.28%) bpf_sock.o cil_sock4_sendmsg 1791 1736 -55 (-3.07%) bpf_sock.o cil_sock6_connect 3641 3600 -41 (-1.13%) bpf_sock.o cil_sock6_recvmsg 2048 1899 -149 (-7.28%) bpf_sock.o cil_sock6_sendmsg 3755 3721 -34 (-0.91%) bpf_wireguard.o tail_handle_ipv4 31180 27484 -3696 (-11.85%) bpf_wireguard.o tail_handle_ipv6 12095 11760 -335 (-2.77%) bpf_wireguard.o tail_nodeport_ipv6_dsr 1232 1094 -138 (-11.20%) bpf_wireguard.o tail_nodeport_nat_egress_ipv4 16071 16061 -10 (-0.06%) bpf_wireguard.o tail_nodeport_nat_ingress_ipv4 20804 20565 -239 (-1.15%) bpf_wireguard.o tail_nodeport_nat_ingress_ipv6 13490 12224 -1266 (-9.38%) bpf_xdp.o tail_lb_ipv4 49695 42673 -7022 (-14.13%) bpf_xdp.o tail_lb_ipv6 122683 87896 -34787 (-28.36%) bpf_xdp.o tail_nodeport_ipv6_dsr 1833 1862 +29 (+1.58%) bpf_xdp.o tail_nodeport_nat_egress_ipv4 6999 6990 -9 (-0.13%) bpf_xdp.o tail_nodeport_nat_ingress_ipv4 28903 28780 -123 (-0.43%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 200361 197771 -2590 (-1.29%) bpf_xdp.o tail_nodeport_rev_dnat_ipv4 4606 4454 -152 (-3.30%) Changes to sched-ext: File Program Insns (A) Insns (B) Insns (DIFF) _________________________ ________________ _________ _________ _______________ scx_arena_selftests.bpf.o arena_selftest 236305 236251 -54 (-0.02%) scx_chaos.bpf.o chaos_dispatch 12282 8013 -4269 (-34.76%) scx_chaos.bpf.o chaos_enqueue 11398 7126 -4272 (-37.48%) scx_chaos.bpf.o chaos_init 3854 3828 -26 (-0.67%) scx_flash.bpf.o flash_init 1015 979 -36 (-3.55%) scx_flatcg.bpf.o fcg_dispatch 1143 1100 -43 (-3.76%) scx_lavd.bpf.o lavd_enqueue 35487 35472 -15 (-0.04%) scx_lavd.bpf.o lavd_init 21127 21107 -20 (-0.09%) scx_p2dq.bpf.o p2dq_enqueue 10210 7854 -2356 (-23.08%) scx_p2dq.bpf.o p2dq_init 3233 3207 -26 (-0.80%) scx_qmap.bpf.o qmap_init 20285 20230 -55 (-0.27%) scx_rusty.bpf.o rusty_select_cpu 1165 1148 -17 (-1.46%) scxtop.bpf.o on_sched_switch 2369 2355 -14 (-0.59%) Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260323022410.75444-1-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 11:59:52 -07:00
Kumar Kartikeya Dwivedi	02bcf8ef26	bpf: Update MAINTAINERS file for general BPF entry Per discussion with Alexei, add Eduard and myself as maintainers under BPF [GENERAL]. While at it, drop R entries for reviewers who have been inactive. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20260324152230.2916217-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 09:05:11 -07:00
Varun R Mallya	b43d574c00	selftests/bpf: Add test for struct_ops __ref argument in any position Add a selftest to verify that the verifier correctly identifies refcounted arguments in struct_ops programs, even when they are not the first argument. This ensures that the restriction on tail calls for programs with __ref arguments is properly enforced regardless of which argument they appear in. This test verifies the fix for check_struct_ops_btf_id() proposed by Keisuke Nishimura [0], which corrected a bug where only the first argument was checked for the refcounted flag. The test includes: - An update to bpf_testmod to add 'test_refcounted_multi', an operator with three arguments where the third is tagged with "__ref". - A BPF program 'test_refcounted_multi' that attempts a tail call. - A test runner that asserts the verifier rejects the program with "program with __ref argument cannot tail call". [0]: https://lore.kernel.org/bpf/20260320130219.63711-1-keisuke.nishimura@inria.fr/ Signed-off-by: Varun R Mallya <varunrmallya@gmail.com> Link: https://lore.kernel.org/r/20260321214038.80479-1-varunrmallya@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:51:23 -07:00
Keisuke Nishimura	25e3e1f109	bpf: Fix refcount check in check_struct_ops_btf_id() The current implementation only checks whether the first argument is refcounted. Fix this by iterating over all arguments. Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr> Fixes: `38f1e66abd` ("bpf: Do not allow tail call in strcut_ops program with __ref argument") Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260320130219.63711-1-keisuke.nishimura@inria.fr Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:50:20 -07:00
Weixie Cui	ad2f7ed0ee	bpf: propagate kvmemdup_bpfptr errors from bpf_prog_verify_signature kvmemdup_bpfptr() returns -EFAULT when the user pointer cannot be copied, and -ENOMEM on allocation failure. The error path always returned -ENOMEM, misreporting bad addresses as out-of-memory. Return PTR_ERR(sig) so user space gets the correct errno. Signed-off-by: Weixie Cui <cuiweixie@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/tencent_C9C5B2B28413D6303D505CD02BFEA4708C07@qq.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:48:51 -07:00
Martin KaFai Lau	280de43e88	bpf: Remove ipv6_bpf_stub usage in test_run bpf_prog_test_run_skb() uses net->ipv6.ip6_null_entry for BPF_PROG_TYPE_LWT_XMIT test runs. It currently checks ipv6_bpf_stub before using ip6_null_entry. ipv6_bpf_stub will be removed by the CONFIG_IPV6=m support removal series posted at [1], so switch this check to ipv6_mod_enabled() instead. This change depends on that series [1]. Without it, CONFIG_IPV6=m is still possible, and net->ipv6.ip6_null_entry remains NULL until the IPv6 module is loaded. [1] https://lore.kernel.org/netdev/20260320185649.5411-1-fmancera@suse.de/ Cc: Jakub Kicinski <kuba@kernel.org> Cc: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://lore.kernel.org/r/20260323225250.1623542-1-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:47:33 -07:00
Amery Hung	03b7b389fe	selftests/bpf: Fix compiler warnings in task_local_data.h Fix compiler warnings about unused parameter, narrowing non-constant into a smaller type and comparison between integers of different size. Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260323231133.859941-1-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:46:54 -07:00
Hao Sun	833ef4a954	bpf: Simplify tnum_step() Simplify tnum_step() from a 10-variable algorithm into a straight line sequence of bitwise operations. Problem Reduction: tnum_step(): Given a tnum `(tval, tmask)` where `tval & tmask == 0`, and a value `z` with `tval ≤ z < (tval \| tmask)`, find the smallest `r > z`, a tnum-satisfying value, i.e., `r & ~tmask == tval`. Every tnum-satisfying value has the form tval \| s where s is a subset of tmask bits (s & ~tmask == 0). Since tval and tmask are disjoint: tval \| s = tval + s Similarly z = tval + d where d = z - tval, so r > z becomes: tval + s > tval + d s > d The problem reduces to: find the smallest s, a subset of tmask, such that s > d. Notice that `s` must be a subset of tmask, the problem now is simplified. Algorithm: The mask bits of `d` form a "counter" that we want to increment by one, but the counter has gaps at the fixed-bit positions. A normal +1 would stop at the first 0-bit it meets; we need it to skip over fixed-bit gaps and land on the next mask bit. Step 1 -- plug the gaps: d \| carry_mask \| ~tmask - ~tmask fills all fixed-bit positions with 1. - carry_mask = (1 << fls64(d & ~tmask)) - 1 fills all positions (including mask positions) below the highest non-mask bit of d. After this, the only remaining 0s are mask bits above the highest non-mask bit of d where d is also 0 -- exactly the positions where the carry can validly land. Step 2 -- increment: (d \| carry_mask \| ~tmask) + 1 Adding 1 flips all trailing 1s to 0 and sets the first 0 to 1. Since every gap has been plugged, that first 0 is guaranteed to be a mask bit above all non-mask bits of d. Step 3 -- mask: ((d \| carry_mask \| ~tmask) + 1) & tmask Strip the scaffolding, keeping only mask bits. Call the result inc. Step 4 -- result: tval \| inc Reattach the fixed bits. A simple 8-bit example: tmask: 1 1 0 1 0 1 1 0 d: 1 0 1 0 0 0 1 0 (d = 162) ^ non-mask 1 at bit 5 With carry_mask = 0b00111111 (smeared from bit 5): d\|carry\|~tm 1 0 1 1 1 1 1 1 + 1 1 1 0 0 0 0 0 0 & tmask 1 1 0 0 0 0 0 0 The patch passes my local test: test_verifier, test_progs for `-t verifier` and `-t reg_bounds`. CBMC shows the new code is equiv to original one[1], and a lean4 proof of correctness is available[2]: theorem tnumStep_correct (tval tmask z : BitVec 64) -- Precondition: valid tnum and input z (h_consistent : (tval &&& tmask) = 0) (h_lo : tval ≤ z) (h_hi : z < (tval \|\|\| tmask)) : -- Postcondition: r must be: -- (1) tnum member -- (2) z < r -- (3) for any other member w > z, r <= w let r := tnumStep tval tmask z satisfiesTnum64 r tval tmask ∧ tval ≤ r ∧ r ≤ (tval \|\|\| tmask) ∧ z < r ∧ ∀ w, satisfiesTnum64 w tval tmask → z < w → r ≤ w := by -- unfold definition unfold tnumStep satisfiesTnum64 simp only [] refine ⟨?_, ?_, ?_, ?_, ?_⟩ -- the solver proves each conjunct · bv_decide · bv_decide · bv_decide · bv_decide · intro w hw1 hw2; bv_decide [1] https://github.com/eddyz87/tnum-step-verif/blob/master/main.c [2] https://pastebin.com/raw/czHKiyY0 Signed-off-by: Hao Sun <hao.sun@inf.ethz.ch> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Reviewed-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com> Link: https://lore.kernel.org/r/20260320162336.166542-1-hao.sun@inf.ethz.ch Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:45:29 -07:00
Puranjay Mohan	4ea43a4355	bpftool: Enable aarch64 ISA extensions for JIT disassembly The LLVM disassembler needs ISA extension features enabled to correctly decode instructions from those extensions. On aarch64, without these features, instructions like LSE atomics (e.g. ldaddal) are silently decoded as incorrect instructions and disassembly is truncated. Use LLVMCreateDisasmCPUFeatures() with "+all" features for aarch64 targets so that the disassembler can handle any instruction the kernel JIT might emit. Before: int bench_trigger_uprobe(void * ctx): bpf_prog_538c6a43d1c6b84c_bench_trigger_uprobe: ; int cpu = bpf_get_smp_processor_id(); 0: mov x9, x30 4: nop 8: stp x29, x30, [sp, #-16]! c: mov x29, sp 10: stp xzr, x26, [sp, #-16]! 14: mov x26, sp 18: mrs x10, SP_EL0 1c: ldr w7, [x10, #16] ; __sync_add_and_fetch(&hits[cpu & CPU_MASK].value, 1); 20: and w7, w7, #0xff ; __sync_add_and_fetch(&hits[cpu & CPU_MASK].value, 1); 24: lsl x7, x7, #7 28: mov x0, #-281474976710656 2c: movk x0, #32768, lsl #32 30: movk x0, #35407, lsl #16 34: add x0, x0, x7 38: mov x1, #1 ; __sync_add_and_fetch(&hits[cpu & CPU_MASK].value, 1); 3c: mov x1, #1 After: int bench_trigger_uprobe(void * ctx): bpf_prog_538c6a43d1c6b84c_bench_trigger_uprobe: ; int cpu = bpf_get_smp_processor_id(); 0: mov x9, x30 4: nop 8: stp x29, x30, [sp, #-16]! c: mov x29, sp 10: stp xzr, x26, [sp, #-16]! 14: mov x26, sp 18: mrs x10, SP_EL0 1c: ldr w7, [x10, #16] ; __sync_add_and_fetch(&hits[cpu & CPU_MASK].value, 1); 20: and w7, w7, #0xff ; __sync_add_and_fetch(&hits[cpu & CPU_MASK].value, 1); 24: lsl x7, x7, #7 28: mov x0, #-281474976710656 2c: movk x0, #32768, lsl #32 30: movk x0, #35407, lsl #16 34: add x0, x0, x7 38: mov x1, #1 ; __sync_add_and_fetch(&hits[cpu & CPU_MASK].value, 1); 3c: ldaddal x1, x1, [x0] ; return 0; 40: mov w7, #0 44: ldp xzr, x26, [sp], #16 48: ldp x29, x30, [sp], #16 4c: mov x0, x7 50: ret 54: nop 58: ldr x10, #8 5c: br x10 Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Leon Hwang <leon.hwang@linux.dev> Acked-by: Quentin Monnet <qmo@kernel.org> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260318172259.2882792-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:44:29 -07:00
Carlos Llamas	9b0cf064ea	bpf: Switch CONFIG_CFI_CLANG to CONFIG_CFI This was renamed in commit `23ef9d4397` ("kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI") as it is now a compiler-agnostic option. Using the wrong name results in the code getting compiled out. Meaning the CFI failures for btf_dtor_kfunc_t would still trigger. Fixes: `99fde4d062` ("bpf, btf: Enforce destructor kfunc type with CFI") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260312183818.2721750-1-cmllamas@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:42:24 -07:00
Eric Biggers	a02327413a	bpf: Remove inclusions of crypto/sha1.h Since commit `603b441623` ("bpf: Update the bpf_prog_calc_tag to use SHA256") made BPF program tags use SHA-256 instead of SHA-1, the header <crypto/sha1.h> no longer needs to be included. Remove the relevant inclusions so that they no longer unnecessarily come up in searches for which kernel code is still using the obsolete SHA-1 algorithm. Since net/ipv6/addrconf.c was relying on the transitive inclusion of <crypto/sha1.h> (for an unrelated purpose) via <linux/filter.h>, make it include <crypto/sha1.h> explicitly in order to keep that file building. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260314214555.112386-1-ebiggers@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-24 08:40:45 -07:00
Varun R Mallya	bb6da652c5	selftests/bpf: Improve connect_force_port test reliability The connect_force_port test fails intermittently in CI because the hardcoded server ports (60123/60124) may already be in use by other tests or processes [1]. Fix this by passing port 0 to start_server(), letting the kernel assign a free port dynamically. The actual assigned port is then propagated to the BPF programs by writing it into the .bss map's initial value (via bpf_map__initial_value()) before loading, so the BPF programs use the correct backend port at runtime. [1] https://github.com/kernel-patches/bpf/actions/runs/22697676317/job/65808536038 Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Varun R Mallya <varunrmallya@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://patch.msgid.link/20260323081131.65604-1-varunrmallya@gmail.com	2026-03-23 11:37:34 -07:00
Alexei Starovoitov	bfec8e88ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.0-rc5 Cross-merge BPF and other fixes after downstream PR. Minor conflicts in: tools/testing/selftests/bpf/progs/exceptions_fail.c tools/testing/selftests/bpf/progs/verifier_bounds.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-22 19:33:29 -07:00
Linus Torvalds	c369299895	Linux 7.0-rc5 v7.0-rc5	2026-03-22 14:42:17 -07:00
Mikko Perttunen	ec69c9e883	i2c: tegra: Don't mark devices with pins as IRQ safe I2C devices with associated pinctrl states (DPAUX I2C controllers) will change pinctrl state during runtime PM. This requires taking a mutex, so these devices cannot be marked as IRQ safe. Add PINCTRL as dependency to avoid build errors. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reported-by: Russell King <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/all/E1vsNBv-00000009nfA-27ZK@rmk-PC.armlinux.org.uk/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-03-22 11:37:58 -07:00
Linus Torvalds	d5273fd3ca	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix how linked registers track zero extension of subregisters (Daniel Borkmann) - Fix unsound scalar fork for OR instructions (Daniel Wade) - Fix exception exit lock check for subprogs (Ihor Solodrai) - Fix undefined behavior in interpreter for SDIV/SMOD instructions (Jenny Guanni Qu) - Release module's BTF when module is unloaded (Kumar Kartikeya Dwivedi) - Fix constant blinding for PROBE_MEM32 instructions (Sachin Kumar) - Reset register ID for END instructions to prevent incorrect value tracking (Yazhou Tang) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add a test cases for sync_linked_regs regarding zext propagation bpf: Fix sync_linked_regs regarding BPF_ADD_CONST32 zext propagation selftests/bpf: Add tests for maybe_fork_scalars() OR vs AND handling bpf: Fix unsound scalar forking in maybe_fork_scalars() for BPF_OR selftests/bpf: Add tests for sdiv32/smod32 with INT_MIN dividend bpf: Fix undefined behavior in interpreter sdiv/smod for INT_MIN selftests/bpf: Add tests for bpf_throw lock leak from subprogs bpf: Fix exception exit lock checking for subprogs bpf: Release module BTF IDR before module unload selftests/bpf: Fix pkg-config call on static builds bpf: Fix constant blinding for PROBE_MEM32 stores selftests/bpf: Add test for BPF_END register ID reset bpf: Reset register ID for BPF_END value tracking	2026-03-22 11:16:06 -07:00
Linus Torvalds	ac57fa9faf	Merge tag 'trace-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Revert "tracing: Remove pid in task_rename tracing output" A change was made to remove the pid field from the task_rename event because it was thought that it was always done for the current task and recording the pid would be redundant. This turned out to be incorrect and there are a few corner case where this is not true and caused some regressions in tooling. - Fix the reading from user space for migration The reading of user space uses a seq lock type of logic where it uses a per-cpu temporary buffer and disables migration, then enables preemption, does the copy from user space, disables preemption, enables migration and checks if there was any schedule switches while preemption was enabled. If there was a context switch, then it is considered that the per-cpu buffer could be corrupted and it tries again. There's a protection check that tests if it takes a hundred tries, it issues a warning and exits out to prevent a live lock. This was triggered because the task was selected by the load balancer to be migrated to another CPU, every time preemption is enabled the migration task would schedule in try to migrate the task but can't because migration is disabled and let it run again. This caused the scheduler to schedule out the task every time it enabled preemption and made the loop never exit (until the 100 iteration test triggered). Fix this by enabling and disabling preemption and keeping migration enabled if the reading from user space needs to be done again. This will let the migration thread migrate the task and the copy from user space will likely pass on the next iteration. - Fix trace_marker copy option freeing The "copy_trace_marker" option allows a tracing instance to get a copy of a write to the trace_marker file of the top level instance. This is managed by a link list protected by RCU. When an instance is removed, a check is made if the option is set, and if so synchronized_rcu() is called. The problem is that an iteration is made to reset all the flags to what they were when the instance was created (to perform clean ups) was done before the check of the copy_trace_marker option and that option was cleared, so the synchronize_rcu() was never called. Move the clearing of all the flags after the check of copy_trace_marker to do synchronize_rcu() so that the option is still set if it was before and the synchronization is performed. - Fix entries setting when validating the persistent ring buffer When validating the persistent ring buffer on boot up, the number of events per sub-buffer is added to the sub-buffer meta page. The validator was updating cpu_buffer->head_page (the first sub-buffer of the per-cpu buffer) and not the "head_page" variable that was iterating the sub-buffers. This was causing the first sub-buffer to be assigned the entries for each sub-buffer and not the sub-buffer that was supposed to be updated. - Use "hash" value to update the direct callers When updating the ftrace direct callers, it assigned a temporary callback to all the callback functions of the ftrace ops and not just the functions represented by the passed in hash. This causes an unnecessary slow down of the functions of the ftrace_ops that is not being modified. Only update the functions that are going to be modified to call the ftrace loop function so that the update can be made on those functions. * tag 'trace-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Use hash argument for tmp_ops in update_ftrace_direct_mod ring-buffer: Fix to update per-subbuf entries of persistent ring buffer tracing: Fix trace_marker copy link list updates tracing: Fix failure to read user space from system call trace events tracing: Revert "tracing: Remove pid in task_rename tracing output"	2026-03-22 11:10:31 -07:00
Linus Torvalds	11ac4ce3f7	Merge tag 'i2c-for-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - fix broken I2C communication on Armada 3700 with recovery - fix device_node reference leak in probe (fsi) - fix NULL-deref when serial string is missing (cp2615) * tag 'i2c-for-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: pxa: defer reset on Armada 3700 when recovery is used i2c: fsi: Fix a potential leak in fsi_i2c_probe() i2c: cp2615: fix serial string NULL-deref at probe	2026-03-22 11:05:34 -07:00
Linus Torvalds	8d8bd2a5aa	Merge tag 'x86-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Improve Qemu MCE-injection behavior by only using AMD SMCA MSRs if the feature bit is set - Fix the relative path of gettimeofday.c inclusion in vclock_gettime.c - Fix a boot crash on UV clusters when a socket is marked as 'deconfigured' which are mapped to the SOCK_EMPTY node ID by the UV firmware, while Linux APIs expect NUMA_NO_NODE. The difference being (0xffff [unsigned short ~0]) vs [int -1] * tag 'x86-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Handle deconfigured sockets x86/entry/vdso: Fix path of included gettimeofday.c x86/mce/amd: Check SMCA feature bit before accessing SMCA MSRs	2026-03-22 10:54:12 -07:00
Linus Torvalds	ebfd9b7af2	Merge tag 'perf-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: - Fix a PMU driver crash on AMD EPYC systems, caused by a race condition in x86_pmu_enable() - Fix a possible counter-initialization bug in x86_pmu_enable() - Fix a counter inheritance bug in inherit_event() and __perf_event_read() - Fix an Intel PMU driver branch constraints handling bug found by UBSAN - Fix the Intel PMU driver's new Off-Module Response (OMR) support code for Diamond Rapids / Nova lake, to fix a snoop information parsing bug * tag 'perf-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix OMR snoop information parsing issues perf/x86/intel: Add missing branch counters constraint apply perf: Make sure to use pmu_ctx->pmu for groups x86/perf: Make sure to program the counter value for stopped events on migration perf/x86: Move event pointer setup earlier in x86_pmu_enable()	2026-03-22 10:31:51 -07:00
Linus Torvalds	dea622e183	Merge tag 'objtool-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "Fix three more livepatching related build environment bugs, and a false positive warning with Clang jump tables" * tag 'objtool-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix Clang jump table detection livepatch/klp-build: Fix inconsistent kernel version objtool/klp: fix mkstemp() failure with long paths objtool/klp: fix data alignment in __clone_symbol()	2026-03-22 10:17:50 -07:00
Linus Torvalds	d56d4a110f	Merge tag 'locking-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix a sparse build error regression in <linux/local_lock_internal.h> caused by the locking context-analysis changes" * tag 'locking-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: include/linux/local_lock_internal.h: Make this header file again compatible with sparse	2026-03-22 09:57:20 -07:00
Linus Torvalds	b5fddfad34	Merge tag 'irq-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fix a mailbox channel leak in the riscv-rpmi-sysmsi irqchip driver" * tag 'irq-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/riscv-rpmi-sysmsi: Fix mailbox channel leak in rpmi_sysmsi_probe()	2026-03-22 09:55:58 -07:00
Linus Torvalds	d723091c8c	Merge tag 'driver-core-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core fixes from Danilo Krummrich: - Generalize driver_override in the driver core, providing a common sysfs implementation and concurrency-safe accessors for bus implementations - Do not use driver_override as IRQ name in the hwmon axi-fan driver - Remove an unnecessary driver_override check in sh platform_early - Migrate the platform bus to use the generic driver_override infrastructure, fixing a UAF condition caused by accessing the driver_override field without proper locking in the platform_match() callback * tag 'driver-core-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: driver core: platform: use generic driver_override infrastructure sh: platform_early: remove pdev->driver_override check hwmon: axi-fan: don't use driver_override as IRQ name docs: driver-model: document driver_override driver core: generalize driver_override in struct device	2026-03-21 16:59:09 -07:00
Jiri Olsa	50b35c9e50	ftrace: Use hash argument for tmp_ops in update_ftrace_direct_mod The modify logic registers temporary ftrace_ops object (tmp_ops) to trigger the slow path for all direct callers to be able to safely modify attached addresses. At the moment we use ops->func_hash for tmp_ops filter, which represents all the systems attachments. It's faster to use just the passed hash filter, which contains only the modified sites and is always a subset of the ops->func_hash. Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Menglong Dong <menglong8.dong@gmail.com> Cc: Song Liu <song@kernel.org> Link: https://patch.msgid.link/20260312123738.129926-1-jolsa@kernel.org Fixes: `e93672f770` ("ftrace: Add update_ftrace_direct_mod function") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-21 16:51:04 -04:00
Masami Hiramatsu (Google)	f35dbac694	ring-buffer: Fix to update per-subbuf entries of persistent ring buffer Since the validation loop in rb_meta_validate_events() updates the same cpu_buffer->head_page->entries, the other subbuf entries are not updated. Fix to use head_page to update the entries field, since it is the cursor in this loop. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Ian Rogers <irogers@google.com> Fixes: `5f3b6e839f` ("ring-buffer: Validate boot range memory events") Link: https://patch.msgid.link/177391153882.193994.17158784065013676533.stgit@mhiramat.tok.corp.google.com Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-21 16:47:28 -04:00
Steven Rostedt	07183aac4a	tracing: Fix trace_marker copy link list updates When the "copy_trace_marker" option is enabled for an instance, anything written into /sys/kernel/tracing/trace_marker is also copied into that instances buffer. When the option is set, that instance's trace_array descriptor is added to the marker_copies link list. This list is protected by RCU, as all iterations uses an RCU protected list traversal. When the instance is deleted, all the flags that were enabled are cleared. This also clears the copy_trace_marker flag and removes the trace_array descriptor from the list. The issue is after the flags are called, a direct call to update_marker_trace() is performed to clear the flag. This function returns true if the state of the flag changed and false otherwise. If it returns true here, synchronize_rcu() is called to make sure all readers see that its removed from the list. But since the flag was already cleared, the state does not change and the synchronization is never called, leaving a possible UAF bug. Move the clearing of all flags below the updating of the copy_trace_marker option which then makes sure the synchronization is performed. Also use the flag for checking the state in update_marker_trace() instead of looking at if the list is empty. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260318185512.1b6c7db4@gandalf.local.home Fixes: `7b382efd5e` ("tracing: Allow the top level trace_marker to write into another instances") Reported-by: Sasha Levin <sashal@kernel.org> Closes: https://lore.kernel.org/all/20260225133122.237275-1-sashal@kernel.org/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-21 16:43:53 -04:00
Steven Rostedt	edca33a562	tracing: Fix failure to read user space from system call trace events The system call trace events call trace_user_fault_read() to read the user space part of some system calls. This is done by grabbing a per-cpu buffer, disabling migration, enabling preemption, calling copy_from_user(), disabling preemption, enabling migration and checking if the task was preempted while preemption was enabled. If it was, the buffer is considered corrupted and it tries again. There's a safety mechanism that will fail out of this loop if it fails 100 times (with a warning). That warning message was triggered in some pi_futex stress tests. Enabling the sched_switch trace event and traceoff_on_warning, showed the problem: pi_mutex_hammer-1375 [006] d..21 138.981648: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981651: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981656: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981659: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981664: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981667: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981671: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981675: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981679: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981682: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981687: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981690: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981695: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981698: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981703: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981706: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981711: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981714: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981719: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981722: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981727: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981730: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 pi_mutex_hammer-1375 [006] d..21 138.981735: sched_switch: prev_comm=pi_mutex_hammer prev_pid=1375 prev_prio=95 prev_state=R+ ==> next_comm=migration/6 next_pid=47 next_prio=0 migration/6-47 [006] d..2. 138.981738: sched_switch: prev_comm=migration/6 prev_pid=47 prev_prio=0 prev_state=S ==> next_comm=pi_mutex_hammer next_pid=1375 next_prio=95 What happened was the task 1375 was flagged to be migrated. When preemption was enabled, the migration thread woke up to migrate that task, but failed because migration for that task was disabled. This caused the loop to fail to exit because the task scheduled out while trying to read user space. Every time the task enabled preemption the migration thread would schedule in, try to migrate the task, fail and let the task continue. But because the loop would only enable preemption with migration disabled, it would always fail because each time it enabled preemption to read user space, the migration thread would try to migrate it. To solve this, when the loop fails to read user space without being scheduled out, enabled and disable preemption with migration enabled. This will allow the migration task to successfully migrate the task and the next loop should succeed to read user space without being scheduled out. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260316130734.1858a998@gandalf.local.home Fixes: `64cf7d058a` ("tracing: Have trace_marker use per-cpu data to read user space") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-21 16:42:36 -04:00
Xuewen Yan	a6f22e50c7	tracing: Revert "tracing: Remove pid in task_rename tracing output" This reverts commit `e3f6a42272`. The commit says that the tracepoint only deals with the current task, however the following case is not current task: comm_write() { p = get_proc_task(inode); if (!p) return -ESRCH; if (same_thread_group(current, p)) set_task_comm(p, buffer); } where set_task_comm() calls __set_task_comm() which records the update of p and not current. So revert the patch to show pid. Cc: <mhiramat@kernel.org> Cc: <mathieu.desnoyers@efficios.com> Cc: <elver@google.com> Cc: <kees@kernel.org> Link: https://patch.msgid.link/20260306075954.4533-1-xuewen.yan@unisoc.com Fixes: `e3f6a42272` ("tracing: Remove pid in task_rename tracing output") Reported-by: Guohua Yan <guohua.yan@unisoc.com> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-21 16:41:18 -04:00
Daniel Borkmann	4a04d13576	selftests/bpf: Add a test cases for sync_linked_regs regarding zext propagation Add multiple test cases for linked register tracking with alu32 ops: - Add a test that checks sync_linked_regs() regarding reg->id (the linked target register) for BPF_ADD_CONST32 rather than known_reg->id (the branch register). - Add a test case for linked register tracking that exposes the cross-type sync_linked_regs() bug. One register uses alu32 (w7 += 1, BPF_ADD_CONST32) and another uses alu64 (r8 += 2, BPF_ADD_CONST64), both linked to the same base register. - Add a test case that exercises regsafe() path pruning when two execution paths reach the same program point with linked registers carrying different ADD_CONST flags (BPF_ADD_CONST32 from alu32 vs BPF_ADD_CONST64 from alu64). This particular test passes with and without the fix since the pruning will fail due to different ranges, but it would still be useful to carry this one as a regression test for the unreachable div by zero. With the fix applied all the tests pass: # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_linked_scalars [...] ./test_progs -t verifier_linked_scalars #602/1 verifier_linked_scalars/scalars: find linked scalars:OK #602/2 verifier_linked_scalars/sync_linked_regs_preserves_id:OK #602/3 verifier_linked_scalars/scalars_neg:OK #602/4 verifier_linked_scalars/scalars_neg_sub:OK #602/5 verifier_linked_scalars/scalars_neg_alu32_add:OK #602/6 verifier_linked_scalars/scalars_neg_alu32_sub:OK #602/7 verifier_linked_scalars/scalars_pos:OK #602/8 verifier_linked_scalars/scalars_sub_neg_imm:OK #602/9 verifier_linked_scalars/scalars_double_add:OK #602/10 verifier_linked_scalars/scalars_sync_delta_overflow:OK #602/11 verifier_linked_scalars/scalars_sync_delta_overflow_large_range:OK #602/12 verifier_linked_scalars/scalars_alu32_big_offset:OK #602/13 verifier_linked_scalars/scalars_alu32_basic:OK #602/14 verifier_linked_scalars/scalars_alu32_wrap:OK #602/15 verifier_linked_scalars/scalars_alu32_zext_linked_reg:OK #602/16 verifier_linked_scalars/scalars_alu32_alu64_cross_type:OK #602/17 verifier_linked_scalars/scalars_alu32_alu64_regsafe_pruning:OK #602/18 verifier_linked_scalars/alu32_negative_offset:OK #602/19 verifier_linked_scalars/spurious_precision_marks:OK #602 verifier_linked_scalars:OK Summary: 1/19 PASSED, 0 SKIPPED, 0 FAILED Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260319211507.213816-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:19:40 -07:00
Daniel Borkmann	bc308be380	bpf: Fix sync_linked_regs regarding BPF_ADD_CONST32 zext propagation Jenny reported that in sync_linked_regs() the BPF_ADD_CONST32 flag is checked on known_reg (the register narrowed by a conditional branch) instead of reg (the linked target register created by an alu32 operation). Example case with reg: 1. r6 = bpf_get_prandom_u32() 2. r7 = r6 (linked, same id) 3. w7 += 5 (alu32 -- r7 gets BPF_ADD_CONST32, zero-extended by CPU) 4. if w6 < 0xFFFFFFFC goto safe (narrows r6 to [0xFFFFFFFC, 0xFFFFFFFF]) 5. sync_linked_regs() propagates to r7 but does NOT call zext_32_to_64() 6. Verifier thinks r7 is [0x100000001, 0x100000004] instead of [1, 4] Since known_reg above does not have BPF_ADD_CONST32 set above, zext_32_to_64() is never called on alu32-derived linked registers. This causes the verifier to track incorrect 64-bit bounds, while the CPU correctly zero-extends the 32-bit result. The code checking known_reg->id was correct however (see scalars_alu32_wrap selftest case), but the real fix needs to handle both directions - zext propagation should be done when either register has BPF_ADD_CONST32, since the linked relationship involves a 32-bit operation regardless of which side has the flag. Example case with known_reg (exercised also by scalars_alu32_wrap): 1. r1 = r0; w1 += 0x100 (alu32 -- r1 gets BPF_ADD_CONST32) 2. if r1 > 0x80 - known_reg = r1 (has BPF_ADD_CONST32), reg = r0 (doesn't) Hence, fix it by checking for (reg->id \| known_reg->id) & BPF_ADD_CONST32. Moreover, sync_linked_regs() also has a soundness issue when two linked registers used different ALU widths: one with BPF_ADD_CONST32 and the other with BPF_ADD_CONST64. The delta relationship between linked registers assumes the same arithmetic width though. When one register went through alu32 (CPU zero-extends the 32-bit result) and the other went through alu64 (no zero-extension), the propagation produces incorrect bounds. Example: r6 = bpf_get_prandom_u32() // fully unknown if r6 >= 0x100000000 goto out // constrain r6 to [0, U32_MAX] r7 = r6 w7 += 1 // alu32: r7.id = N \| BPF_ADD_CONST32 r8 = r6 r8 += 2 // alu64: r8.id = N \| BPF_ADD_CONST64 if r7 < 0xFFFFFFFF goto out // narrows r7 to [0xFFFFFFFF, 0xFFFFFFFF] At the branch on r7, sync_linked_regs() runs with known_reg=r7 (BPF_ADD_CONST32) and reg=r8 (BPF_ADD_CONST64). The delta path computes: r8 = r7 + (delta_r8 - delta_r7) = 0xFFFFFFFF + (2 - 1) = 0x100000000 Then, because known_reg->id has BPF_ADD_CONST32, zext_32_to_64(r8) is called, truncating r8 to [0, 0]. But r8 used a 64-bit ALU op -- the CPU does NOT zero-extend it. The actual CPU value of r8 is 0xFFFFFFFE + 2 = 0x100000000, not 0. The verifier now underestimates r8's 64-bit bounds, which is a soundness violation. Fix sync_linked_regs() by skipping propagation when the two registers have mixed ALU widths (one BPF_ADD_CONST32, the other BPF_ADD_CONST64). Lastly, fix regsafe() used for path pruning: the existing checks used "& BPF_ADD_CONST" to test for offset linkage, which treated BPF_ADD_CONST32 and BPF_ADD_CONST64 as equivalent. Fixes: `7a433e5193` ("bpf: Support negative offsets, BPF_SUB, and alu32 for linked register tracking") Reported-by: Jenny Guanni Qu <qguanni@gmail.com> Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260319211507.213816-1-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:19:40 -07:00
Alexei Starovoitov	61bc846081	Merge branch 'libbpf-add-bpf_program__clone-for-individual-program-loading' Mykyta Yatsenko says: ==================== libbpf: Add bpf_program__clone() for individual program loading This series adds bpf_program__clone() to libbpf and converts veristat to use it, replacing the costly per-program object re-opening pattern. veristat needs to load each BPF program in isolation to collect per-program verification statistics. Previously it achieved this by opening a fresh bpf_object for every program, disabling autoload on all but the target, and loading the whole object. For object files with many programs this meant repeating ELF parsing and BTF processing N times. Patch 1 introduces bpf_program__clone(), which loads a single program from a prepared object into the kernel and returns an fd owned by the caller. It populates load parameters from the prepared object and lets callers override any field via bpf_prog_load_opts. Fields written by the prog_prepare_load_fn callback (expected_attach_type, attach_btf_id, attach_btf_obj_fd) are seeded from prog/obj defaults before the callback, then overridden with caller opts after, so explicit values always win. Patch 2 converts veristat to prepare the object once and clone each program individually, eliminating redundant work. Patch 3 adds a selftest verifying that caller-provided attach_btf_id overrides are respected by bpf_program__clone(). Performance Tested on selftests: 918 objects, ~4270 programs: - Wall time: 36.88s -> 23.18s (37% faster) - User time: 20.80s -> 16.07s (23% faster) - Kernel time: 12.07s -> 6.06s (50% faster) Per-program loading also improves coverage: 83 programs that previously failed now succeed. Known regression: - Program-containing maps (PROG_ARRAY, DEVMAP, CPUMAP) track owner program type. Programs with incompatible attributes loaded against a shared map will be rejected. This is expected kernel behavior. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> --- Changes in v5: - Fix overriding of the attach_btf_id, attach_btf_fd, etc: the override provided by the caller is applied after prog_prepare_load_fn(). - Added selftest to verify attach_btf_id override works as expected. - Link to v4: https://lore.kernel.org/all/20260316-veristat_prepare-v3-0-94e5691e0494@meta.com/ Changes in v4: - Replace OPTS_SET() with direct struct assignment for local bpf_prog_load_opts in bpf_program__clone() (libbpf.c) - Remove unnecessary pattr pointer indirection (libbpf.c) - Separate input and output fields in bpf_program__clone(): input fields (prog_flags, fd_array, etc.) are merged from caller opts before the callback; output fields (expected_attach_type, attach_btf_id, attach_btf_obj_fd) are initialized from prog/obj defaults for the callback, then overridden with caller opts after, so explicit caller values always win (libbpf.c) - Add selftest for attach_btf_id override - Link to v3: https://lore.kernel.org/r/20260206-veristat_prepare-a4a041873c53-v3@meta.com Changes in v3: - Clone fd_array_cnt in bpf_object__clone() - In veristat do not fail if bpf_object__prepare() fails, continue per-program processing to produce per program output - Link to v2: https://lore.kernel.org/r/20260220-veristat_prepare-v2-0-15bff49022a7@meta.com Changes in v2: - Removed map cloning entirely (libbpf.c) - Renamed bpf_prog_clone() -> bpf_program__clone() - Removed unnecessary obj NULL check (libbpf.c) - Fixed opts handling — no longer mutates caller's opts (libbpf.c) - Link to v1: https://lore.kernel.org/all/20260212-veristat_prepare-v1-0-c351023fb0db@meta.com/ --- ==================== Link: https://patch.msgid.link/20260317-veristat_prepare-v4-0-74193d4cc9d9@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:17:14 -07:00
Mykyta Yatsenko	ceebdeec6e	selftests/bpf: Test bpf_program__clone() attach_btf_id override Add a test that verifies bpf_program__clone() respects caller-provided attach_btf_id in bpf_prog_load_opts. The BPF program has SEC("fentry/bpf_fentry_test1"). It is cloned twice from the same prepared object: first with no opts, verifying the callback resolves attach_btf_id from sec_name to bpf_fentry_test1; then with attach_btf_id overridden to bpf_fentry_test2, verifying the loaded program is actually attached to bpf_fentry_test2. Both results are checked via bpf_prog_get_info_by_fd(). Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260317-veristat_prepare-v4-3-74193d4cc9d9@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:17:14 -07:00
Mykyta Yatsenko	3be706b937	selftests/bpf: Use bpf_program__clone() in veristat Replace veristat's per-program object re-opening with bpf_program__clone(). Previously, veristat opened a separate bpf_object for every program in a multi-program object file, iterated all programs to enable only the target one, and then loaded the entire object. Use bpf_object__prepare() once, then call bpf_program__clone() for each program individually. This lets veristat load programs one at a time from a single prepared object. The caller now owns the returned fd and closes it after collecting stats. Remove the special single-program fast path and the per-file early exit in handle_verif_mode() so all files are always processed. Split fixup_obj() into fixup_obj_maps() for object-wide map fixups that must run before bpf_object__prepare(), and fixup_obj() for per-program fixups (struct_ops masking, freplace type guessing) that run before each bpf_program__clone() call. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20260317-veristat_prepare-v4-2-74193d4cc9d9@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:17:14 -07:00
Mykyta Yatsenko	970bd2dced	libbpf: Introduce bpf_program__clone() Add bpf_program__clone() API that loads a single BPF program from a prepared BPF object into the kernel, returning a file descriptor owned by the caller. After bpf_object__prepare(), callers can use bpf_program__clone() to load individual programs with custom bpf_prog_load_opts, instead of loading all programs at once via bpf_object__load(). Non-zero fields in opts override the defaults derived from the program and object internals; passing NULL opts populates everything automatically. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260317-veristat_prepare-v4-1-74193d4cc9d9@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:17:13 -07:00
Alexei Starovoitov	06880982c6	Merge branch 'bpf-fix-unsound-scalar-forking-for-bpf_or' Daniel Wade says: ==================== bpf: Fix unsound scalar forking for BPF_OR maybe_fork_scalars() unconditionally sets the pushed path dst register to 0 for both BPF_AND and BPF_OR. For AND this is correct (0 & K == 0), but for OR it is wrong (0 \| K == K, not 0). This causes the verifier to track an incorrect value on the pushed path, leading to a verifier/runtime divergence that allows out-of-bounds map value access. v4: Use block comment style for multi-line comments in selftests (Amery Hung) Add Reviewed-by/Acked-by tags v3: Use single-line comment style in selftests (Alexei Starovoitov) v2: Use push_stack(env, env->insn_idx, ...) to re-execute the insn on the pushed path (Eduard Zingerman) ==================== Link: https://patch.msgid.link/20260314021521.128361-1-danjwade95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:14:29 -07:00
Daniel Wade	0ad1734cc5	selftests/bpf: Add tests for maybe_fork_scalars() OR vs AND handling Add three test cases to verifier_bounds.c to verify that maybe_fork_scalars() correctly tracks register values for BPF_OR operations with constant source operands: 1. or_scalar_fork_rejects_oob: After ARSH 63 + OR 8, the pushed path should have dst = 8. With value_size = 8, accessing map_value + 8 is out of bounds and must be rejected. 2. and_scalar_fork_still_works: Regression test ensuring AND forking continues to work. ARSH 63 + AND 4 produces pushed dst = 0 and current dst = 4, both within value_size = 8. 3. or_scalar_fork_allows_inbounds: After ARSH 63 + OR 4, the pushed path has dst = 4, which is within value_size = 8 and should be accepted. These tests exercise the fix in the previous patch, which makes the pushed path re-execute the ALU instruction so it computes the correct result for BPF_OR. Signed-off-by: Daniel Wade <danjwade95@gmail.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260314021521.128361-3-danjwade95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:14:28 -07:00
Daniel Wade	c845894ebd	bpf: Fix unsound scalar forking in maybe_fork_scalars() for BPF_OR maybe_fork_scalars() is called for both BPF_AND and BPF_OR when the source operand is a constant. When dst has signed range [-1, 0], it forks the verifier state: the pushed path gets dst = 0, the current path gets dst = -1. For BPF_AND this is correct: 0 & K == 0. For BPF_OR this is wrong: 0 \| K == K, not 0. The pushed path therefore tracks dst as 0 when the runtime value is K, producing an exploitable verifier/runtime divergence that allows out-of-bounds map access. Fix this by passing env->insn_idx (instead of env->insn_idx + 1) to push_stack(), so the pushed path re-executes the ALU instruction with dst = 0 and naturally computes the correct result for any opcode. Fixes: `bffacdb80b` ("bpf: Recognize special arithmetic shift in the verifier") Signed-off-by: Daniel Wade <danjwade95@gmail.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260314021521.128361-2-danjwade95@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:14:28 -07:00
Alexei Starovoitov	1abd3feb36	Merge branch 'bpf-fix-abs-int_min-undefined-behavior-in-interpreter-sdiv-smod' Jenny Guanni Qu says: ==================== bpf: Fix abs(INT_MIN) undefined behavior in interpreter sdiv/smod The BPF interpreter's signed 32-bit division and modulo handlers use abs() on s32 operands, which is undefined for S32_MIN. This causes the interpreter to compute wrong results, creating a mismatch with the verifier's range tracking. For example, INT_MIN / 2 returns 0x40000000 instead of the correct 0xC0000000. The verifier tracks the correct range, so a crafted BPF program can exploit the mismatch for out-of-bounds map value access (confirmed by KASAN). Patch 1 introduces abs_s32() which handles S32_MIN correctly and replaces all 8 abs((s32)...) call sites. s32 is the only affected case -- the s64 handlers do not use abs(). Patch 2 adds selftests covering sdiv32 and smod32 with INT_MIN dividend to prevent regression. Changes since v4: - Renamed __safe_abs32() to abs_s32() and dropped inline keyword per Alexei Starovoitov's feedback Changes since v3: - Fixed stray blank line deletion in the file header - Improved comment per Yonghong Song's suggestion - Added JIT vs interpreter context to selftest commit message Changes since v2: - Simplified to use -(u32)x per Mykyta Yatsenko's suggestion Changes since v1: - Moved helper above kerneldoc comment block to fix build warnings ==================== Link: https://patch.msgid.link/20260311011116.2108005-1-qguanni@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:12:17 -07:00
Jenny Guanni Qu	4ac95c65ef	selftests/bpf: Add tests for sdiv32/smod32 with INT_MIN dividend Add tests to verify that signed 32-bit division and modulo operations produce correct results when the dividend is INT_MIN (0x80000000). The bug fixed in the previous commit only affects the BPF interpreter path. When JIT is enabled (the default on most architectures), the native CPU division instruction produces the correct result and these tests pass regardless. With bpf_jit_enable=0, the interpreter is used and without the previous fix, INT_MIN / 2 incorrectly returns 0x40000000 instead of 0xC0000000 due to abs(S32_MIN) undefined behavior, causing these tests to fail. Test cases: - SDIV32 INT_MIN / 2 = -1073741824 (imm and reg divisor) - SMOD32 INT_MIN % 2 = 0 (positive and negative divisor) Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Jenny Guanni Qu <qguanni@gmail.com> Link: https://lore.kernel.org/r/20260311011116.2108005-3-qguanni@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:12:17 -07:00
Jenny Guanni Qu	c77b30bd1d	bpf: Fix undefined behavior in interpreter sdiv/smod for INT_MIN The BPF interpreter's signed 32-bit division and modulo handlers use the kernel abs() macro on s32 operands. The abs() macro documentation (include/linux/math.h) explicitly states the result is undefined when the input is the type minimum. When DST contains S32_MIN (0x80000000), abs((s32)DST) triggers undefined behavior and returns S32_MIN unchanged on arm64/x86. This value is then sign-extended to u64 as 0xFFFFFFFF80000000, causing do_div() to compute the wrong result. The verifier's abstract interpretation (scalar32_min_max_sdiv) computes the mathematically correct result for range tracking, creating a verifier/interpreter mismatch that can be exploited for out-of-bounds map value access. Introduce abs_s32() which handles S32_MIN correctly by casting to u32 before negating, avoiding signed overflow entirely. Replace all 8 abs((s32)...) call sites in the interpreter's sdiv32/smod32 handlers. s32 is the only affected case -- the s64 division/modulo handlers do not use abs(). Fixes: `ec0e2da95f` ("bpf: Support new signed div/mod instructions.") Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Jenny Guanni Qu <qguanni@gmail.com> Link: https://lore.kernel.org/r/20260311011116.2108005-2-qguanni@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:12:16 -07:00
Tiezhu Yang	f4706504e2	selftests/bpf: Add alignment flag for test_verifier 190 testcase There exists failure when executing the testcase "./test_verifier 190" if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is not set on LoongArch. #190/p calls: two calls that return map_value with incorrect bool check FAIL ... misaligned access off (0x0; 0xffffffffffffffff)+0 size 8 ... Summary: 0 PASSED, 0 SKIPPED, 1 FAILED It means that the program has unaligned accesses, but the kernel sets CONFIG_ARCH_STRICT_ALIGN by default to enable -mstrict-align to prevent unaligned accesses, so add a flag F_NEEDS_EFFICIENT_UNALIGNED_ACCESS into the testcase to avoid the failure. This is somehow similar with the commit `ce1f289f54` ("selftests/bpf: Add F_NEEDS_EFFICIENT_UNALIGNED_ACCESS to some tests"). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260310064507.4228-3-yangtiezhu@loongson.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:11:18 -07:00
Alexei Starovoitov	21337b58f5	Merge branch 'bpf-consolidate-sleepable-context-checks-in-verifier' Puranjay Mohan says: ==================== bpf: Consolidate sleepable context checks in verifier The BPF verifier has multiple call-checking functions that independently validate whether sleepable operations are permitted in the current context. Each function open-codes its own checks against active_rcu_locks, active_preempt_locks, active_irq_id, and in_sleepable, duplicating the logic already provided by in_sleepable_context(). This series consolidates these scattered checks into calls to in_sleepable_context() across check_helper_call(), check_kfunc_call(), and check_func_call(), reducing code duplication and making the error reporting consistent. No functional change. ==================== Link: https://patch.msgid.link/20260318174327.3151925-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:09:35 -07:00
Puranjay Mohan	a2542a91aa	bpf: Consolidate sleepable checks in check_func_call() The sleepable context check for global function calls in check_func_call() open-codes the same checks that in_sleepable_context() already performs. Replace the open-coded check with a call to in_sleepable_context() and use non_sleepable_context_description() for the error message, consistent with check_helper_call() and check_kfunc_call(). Note that in_sleepable_context() also checks active_locks, which overlaps with the existing active_locks check above it. However, the two checks serve different purposes: the active_locks check rejects all global function calls while holding a lock (not just sleepable ones), so it must remain as a separate guard. Update the expected error messages in the irq and preempt_lock selftests to match. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260318174327.3151925-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:09:35 -07:00
Puranjay Mohan	cd9840c413	bpf: Consolidate sleepable checks in check_kfunc_call() check_kfunc_call() has multiple scattered checks that reject sleepable kfuncs in various non-sleepable contexts (RCU, preempt-disabled, IRQ- disabled). These are the same conditions already checked by in_sleepable_context(), so replace them with a single consolidated check. This also simplifies the preempt lock tracking by flattening the nested if/else structure into a linear chain: preempt_disable increments, preempt_enable checks for underflow and decrements. The sleepable check is kept as a separate block since it is logically distinct from the lock accounting. No functional change since in_sleepable_context() checks all the same state (active_rcu_locks, active_preempt_locks, active_locks, active_irq_id, in_sleepable). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260318174327.3151925-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:09:35 -07:00
Puranjay Mohan	a0d06cf102	bpf: Consolidate sleepable checks in check_helper_call() check_helper_call() prints the error message for every env->cur_state->active* element when calling a sleepable helper. Consolidate all of them into a single print statement. The check for env->cur_state->active_locks was not part of the removed print statements and will not be triggered with the consolidated print as well because it is checked in do_check() before check_helper_call() is even reached. Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260318174327.3151925-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 13:09:34 -07:00
Ihor Solodrai	a1e5c46eae	selftests/bpf: Add tests for bpf_throw lock leak from subprogs Add test cases to ensure the verifier correctly rejects bpf_throw from subprogs when RCU, preempt, or IRQ locks are held: * reject_subprog_rcu_lock_throw: subprog acquires bpf_rcu_read_lock and then calls bpf_throw * reject_subprog_throw_preempt_lock: always-throwing subprog called while caller holds bpf_preempt_disable * reject_subprog_throw_irq_lock: always-throwing subprog called while caller holds bpf_local_irq_save Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260320000809.643798-2-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 12:51:44 -07:00
Ihor Solodrai	6c2128505f	bpf: Fix exception exit lock checking for subprogs process_bpf_exit_full() passes check_lock = !curframe to check_resource_leak(), which is false in cases when bpf_throw() is called from a static subprog. This makes check_resource_leak() to skip validation of active_rcu_locks, active_preempt_locks, and active_irq_id on exception exits from subprogs. At runtime bpf_throw() unwinds the stack via ORC without releasing any user-acquired locks, which may cause various issues as the result. Fix by setting check_lock = true for exception exits regardless of curframe, since exceptions bypass all intermediate frame cleanup. Update the error message prefix to "bpf_throw" for exception exits to distinguish them from normal BPF_EXIT. Fix reject_subprog_with_rcu_read_lock test which was previously passing for the wrong reason. Test program returned directly from the subprog call without closing the RCU section, so the error was triggered by the unclosed RCU lock on normal exit, not by bpf_throw. Update __msg annotations for affected tests to match the new "bpf_throw" error prefix. The spin_lock case is not affected because they are already checked [1] at the call site in do_check_insn() before bpf_throw can run. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v7.0-rc4#n21098 Assisted-by: Claude:claude-opus-4-6 Fixes: `f18b03faba` ("bpf: Implement BPF exceptions") Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260320000809.643798-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-21 12:51:44 -07:00
Wolfram Sang	2f42e85622	Merge tag 'i2c-host-fixes-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current i2c-fixes for v7.0-rc5 pxa: fix broken I2C communication on Armada 3700 with recovery fsi: fix device_node reference leak in probe cp2615: fix NULL-deref when serial string is missing	2026-03-21 19:52:12 +01:00
Linus Torvalds	113ae7b4de	Merge tag 'hwmon-for-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - max6639: Fix pulses-per-revolution implementation - Several PMBus drivers: Add missing error checks * tag 'hwmon-for-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (max6639) Fix pulses-per-revolution implementation hwmon: (pmbus/isl68137) Fix unchecked return value and use sysfs_emit() hwmon: (pmbus/ina233) Add error check for pmbus_read_word_data() return value hwmon: (pmbus/mp2869) Check pmbus_read_byte_data() before using its return value hwmon: (pmbus/mp2975) Add error check for pmbus_read_word_data() return value hwmon: (pmbus/hac300s) Add error check for pmbus_read_word_data() return value	2026-03-21 09:09:51 -07:00

1 2 3 4 5 ...

1428924 Commits