linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-10 20:38:50 -04:00

Author	SHA1	Message	Date
Eduard Zingerman	fdcecdff90	selftests/bpf: test cases for callchain sensitive live stack tracking - simple propagation of read/write marks; - joining read/write marks from conditional branches; - avoid must_write marks in when same instruction accesses different stack offsets on different execution paths; - avoid must_write marks in case same instruction accesses stack and non-stack pointers on different execution paths; - read/write marks propagation to outer stack frame; - independent read marks for different callchains ending with the same function; - bpf_calls_callback() dependent logic in liveness.c:bpf_stack_slot_alive(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-12-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:24 -07:00
Eduard Zingerman	34c513be3d	selftests/bpf: __not_msg() tag for test_loader framework This patch adds tags __not_msg(<msg>) and __not_msg_unpriv(<msg>). Test fails if <msg> is found in verifier log. If __msg_not() is situated between __msg() tags framework matches __msg() tags first, and then checks that <msg> is not present in a portion of a log between bracketing __msg() tags. __msg_not() tags bracketed by a same __msg() group are effectively unordered. The idea is borrowed from LLVM's CheckFile with its CHECK-NOT syntax. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-11-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	79f047c7d9	bpf: table based bpf_insn_successors() Converting bpf_insn_successors() to use lookup table makes it ~1.5 times faster. Also remove unnecessary conditionals: - `idx + 1 < prog->len` is unnecessary because after check_cfg() all jump targets are guaranteed to be within a program; - `i == 0 \|\| succ[0] != dst` is unnecessary because any client of bpf_insn_successors() can handle duplicate edges: - compute_live_registers() - compute_scc() Moving bpf_insn_successors() to liveness.c allows its inlining in liveness.c:__update_stack_liveness(). Such inlining speeds up __update_stack_liveness() by ~40%. bpf_insn_successors() is used in both verifier.c and liveness.c. perf shows such move does not negatively impact users in verifier.c, as these are executed only once before main varification pass. Unlike __update_stack_liveness() which can be triggered multiple times. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-10-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	107e169799	bpf: disable and remove registers chain based liveness Remove register chain based liveness tracking: - struct bpf_reg_state->{parent,live} fields are no longer needed; - REG_LIVE_WRITTEN marks are superseded by bpf_mark_stack_write() calls; - mark_reg_read() calls are superseded by bpf_mark_stack_read(); - log.c:print_liveness() is superseded by logging in liveness.c; - propagate_liveness() is superseded by bpf_update_live_stack(); - no need to establish register chains in is_state_visited() anymore; - fix a bunch of tests expecting "_w" suffixes in verifier log messages. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-9-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	ccf25a67c7	bpf: signal error if old liveness is more conservative than new Unlike the new algorithm, register chain based liveness tracking is fully path sensitive, and thus should be strictly more accurate. Validate the new algorithm by signaling an error whenever it considers a stack slot dead while the old algorithm considers it alive. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-8-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	e41c237953	bpf: enable callchain sensitive stack liveness tracking Allocate analysis instance: - Add bpf_stack_liveness_{init,free}() calls to bpf_check(). Notify the instance about any stack reads and writes: - Add bpf_mark_stack_write() call at every location where REG_LIVE_WRITTEN is recorded for a stack slot. - Add bpf_mark_stack_read() call at every location mark_reg_read() is called. - Both bpf_mark_stack_{read,write}() rely on env->liveness->cur_instance callchain being in sync with env->cur_state. It is possible to update env->liveness->cur_instance every time a mark read/write is called, but that costs a hash table lookup and is noticeable in the performance profile. Hence, manually reset env->liveness->cur_instance whenever the verifier changes env->cur_state call stack: - call bpf_reset_live_stack_callchain() when the verifier enters a subprogram; - call bpf_update_live_stack() when the verifier exits a subprogram (it implies the reset). Make sure bpf_update_live_stack() is called for a callchain before issuing liveness queries. And make sure that bpf_update_live_stack() is called for any callee callchain first: - Add bpf_update_live_stack() call at every location that processes BPF_EXIT: - exit from a subprogram; - before pop_stack() call. This makes sure that bpf_update_live_stack() is called for callee callchains before caller callchains. Make sure must_write marks are set to zero for instructions that do not always access the stack: - Wrap do_check_insn() with bpf_reset_stack_write_marks() / bpf_commit_stack_write_marks() calls. Any calls to bpf_mark_stack_write() are accumulated between this pair of calls. If no bpf_mark_stack_write() calls were made it means that the instruction does not access stack (at-least on the current verification path) and it is important to record this fact. Finally, use bpf_live_stack_query_init() / bpf_stack_slot_alive() to query stack liveness info. The manual tracking of the correct order for callee/caller bpf_update_live_stack() calls is a bit convoluted and may warrant some automation in future revisions. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-7-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	b3698c356a	bpf: callchain sensitive stack liveness tracking using CFG This commit adds a flow-sensitive, context-sensitive, path-insensitive data flow analysis for live stack slots: - flow-sensitive: uses program control flow graph to compute data flow values; - context-sensitive: collects data flow values for each possible call chain in a program; - path-insensitive: does not distinguish between separate control flow graph paths reaching the same instruction. Compared to the current path-sensitive analysis, this approach trades some precision for not having to enumerate every path in the program. This gives a theoretical capability to run the analysis before main verification pass. See cover letter for motivation. The basic idea is as follows: - Data flow values indicate stack slots that might be read and stack slots that are definitely written. - Data flow values are collected for each (call chain, instruction number) combination in the program. - Within a subprogram, data flow values are propagated using control flow graph. - Data flow values are transferred from entry instructions of callee subprograms to call sites in caller subprograms. In other words, a tree of all possible call chains is constructed. Each node of this tree represents a subprogram. Read and write marks are collected for each instruction of each node. Live stack slots are first computed for lower level nodes. Then, information about outer stack slots that might be read or are definitely written by a subprogram is propagated one level up, to the corresponding call instructions of the upper nodes. Procedure repeats until root node is processed. In the absence of value range analysis, stack read/write marks are collected during main verification pass, and data flow computation is triggered each time verifier.c:states_equal() needs to query the information. Implementation details are documented in kernel/bpf/liveness.c. Quantitative data about verification performance changes and memory consumption is in the cover letter. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-6-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	efcda22aa5	bpf: compute instructions postorder per subprogram The next patch would require doing postorder traversal of individual subprograms. Facilitate this by moving env->cfg.insn_postorder computation from check_cfg() to a separate pass, as check_cfg() descends into called subprograms (and it needs to, because of merge_callee_effects() logic). env->cfg.insn_postorder is used only by compute_live_registers(), this function does not track cross subprogram dependencies, thus the change does not affect it's operation. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-5-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	3b20d3c120	bpf: declare a few utility functions as internal api Namely, rename the following functions and add prototypes to bpf_verifier.h: - find_containing_subprog -> bpf_find_containing_subprog - insn_successors -> bpf_insn_successors - calls_callback -> bpf_calls_callback - fmt_stack_mask -> bpf_fmt_stack_mask Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-4-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:22 -07:00
Eduard Zingerman	12a23f93a5	bpf: remove redundant REG_LIVE_READ check in stacksafe() stacksafe() is called in exact == NOT_EXACT mode only for states that had been porcessed by clean_verifier_states(). The latter replaces dead stack spills with a series of STACK_INVALID masks. Such masks are already handled by stacksafe(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-3-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:22 -07:00
Eduard Zingerman	6cd21eb9ad	bpf: use compute_live_registers() info in clean_func_state Prepare for bpf_reg_state->live field removal by leveraging insn_aux_data->live_regs_before instead of bpf_reg_state->live in compute_live_registers(). This is similar to logic in func_states_equal(). No changes in verification performance for selftests or sched_ext. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-2-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:22 -07:00
Eduard Zingerman	daf4c2929f	bpf: bpf_verifier_state->cleaned flag instead of REG_LIVE_DONE Prepare for bpf_reg_state->live field removal by introducing a separate flag to track if clean_verifier_state() had been applied to the state. No functional changes. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-1-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:22 -07:00
KP Singh	8cd189e414	bpf: Move the signature kfuncs to helpers.c No functional changes, except for the addition of the headers for the kfuncs so that they can be used for signature verification. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-8-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
KP Singh	ea2e6467ac	bpf: Return hashes of maps in BPF_OBJ_GET_INFO_BY_FD Currently only array maps are supported, but the implementation can be extended for other maps and objects. The hash is memoized only for exclusive and frozen maps as their content is stable until the exclusive program modifies the map. This is required for BPF signing, enabling a trusted loader program to verify a map's integrity. The loader retrieves the map's runtime hash from the kernel and compares it against an expected hash computed at build time. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-7-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
KP Singh	6c850cbca8	selftests/bpf: Add tests for exclusive maps Check if access is denied to another program for an exclusive map Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-6-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
KP Singh	567010a547	libbpf: Support exclusive map creation Implement setters and getters that allow map to be registered as exclusive to the specified program. The registration should be done before the exclusive program is loaded. Signed-off-by: KP Singh <kpsingh@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-5-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
KP Singh	c297fe3e9f	libbpf: Implement SHA256 internal helper Use AF_ALG sockets to not have libbpf depend on OpenSSL. The helper is used for the loader generation code to embed the metadata hash in the loader program and also by the bpf_map__make_exclusive API to calculate the hash of the program the map is exclusive to. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-4-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
KP Singh	baefdbdf68	bpf: Implement exclusive map creation Exclusive maps allow maps to only be accessed by program with a program with a matching hash which is specified in the excl_prog_hash attr. For the signing use-case, this allows the trusted loader program to load the map and verify the integrity Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-3-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
KP Singh	603b441623	bpf: Update the bpf_prog_calc_tag to use SHA256 Exclusive maps restrict map access to specific programs using a hash. The current hash used for this is SHA1, which is prone to collisions. This patch uses SHA256, which is more resilient against collisions. This new hash is stored in bpf_prog and used by the verifier to determine if a program can access a given exclusive map. The original 64-bit tags are kept, as they are used by users as a short, possibly colliding program identifier for non-security purposes. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-2-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:10:20 -07:00
Alexei Starovoitov	3547a61ee2	Merge branch 'update-kf_rcu_protected' Kumar Kartikeya Dwivedi says: ==================== Update KF_RCU_PROTECTED Currently, KF_RCU_PROTECTED only applies to iterator APIs and that too in a convoluted fashion: the presence of this flag on the kfunc is used to set MEM_RCU in iterator type, and the lack of RCU protection results in an error only later, once next() or destroy() methods are invoked on the iterator. While there is no bug, this is certainly a bit unintuitive, and makes the enforcement of the flag iterator specific. In the interest of making this flag useful for other upcoming kfuncs, e.g. scx_bpf_cpu_curr() [0][1], add enforcement for invoking the kfunc in an RCU critical section in general. In addition to this, the aforementioned kfunc also needs to return an RCU protected pointer, which currently has no generic kfunc flag or annotation. Add such a flag as well while we are at it. [0]: https://lore.kernel.org/all/20250903212311.369697-3-christian.loehle@arm.com [1]: https://lore.kernel.org/all/20250909195709.92669-1-arighi@nvidia.com Changelog: ---------- v2 -> v3 v2: https://lore.kernel.org/bpf/20250917032014.4060112-1-memxor@gmail.com * Add back lost hunk reworking documentation for KF_RCU_PROTECTED. v1 -> v2 v1: https://lore.kernel.org/bpf/20250915024731.1494251-1-memxor@gmail.com * Drop KF_RET_RCU and fold change into KF_RCU_PROTECTED. (Andrea, Alexei) * Update tests for non-struct pointer return values with KF_RCU_PROTECTED. ==================== Link: https://patch.msgid.link/20250917032755.4068726-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 15:36:17 -07:00
Kumar Kartikeya Dwivedi	8b788d6638	selftests/bpf: Add tests for KF_RCU_PROTECTED Add a couple of test cases to ensure RCU protection is kicked in automatically, and the return type is as expected. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250917032755.4068726-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 15:36:17 -07:00
Kumar Kartikeya Dwivedi	1512231b6c	bpf: Enforce RCU protection for KF_RCU_PROTECTED Currently, KF_RCU_PROTECTED only applies to iterator APIs and that too in a convoluted fashion: the presence of this flag on the kfunc is used to set MEM_RCU in iterator type, and the lack of RCU protection results in an error only later, once next() or destroy() methods are invoked on the iterator. While there is no bug, this is certainly a bit unintuitive, and makes the enforcement of the flag iterator specific. In the interest of making this flag useful for other upcoming kfuncs, e.g. scx_bpf_cpu_curr() [0][1], add enforcement for invoking the kfunc in an RCU critical section in general. This would also mean that iterator APIs using KF_RCU_PROTECTED will error out earlier, instead of throwing an error for lack of RCU CS protection when next() or destroy() methods are invoked. In addition to this, if the kfuncs tagged KF_RCU_PROTECTED return a pointer value, ensure that this pointer value is only usable in an RCU critical section. There might be edge cases where the return value is special and doesn't need to imply MEM_RCU semantics, but in general, the assumption should hold for the majority of kfuncs, and we can revisit things if necessary later. [0]: https://lore.kernel.org/all/20250903212311.369697-3-christian.loehle@arm.com [1]: https://lore.kernel.org/all/20250909195709.92669-1-arighi@nvidia.com Tested-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250917032755.4068726-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 15:36:17 -07:00
Hengqi Chen	6ff4a0fa3e	bpf, arm64: Call bpf_jit_binary_pack_finalize() in bpf_jit_free() The current implementation seems incorrect and does NOT match the comment above, use bpf_jit_binary_pack_finalize() instead. Fixes: `1dad391dae` ("bpf, arm64: use bpf_prog_pack for memory management") Acked-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Acked-by: Song Liu <song@kernel.org> Acked-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20250916232653.101004-1-hengqi.chen@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-17 11:20:32 -07:00
Eduard Zingerman	a24a2dda70	selftests/bpf: trigger verifier.c:maybe_exit_scc() for a speculative state This is a test case minimized from a syzbot reproducer from [1]. The test case triggers verifier.c:maybe_exit_scc() w/o preceding call to verifier.c:maybe_enter_scc() on a speculative symbolic execution path. Here is verifier log for the test case: Live regs before insn: 0: .......... (b7) r0 = 100 1 1: 0......... (7b) (u64 )(r10 -512) = r0 1 2: 0......... (b5) if r0 <= 0x0 goto pc-2 3: 0......... (95) exit 0: R1=ctx() R10=fp0 0: (b7) r0 = 100 ; R0_w=100 1: (7b) (u64 )(r10 -512) = r0 ; R0_w=100 R10=fp0 fp-512_w=100 2: (b5) if r0 <= 0x0 goto pc-2 mark_precise: ... 2: R0_w=100 3: (95) exit from 2 to 1 (speculative execution): R0_w=scalar() R1=ctx() R10=fp0 fp-512_w=100 1: R0_w=scalar() R1=ctx() R10=fp0 fp-512_w=100 1: (7b) (u64 )(r10 -512) = r0 processed 5 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 - Non-speculative execution path 0-3 does not allocate any checkpoints (and hence does not call maybe_enter_scc()), and schedules a speculative jump from 2 to 1. - Speculative execution path stops immediately because of an infinite loop detection and triggers verifier.c:update_branch_counts() -> maybe_exit_scc() calls. [1] https://lore.kernel.org/bpf/68c85acd.050a0220.2ff435.03a4.GAE@google.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250916212251.3490455-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-17 11:19:58 -07:00
Eduard Zingerman	a3c73d629e	bpf: dont report verifier bug for missing bpf_scc_visit on speculative path Syzbot generated a program that triggers a verifier_bug() call in maybe_exit_scc(). maybe_exit_scc() assumes that, when called for a state with insn_idx in some SCC, there should be an instance of struct bpf_scc_visit allocated for that SCC. Turns out the assumption does not hold for speculative execution paths. See example in the next patch. maybe_scc_exit() is called from update_branch_counts() for states that reach branch count of zero, meaning that path exploration for a particular path is finished. Path exploration can finish in one of three ways: a. Verification error is found. In this case, update_branch_counts() is called only for non-speculative paths. b. Top level BPF_EXIT is reached. Such instructions are never a part of an SCC, so compute_scc_callchain() in maybe_scc_exit() will return false, and maybe_scc_exit() will return early. c. A checkpoint is reached and matched. Checkpoints are created by is_state_visited(), which calls maybe_enter_scc(), which allocates bpf_scc_visit instances for checkpoints within SCCs. Hence, for non-speculative symbolic execution paths, the assumption still holds: if maybe_scc_exit() is called for a state within an SCC, bpf_scc_visit instance must exist. This patch removes the verifier_bug() call for speculative paths. Fixes: `c9e31900b5` ("bpf: propagate read/precision marks over state graph backedges") Reported-by: syzbot+3afc814e8df1af64b653@syzkaller.appspotmail.com Closes: https://lore.kernel.org/bpf/68c85acd.050a0220.2ff435.03a4.GAE@google.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250916212251.3490455-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-17 11:19:58 -07:00
Paul Chaignon	180a46bc1a	selftests/bpf: Test accesses to ctx padding This patch adds tests covering the various paddings in ctx structures. In case of sk_lookup BPF programs, the behavior is a bit different because accesses to the padding are explicitly allowed. Other cases result in a clear reject from the verifier. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/3dc5f025e350aeb2bb1c257b87c577518e574aeb.1758094761.git.paul.chaignon@gmail.com	2025-09-17 16:15:57 +02:00
Paul Chaignon	7c60f6e488	selftests/bpf: Move macros to bpf_misc.h Move the sizeof_field and offsetofend macros from individual test files to the common bpf_misc.h to avoid duplication. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/97a3f3788bd3aec309100bc073a5c77130e371fd.1758094761.git.paul.chaignon@gmail.com	2025-09-17 16:15:37 +02:00
Paul Chaignon	6fabca2fc9	bpf: Explicitly check accesses to bpf_sock_addr Syzkaller found a kernel warning on the following sock_addr program: 0: r0 = 0 1: r2 = (u32 )(r1 +60) 2: exit which triggers: verifier bug: error during ctx access conversion (0) This is happening because offset 60 in bpf_sock_addr corresponds to an implicit padding of 4 bytes, right after msg_src_ip4. Access to this padding isn't rejected in sock_addr_is_valid_access and it thus later fails to convert the access. This patch fixes it by explicitly checking the various fields of bpf_sock_addr in sock_addr_is_valid_access. I checked the other ctx structures and is_valid_access functions and didn't find any other similar cases. Other cases of (properly handled) padding are covered in new tests in a subsequent patch. Fixes: `1cedee13d2` ("bpf: Hooks for sys_sendmsg") Reported-by: syzbot+136ca59d411f92e821b7@syzkaller.appspotmail.com Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://syzkaller.appspot.com/bug?extid=136ca59d411f92e821b7 Link: https://lore.kernel.org/bpf/b58609d9490649e76e584b0361da0abd3c2c1779.1758094761.git.paul.chaignon@gmail.com	2025-09-17 16:15:17 +02:00
Eduard Zingerman	b13448dd64	bpf: potential double-free of env->insn_aux_data Function bpf_patch_insn_data() has the following structure: static struct bpf_prog bpf_patch_insn_data(... env ...) { struct bpf_prog new_prog; struct bpf_insn_aux_data *new_data = NULL; if (len > 1) { new_data = vrealloc(...); // <--------- (1) if (!new_data) return NULL; env->insn_aux_data = new_data; // <---- (2) } new_prog = bpf_patch_insn_single(env->prog, off, patch, len); if (IS_ERR(new_prog)) { ... vfree(new_data); // <----------------- (3) return NULL; } ... happy path ... } In case if bpf_patch_insn_single() returns an error the `new_data` allocated at (1) will be freed at (3). However, at (2) this pointer is stored in `env->insn_aux_data`. Which is freed unconditionally by verifier.c:bpf_check() on both happy and error paths. Thus, leading to double-free. Fix this by removing vfree() call at (3), ownership over `new_data` is already passed to `env->insn_aux_data` at this point. Fixes: `77620d1267` ("bpf: use realloc in bpf_patch_insn_data") Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250912-patch-insn-data-double-free-v1-1-af05bd85a21a@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 13:04:21 -07:00
Alan Maguire	3ae4c52708	selftests/bpf: More open-coded gettid syscall cleanup Commit `0e2fb011a0` ("selftests/bpf: Clean up open-coded gettid syscall invocations") addressed the issue that older libc may not have a gettid() function call wrapper for the associated syscall. A few more instances have crept into tests, use sys_gettid() instead, and poison raw gettid() usage to avoid future issues. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20250911163056.543071-1-alan.maguire@oracle.com	2025-09-15 20:39:24 +02:00
Alexei Starovoitov	61ee2cce3f	Merge branch 'remove-use-of-current-cgns-in-bpf_cgroup_from_id' Kumar Kartikeya Dwivedi says: ==================== Remove use of current->cgns in bpf_cgroup_from_id bpf_cgroup_from_id currently ends up doing a check on whether the cgroup being looked up is a descendant of the root cgroup of the current task's cgroup namespace. This leads to unreliable results since this kfunc can be invoked from any arbitrary context, for any arbitrary value of current. Fix this by removing namespace-awarness in the kfunc, and include a test that detects such a case and fails without the fix. Changelog: ---------- v2 -> v3 v2: https://lore.kernel.org/bpf/20250811195901.1651800-1-memxor@gmail.com * Refactor cgroup_get_from_id into non-ns version. (Andrii) * Address nits from Eduard. v1 -> v2 v1: https://lore.kernel.org/bpf/20250811175045.1055202-1-memxor@gmail.com * Add Ack from Tejun. * Fix selftest to perform namespace migration and cgroup setup in a child process to avoid changing test_progs namespace. ==================== Link: https://patch.msgid.link/20250915032618.1551762-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 10:53:15 -07:00
Kumar Kartikeya Dwivedi	a8250d167c	selftests/bpf: Add a test for bpf_cgroup_from_id lookup in non-root cgns Make sure that we only switch the cgroup namespace and enter a new cgroup in a child process separate from test_progs, to not mess up the environment for subsequent tests. To remove this cgroup, we need to wait for the child to exit, and then rmdir its cgroup. If the read call fails, or waitpid succeeds, we know the child exited (read call would fail when the last pipe end is closed, otherwise waitpid waits until exit(2) is called). We then invoke a newly introduced remove_cgroup_pid() helper, that identifies cgroup path using the passed in pid of the now dead child, instead of using the current process pid (getpid()). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250915032618.1551762-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 10:53:15 -07:00
Kumar Kartikeya Dwivedi	2c89513395	bpf: Do not limit bpf_cgroup_from_id to current's namespace The bpf_cgroup_from_id kfunc relies on cgroup_get_from_id to obtain the cgroup corresponding to a given cgroup ID. This helper can be called in a lot of contexts where the current thread can be random. A recent example was its use in sched_ext's ops.tick(), to obtain the root cgroup pointer. Since the current task can be whatever random user space task preempted by the timer tick, this makes the behavior of the helper unreliable. Refactor out __cgroup_get_from_id as the non-namespace aware version of cgroup_get_from_id, and change bpf_cgroup_from_id to make use of it. There is no compatibility breakage here, since changing the namespace against which the lookup is being done to the root cgroup namespace only permits a wider set of lookups to succeed now. The cgroup IDs across namespaces are globally unique, and thus don't need to be retranslated. Reported-by: Dan Schatzberg <dschatzberg@meta.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250915032618.1551762-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 10:53:15 -07:00
Saket Kumar Bhaskar	a9d4e9f0e8	selftests/bpf: Fix arena_spin_lock selftest failure For systems having CONFIG_NR_CPUS set to > 1024 in kernel config the selftest fails as arena_spin_lock_irqsave() returns EOPNOTSUPP. (eg - incase of powerpc default value for CONFIG_NR_CPUS is 8192) The selftest is skipped incase bpf program returns EOPNOTSUPP, with a descriptive message logged. Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Link: https://lore.kernel.org/r/20250913091337.1841916-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 10:49:18 -07:00
Leon Hwang	f7528e4412	selftests/bpf: Skip timer_interrupt case when bpf_timer is not supported Like commit `fbdd61c94b` ("selftests/bpf: Skip timer cases when bpf_timer is not supported"), 'timer_interrupt' test case should be skipped if verifier rejects bpf_timer with returning -EOPNOTSUPP. cd tools/testing/selftests/bpf ./test_progs -t timer 461 timer_interrupt:SKIP Summary: 6/0 PASSED, 7 SKIPPED, 0 FAILED Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250915121657.28084-1-leon.hwang@linux.dev	2025-09-15 10:22:58 -07:00
Quentin Monnet	32d376610b	bpftool: Search for tracefs at /sys/kernel/tracing first With "bpftool prog tracelog", bpftool prints messages from the trace pipe. To do so, it first needs to find the tracefs mount point to open the pipe. Bpftool looks at a few "default" locations, including /sys/kernel/debug/tracing and /sys/kernel/tracing. Some of these locations, namely /tracing and /trace, are not standard. They are in the list because some users used to hardcode the tracing directory to short names; but we have no compelling reason to look at these locations. If we fail to find the tracefs at the default locations, we have an additional step to find it by parsing /proc/mounts anyway, so it's safe to remove these entries from the list of default locations to check. Additionally, Alexei reports that looking for the tracefs at /sys/kernel/debug/tracing may automatically mount the file system under that location, and generate a kernel log message telling that auto-mounting there is deprecated. To avoid this message, let's swap the order for checking the potential mount points: try /sys/kernel/tracing first, which should be the standard location nowadays. The kernel log message may still appear if the tracefs is not mounted on /sys/kernel/tracing when we run bpftool. Reported-by: Alexei Starovoitov <ast@kernel.org> Closes: https://lore.kernel.org/r/CAADnVQLcMi5YQhZKsU4z3S2uVUAGu_62C33G2Zx_ruG3uXa-Ug@mail.gmail.com/ Signed-off-by: Quentin Monnet <qmo@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20250915134209.36568-1-qmo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 10:10:28 -07:00
Hengqi Chen	fd2e081289	riscv, bpf: Sign extend struct ops return values properly The ns_bpf_qdisc selftest triggers a kernel panic: Unable to handle kernel paging request at virtual address ffffffffa38dbf58 Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 Oops [#1] Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 [...] [last unloaded: bpf_testmod(OE)] CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 epc : __qdisc_run+0x82/0x6f0 ra : __qdisc_run+0x6e/0x6f0 epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 t5 : 0000000000000000 t6 : ff60000093a6a8b6 status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d [<ffffffff80bd5c7a>] __qdisc_run+0x82/0x6f0 [<ffffffff80b6fe58>] __dev_queue_xmit+0x4c0/0x1128 [<ffffffff80b80ae0>] neigh_resolve_output+0xd0/0x170 [<ffffffff80d2daf6>] ip6_finish_output2+0x226/0x6c8 [<ffffffff80d31254>] ip6_finish_output+0x10c/0x2a0 [<ffffffff80d31446>] ip6_output+0x5e/0x178 [<ffffffff80d2e232>] ip6_xmit+0x29a/0x608 [<ffffffff80d6f4c6>] inet6_csk_xmit+0xe6/0x140 [<ffffffff80c985e4>] __tcp_transmit_skb+0x45c/0xaa8 [<ffffffff80c995fe>] tcp_connect+0x9ce/0xd10 [<ffffffff80d66524>] tcp_v6_connect+0x4ac/0x5e8 [<ffffffff80cc19b8>] __inet_stream_connect+0xd8/0x318 [<ffffffff80cc1c36>] inet_stream_connect+0x3e/0x68 [<ffffffff80b42b20>] __sys_connect_file+0x50/0x88 [<ffffffff80b42bee>] __sys_connect+0x96/0xc8 [<ffffffff80b42c40>] __riscv_sys_connect+0x20/0x30 [<ffffffff80e5bcae>] do_trap_ecall_u+0x256/0x378 [<ffffffff80e69af2>] handle_exception+0x14a/0x156 Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 ---[ end trace 0000000000000000 ]--- The bpf_fifo_dequeue prog returns a skb which is a pointer. The pointer is treated as a 32bit value and sign extend to 64bit in epilogue. This behavior is right for most bpf prog types but wrong for struct ops which requires RISC-V ABI. So let's sign extend struct ops return values according to the function model and RISC-V ABI([0]). [0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf Fixes: `25ad10658d` ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/bpf/20250908012448.1695-1-hengqi.chen@gmail.com	2025-09-15 13:21:32 +02:00
Hengqi Chen	6798668ab1	riscv, bpf: Remove duplicated bpf_flush_icache() The bpf_flush_icache() is done by bpf_arch_text_copy() already. Remove the duplicated one in arch_prepare_bpf_trampoline(). Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/bpf/20250904105119.21861-1-hengqi.chen@gmail.com	2025-09-15 13:16:55 +02:00
Alexei Starovoitov	a578b54a8a	Merge branch 'bpf-report-arena-faults-to-bpf-streams' Puranjay Mohan says: ==================== bpf: report arena faults to BPF streams Changes in v6->v7: v6: https://lore.kernel.org/all/20250908163638.23150-1-puranjay@kernel.org/ - Added comments about the usage of arena_reg in x86 and arm64 jits. (Alexei) - Used clear_lo32() for clearing the lower 32-bits of user_vm_start. (Alexei) - Moved update of the old tests to use __stderr to a separate commit (Eduard) - Used test__skip() in prog_tests/stream.c (Eduard) - Start a sub-test for read / write Changes in v5->v6: v5: https://lore.kernel.org/all/20250901193730.43543-1-puranjay@kernel.org/ - Introduces __stderr and __stdout for easy testing of bpf streams (Eduard) - Add more test cases for arena fault reporting (subprog and callback) - Fix main_prog_aux usage and return main_prog from find_from_stack_cb (Kumar) - Properly fix the build issue reported by kernel test robot Changes in v4->v5: v4: https://lore.kernel.org/all/20250827153728.28115-1-puranjay@kernel.org/ - Added patch 2 to introducing main_prog_aux for easier access to streams. - Fixed bug in fault handlers when arena_reg == dst_reg - Updated selftest to check test above edge case. - Added comments about the usage of barrier_var() in code and commit message. Changes in v3->v4: v3: https://lore.kernel.org/all/20250827150113.15763-1-puranjay@kernel.org/ - Fixed a build issue when CONFIG_BPF_JIT=y and # CONFIG_BPF_SYSCALL is not set Changes in v2->v3: v2: https://lore.kernel.org/all/20250811111828.13836-1-puranjay@kernel.org/ - Improved the selftest to check the exact fault address - Dropped BPF_NO_KFUNC_PROTOTYPES and bpf_arena_alloc/free_pages() usage - Rebased on bpf-next/master Changes in v1->v2: v1: https://lore.kernel.org/all/20250806085847.18633-1-puranjay@kernel.org/ - Changed variable and mask names for consistency (Yonghong) - Added Acked-by: Yonghong Song <yonghong.song@linux.dev> on two patches This set adds the support of reporting page faults inside arena to BPF stderr stream. The reported address is the one that a user would expect to see if they pass it to bpf_printk(); Here is an example output from the stderr stream and bpf_printk() ERROR: Arena WRITE access at unmapped address 0xdeaddead0000 CPU: 9 UID: 0 PID: 502 Comm: test_progs Call trace: bpf_stream_stage_dump_stack+0xc0/0x150 bpf_prog_report_arena_violation+0x98/0xf0 ex_handler_bpf+0x5c/0x78 fixup_exception+0xf8/0x160 __do_kernel_fault+0x40/0x188 do_bad_area+0x70/0x88 do_translation_fault+0x54/0x98 do_mem_abort+0x4c/0xa8 el1_abort+0x44/0x70 el1h_64_sync_handler+0x50/0x108 el1h_64_sync+0x6c/0x70 bpf_prog_a64a9778d31b8e88_stream_arena_write_fault+0x84/0xc8 *(page) = 1; @ stream.c:100 bpf_prog_test_run_syscall+0x100/0x328 __sys_bpf+0x508/0xb98 __arm64_sys_bpf+0x2c/0x48 invoke_syscall+0x50/0x120 el0_svc_common.constprop.0+0x48/0xf8 do_el0_svc+0x28/0x40 el0_svc+0x48/0xf8 el0t_64_sync_handler+0xa0/0xe8 el0t_64_sync+0x198/0x1a0 Same address is printed by bpf_printk(): 1389.078831: bpf_trace_printk: Read Address: 0xdeaddead0000 To make this possible, some extra metadata has to be passed to the bpf exception handler, so the bpf exception handling mechanism for both x86-64 and arm64 have been improved in this set. The streams selftest has been updated to test this new feature. ==================== Link: https://patch.msgid.link/20250911145808.58042-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:52 -07:00
Puranjay Mohan	86f2225065	selftests/bpf: Add tests for arena fault reporting Add selftests for testing the reporting of arena page faults through BPF streams. Two new bpf programs are added that read and write to an unmapped arena address and the fault reporting is verified in the userspace through streams. The added bpf programs need to access the user_vm_start in struct bpf_arena, this is done by casting &arena to struct bpf_arena *, but barrier_var() is used on this ptr before accessing ptr->user_vm_start; to stop GCC from issuing an out-of-bound access due to the cast from smaller map struct to larger "struct bpf_arena" Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250911145808.58042-7-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:44 -07:00
Puranjay Mohan	edd03fcd76	selftests: bpf: use __stderr in stream error tests Start using __stderr directly in the bpf programs to test the reporting of may_goto timeout detection and spin_lock dead lock detection. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250911145808.58042-6-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:44 -07:00
Puranjay Mohan	744eeb2b27	selftests: bpf: introduce __stderr and __stdout Add __stderr and __stdout to validate the output of BPF streams for bpf selftests. Similar to __xlated, __jited, etc., __stderr/out can be used in the BPF progs to compare a string (regex supported) to the output in the bpf streams. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250911145808.58042-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:44 -07:00
Puranjay Mohan	5c5240d020	bpf: Report arena faults to BPF stderr Begin reporting arena page faults and the faulting address to BPF program's stderr, this patch adds support in the arm64 and x86-64 JITs, support for other archs can be added later. The fault handlers receive the 32 bit address in the arena region so the upper 32 bits of user_vm_start is added to it before printing the address. This is what the user would expect to see as this is what is printed by bpf_printk() is you pass it an address returned by bpf_arena_alloc_pages(); Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250911145808.58042-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:43 -07:00
Puranjay Mohan	70f23546d2	bpf: core: introduce main_prog_aux for stream access BPF streams are only valid for the main programs, to make it easier to access streams from subprogs, introduce main_prog_aux in struct bpf_prog_aux. prog->aux->main_prog_aux = prog->aux, for main programs and prog->aux->main_prog_aux = main_prog->aux, for subprograms. Make bpf_prog_find_from_stack() use the added main_prog_aux to return the mainprog when a subprog is found on the stack. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250911145808.58042-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:43 -07:00
Puranjay Mohan	0460484244	bpf: arm64: simplify exception table handling BPF loads with BPF_PROBE_MEM(SX) can load from unsafe pointers and the JIT adds an exception table entry for the JITed instruction which allows the exeption handler to set the destination register of the load to zero and continue execution from the next instruction. As all arm64 instructions are AARCH64_INSN_SIZE size, the exception handler can just increment the pc by AARCH64_INSN_SIZE without needing the exact address of the instruction following the the faulting instruction. Simplify the exception table usage in arm64 JIT by only saving the destination register in ex->fixup and drop everything related to the fixup_offset. The fault handler is modified to add AARCH64_INSN_SIZE to the pc. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20250911145808.58042-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:43 -07:00
Alexei Starovoitov	5d87e96a49	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5 Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 09:34:37 -07:00
Linus Torvalds	e59a039119	Merge tag 's390-6.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - ptep_modify_prot_start() may be called in a loop, which might lead to the preempt_count overflow due to the unnecessary preemption disabling. Do not disable preemption to prevent the overflow - Events of type PERF_TYPE_HARDWARE are not tested for sampling and return -EOPNOTSUPP eventually. Instead, deny all sampling events by CPUMF counter facility and return -ENOENT to allow other PMUs to be tried - The PAI PMU driver returns -EINVAL if an event out of its range. That aborts a search for an alternative PMU driver. Instead, return -ENOENT to allow other PMUs to be tried * tag 's390-6.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cpum_cf: Deny all sampling events by counter PMU s390/pai: Deny all events not handled by this PMU s390/mm: Prevent possible preempt_count overflow	2025-09-11 08:46:30 -07:00
Linus Torvalds	a1228f048a	Merge tag 'pm-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a nasty hibernation regression introduced during the 6.16 cycle, an issue related to energy model management occurring on Intel hybrid systems where some CPUs are offline to start with, and two regressions in the amd-pstate driver: - Restore a pm_restrict_gfp_mask() call in hibernation_snapshot() that was removed incorrectly during the 6.16 development cycle (Rafael Wysocki) - Introduce a function for registering a perf domain without triggering a system-wide CPU capacity update and make the intel_pstate driver use it to avoid reocurring unsuccessful attempts to update capacities of all CPUs in the system (Rafael Wysocki) - Fix setting of CPPC.min_perf in the active mode with performance governor in the amd-pstate driver to restore its expected behavior changed recently (Gautham Shenoy) - Avoid mistakenly setting EPP to 0 in the amd-pstate driver after system resume as a result of recent code changes (Mario Limonciello)" * tag 'pm-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: hibernate: Restrict GFP mask in hibernation_snapshot() PM: EM: Add function for registering a PD without capacity update cpufreq/amd-pstate: Fix a regression leading to EPP 0 after resume cpufreq/amd-pstate: Fix setting of CPPC.min_perf in active mode for performance governor	2025-09-11 08:11:16 -07:00
Linus Torvalds	b10c31b70b	Merge tag 'for-6.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix delayed inode tracking in xarray, eviction can race with insertion and leave behind a disconnected inode - on systems with large page (64K) and small block size (4K) fix compression read that can return partially filled folio - slightly relax compression option format for backward compatibility, allow to specify level for LZO although there's only one - fix simple quota accounting of compressed extents - validate minimum device size in 'device add' - update maintainers' entry * tag 'for-6.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: don't allow adding block device of less than 1 MB MAINTAINERS: update btrfs entry btrfs: fix subvolume deletion lockup caused by inodes xarray race btrfs: fix corruption reading compressed range when block size is smaller than page size btrfs: accept and ignore compression level for lzo btrfs: fix squota compressed stats leak	2025-09-11 08:01:18 -07:00
Linus Torvalds	02ffd6f89c	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: "A number of fixes accumulated due to summer vacations - Fix out-of-bounds dynptr write in bpf_crypto_crypt() kfunc which was misidentified as a security issue (Daniel Borkmann) - Update the list of BPF selftests maintainers (Eduard Zingerman) - Fix selftests warnings with icecc compiler (Ilya Leoshkevich) - Disable XDP/cpumap direct return optimization (Jesper Dangaard Brouer) - Fix unexpected get_helper_proto() result in unusual configuration BPF_SYSCALL=y and BPF_EVENTS=n (Jiri Olsa) - Allow fallback to interpreter when JIT support is limited (KaFai Wan) - Fix rqspinlock and choose trylock fallback for NMI waiters. Pick the simplest fix. More involved fix is targeted bpf-next (Kumar Kartikeya Dwivedi) - Fix cleanup when tcp_bpf_send_verdict() fails to allocate psock->cork (Kuniyuki Iwashima) - Disallow bpf_timer in PREEMPT_RT for now. Proper solution is being discussed for bpf-next. (Leon Hwang) - Fix XSK cq descriptor production (Maciej Fijalkowski) - Tell memcg to use allow_spinning=false path in bpf_timer_init() to avoid lockup in cgroup_file_notify() (Peilin Ye) - Fix bpf_strnstr() to handle suffix match cases (Rong Tao)" * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Skip timer cases when bpf_timer is not supported bpf: Reject bpf_timer for PREEMPT_RT tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork. bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init() bpf: Allow fall back to interpreter for programs with stack size <= 512 rqspinlock: Choose trylock fallback for NMI waiters xsk: Fix immature cq descriptor production bpf: Update the list of BPF selftests maintainers selftests/bpf: Add tests for bpf_strnstr selftests/bpf: Fix "expression result unused" warnings with icecc bpf: Fix bpf_strnstr() to handle suffix match cases better selftests/bpf: Extend crypto_sanity selftest with invalid dst buffer bpf: Fix out-of-bounds dynptr write in bpf_crypto_crypt bpf: Check the helper function is valid in get_helper_proto bpf, cpumap: Disable page_pool direct xdp_return need larger scope	2025-09-11 07:54:16 -07:00

1 2 3 4 5 ...

1383251 Commits