Merge branch 'bpf-fix-call-offset-truncation-and-oob-read-in-bpf_patch_call_args'

Yazhou Tang says:

====================
bpf: Fix call offset truncation and OOB read in bpf_patch_call_args()

From: Yazhou Tang <tangyazhou518@outlook.com>

This patchset addresses a silent truncation bug in the BPF verifier that
occurs when a bpf-to-bpf call involves a massive relative jump offset.
Additionally, it fixes a pre-existing out-of-bounds (OOB) read issue
in the interpreter fallback path.

Because the BPF instruction set utilizes a 32-bit imm field for bpf-to-bpf
calls, implicitly downcasting it to the 16-bit insn->off in bpf_patch_call_args()
causes incorrect call targets or subprog ID resolution for large BPF programs.
While fixing this by swapping the imm and off fields, it was discovered that
the original code also had a load-time OOB read vulnerability when the stack
depth exceeds MAX_BPF_STACK during JIT fallback.

Patch 1/3 fixes the pre-existing OOB read in bpf_patch_call_args(). It
changes the function to return an int and explicitly rejects the JIT
fallback if the stack depth exceeds MAX_BPF_STACK, preventing a potential
stack buffer overflow.

Patch 2/3 fixes the s16 truncation bug.
1. Keep the original imm field unchanged and use the off field to store
   the interpreter function index.
2. Adjust the JMP_CALL_ARGS case in ___bpf_prog_run() accordingly.
3. Restore the legacy xlated dump layout in bpf_insn_prepare_dump().

Patch 3/3 introduces a selftest for this fix.
---

Change log:

v10:
1. Make the error log in patch 1/3 more clear. (Kuohai)
2. Drop bpftool and disasm_helpers.c changes, and instead restore the
   legacy xlated dump layout in bpf_insn_prepare_dump(). This avoids
   requiring bpftool compatibility handling. (Quentin and Alexei)

v9: https://lore.kernel.org/bpf/20260429171904.107244-1-tangyazhou@zju.edu.cn/
1. Modify the selftest in patch 3/3: use __clobber_all in inline asm.
   (Sashiko AI reviewer)

v8: https://lore.kernel.org/bpf/20260429105608.92741-1-tangyazhou@zju.edu.cn/
1. Update cfg_partition_funcs() in bpftool to use insn->imm for call target
   calculation. (Sashiko AI reviewer)
2. Modify the selftest in patch 3/3: add a large padding before the call
   instruction, preventing the kernel panic on kernel without the fix.
   (Sashiko AI reviewer)
3. Modify the selftest in patch 3/3: make it more clear.

v7: https://lore.kernel.org/bpf/20260421144504.823756-1-tangyazhou@zju.edu.cn/
1. Rebase the patchset to the bpf-next tree to resolve the apply conflict.
   (Alexei)
2. Add Patch 1/3 to properly fix a pre-existing OOB read in bpf_patch_call_args().
   (Sashiko AI reviewer)

v6: https://lore.kernel.org/bpf/20260412170334.716778-1-tangyazhou@zju.edu.cn/
1. Use a different but clearer approach to resolve this issue: keeping
   the original imm field unchanged and using the off field to store the
   interpreter function index. (Kuohai)
2. Update the related dumper code and remove a previous workaround in the
   selftests disasm helpers, which is no longer needed after this fix.

v5: https://lore.kernel.org/bpf/20260326090133.221957-1-tangyazhou@zju.edu.cn/
1. Some minor changes in commit messages. (AI Reviewer)

v4: https://lore.kernel.org/bpf/20260326063329.10031-1-tangyazhou@zju.edu.cn/
1. Remove some redundant commit messages of patch 2/3. (Emil)
2. Change the number of instructions in padding_subprog() from 200,000
   to 32,765, which is the minimum number of instructions required to
   trigger the verifier failure. (Emil)

v3: https://lore.kernel.org/bpf/20260323122254.98540-1-tangyazhou@zju.edu.cn/
1. Resend to fix a typo in v2 and add "Fixes" tag. The rest of the changes
   are identical to v2.

v2 (incorrect): https://lore.kernel.org/bpf/20260323081748.106603-1-tangyazhou@zju.edu.cn/
1. Move the s16 boundary check from fixup_call_args() to bpf_patch_call_args(),
   and change the return type of bpf_patch_call_args() to int. (Emil)
2. Add Patch 3/3 to fix the incorrect subprog ID in dumped bpf_pseudo_call
   instructions, which is caused by the same truncation issue. (Puranjay)
3. Refine the new selftest for clarity and add detailed comments explaining
   the test design. (Emil)

v1: https://lore.kernel.org/bpf/20260316190220.113417-1-tangyazhou@zju.edu.cn/
====================

Link: https://patch.msgid.link/20260506094714.419842-1-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commit is contained in:
Alexei Starovoitov
2026-05-11 08:27:02 -07:00
7 changed files with 129 additions and 16 deletions

View File

@@ -2917,7 +2917,13 @@ int bpf_check_uarg_tail_zero(bpfptr_t uaddr, size_t expected_size,
int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size);
#ifndef CONFIG_BPF_JIT_ALWAYS_ON
void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
int bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
s32 bpf_call_args_imm(s16 idx);
#else
static inline s32 bpf_call_args_imm(s16 idx)
{
return 0;
}
#endif
struct btf *bpf_get_btf_vmlinux(void);

View File

@@ -1151,9 +1151,6 @@ bool sk_filter_charge(struct sock *sk, struct sk_filter *fp);
void sk_filter_uncharge(struct sock *sk, struct sk_filter *fp);
u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
#define __bpf_call_base_args \
((u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)) \
(void *)__bpf_call_base)
struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog);
void bpf_jit_compile(struct bpf_prog *prog);

View File

@@ -1771,6 +1771,9 @@ static u32 abs_s32(s32 x)
return x >= 0 ? (u32)x : -(u32)x;
}
static u64 (*interpreters_args[])(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5,
const struct bpf_insn *insn);
/**
* ___bpf_prog_run - run eBPF program on a given context
* @regs: is the array of MAX_BPF_EXT_REG eBPF pseudo-registers
@@ -2077,10 +2080,9 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
CONT;
JMP_CALL_ARGS:
BPF_R0 = (__bpf_call_base_args + insn->imm)(BPF_R1, BPF_R2,
BPF_R3, BPF_R4,
BPF_R5,
insn + insn->off + 1);
BPF_R0 = interpreters_args[insn->off](BPF_R1, BPF_R2, BPF_R3,
BPF_R4, BPF_R5,
insn + insn->imm + 1);
CONT;
JMP_TAIL_CALL: {
@@ -2394,13 +2396,22 @@ EVAL4(PROG_NAME_LIST, 416, 448, 480, 512)
#undef PROG_NAME_LIST
#ifdef CONFIG_BPF_SYSCALL
void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth)
int bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth)
{
stack_depth = max_t(u32, stack_depth, 1);
insn->off = (s16) insn->imm;
insn->imm = interpreters_args[(round_up(stack_depth, 32) / 32) - 1] -
__bpf_call_base_args;
/* Prevent out-of-bounds read to interpreters_args */
if (stack_depth > MAX_BPF_STACK)
return -EINVAL;
insn->off = (round_up(stack_depth, 32) / 32) - 1;
insn->code = BPF_JMP | BPF_CALL_ARGS;
return 0;
}
s32 bpf_call_args_imm(s16 idx)
{
if (WARN_ON_ONCE(idx < 0 || idx >= ARRAY_SIZE(interpreters_args)))
return 0;
return BPF_CALL_IMM(interpreters_args[idx]);
}
#endif
#endif

View File

@@ -1250,9 +1250,9 @@ static int jit_subprogs(struct bpf_verifier_env *env)
}
if (!bpf_pseudo_call(insn))
continue;
insn->off = env->insn_aux_data[i].call_imm;
subprog = bpf_find_subprog(env, i + insn->off + 1);
insn->imm = subprog;
insn->imm = env->insn_aux_data[i].call_imm;
subprog = bpf_find_subprog(env, i + insn->imm + 1);
insn->off = subprog;
}
prog->jited = 1;
@@ -1416,7 +1416,12 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env)
depth = get_callee_stack_depth(env, insn, i);
if (depth < 0)
return depth;
bpf_patch_call_args(insn, depth);
err = bpf_patch_call_args(insn, depth);
if (err) {
verbose(env, "stack depth %d exceeds interpreter stack depth limit\n",
depth);
return err;
}
}
err = 0;
#endif

View File

@@ -4919,6 +4919,29 @@ static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog,
return map;
}
static void prepare_dump_pseudo_call(struct bpf_insn *insn)
{
s32 call_off = insn->imm;
/*
* BPF_CALL_ARGS only exists for interpreter fallback.
* 1. For interpreter (BPF_CALL_ARGS): insn->off is the index of
* interpreters_args array, so here using bpf_call_args_imm()
* to get the real address offset.
* 2. For JIT (BPF_CALL): insn->off is the subprog id.
*/
if (insn->code == (BPF_JMP | BPF_CALL_ARGS))
insn->imm = bpf_call_args_imm(insn->off);
else
insn->imm = insn->off;
/* Avoid dumping a truncated and misleading pc-relative offset. */
if (call_off > S16_MAX || call_off < S16_MIN)
insn->off = 0;
else
insn->off = call_off;
}
static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog,
const struct cred *f_cred)
{
@@ -4944,6 +4967,9 @@ static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog,
}
if (code == (BPF_JMP | BPF_CALL) ||
code == (BPF_JMP | BPF_CALL_ARGS)) {
/* Restore the legacy xlated dump layout. */
if (insns[i].src_reg == BPF_PSEUDO_CALL)
prepare_dump_pseudo_call(&insns[i]);
if (code == (BPF_JMP | BPF_CALL_ARGS))
insns[i].code = BPF_JMP | BPF_CALL;
if (!bpf_dump_raw_ok(f_cred))

View File

@@ -22,6 +22,7 @@
#include "verifier_bswap.skel.h"
#include "verifier_btf_ctx_access.skel.h"
#include "verifier_btf_unreliable_prog.skel.h"
#include "verifier_call_large_imm.skel.h"
#include "verifier_cfg.skel.h"
#include "verifier_cgroup_inv_retcode.skel.h"
#include "verifier_cgroup_skb.skel.h"
@@ -170,6 +171,7 @@ void test_verifier_bpf_trap(void) { RUN(verifier_bpf_trap); }
void test_verifier_bswap(void) { RUN(verifier_bswap); }
void test_verifier_btf_ctx_access(void) { RUN(verifier_btf_ctx_access); }
void test_verifier_btf_unreliable_prog(void) { RUN(verifier_btf_unreliable_prog); }
void test_verifier_call_large_imm(void) { RUN(verifier_call_large_imm); }
void test_verifier_cfg(void) { RUN(verifier_cfg); }
void test_verifier_cgroup_inv_retcode(void) { RUN(verifier_cgroup_inv_retcode); }
void test_verifier_cgroup_skb(void) { RUN(verifier_cgroup_skb); }

View File

@@ -0,0 +1,66 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
int call_happened = 0;
/*
* 32765 is the exact minimum number of padding instructions needed to
* trigger the verifier failure, because:
* 1. Counting the wrapper instructions around the padding block (one
* "r0=0" and two "exit" instructions), the actual jump distance
* evaluates to N + 3.
* 2. To overflow the s16 max bound (32767), we need N + 3 > 32767.
* Thus, N = 32765 is the exact minimum padding size required.
*/
static __attribute__((noinline)) void padding_subprog(void)
{
asm volatile (
"r0 = 0;"
".rept 32765;"
"r0 += 0;"
".endr;"
::: __clobber_all);
}
static __attribute__((noinline)) int target_subprog(void)
{
/* Use volatile variable here to prevent optimization. */
volatile int magic_ret = 3;
return magic_ret;
}
SEC("syscall")
__success __retval(3)
int call_large_imm_test(void *ctx)
{
/*
* Landing pad to handle call error on kernel without the fix,
* preventing kernel panic.
*/
asm volatile (
"r0 = 0;"
".rept 32768;"
"r0 += 0;"
".endr;"
::: __clobber_all);
/*
* The call_happened variable is 1 only when the call insn wrongly
* go back to the landing pad above.
*/
if (call_happened == 1) {
/* Use volatile variable here to prevent optimization. */
volatile int flag = -1;
return flag;
}
call_happened = 1;
padding_subprog();
return target_subprog();
}
char LICENSE[] SEC("license") = "GPL";