Merge branch 'introduce-load-acquire-and-store-release-bpf-instructions'

Peilin Ye says:

====================
Introduce load-acquire and store-release BPF instructions

This patchset adds kernel support for BPF load-acquire and store-release
instructions (for background, please see [1]), including core/verifier
and arm64/x86-64 JIT compiler changes, as well as selftests.  Support
for riscv64 is also planned.  The corresponding LLVM changes can be
found at:

  https://github.com/llvm/llvm-project/pull/108636
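
To give a feel for the new instructions, here is a minimal sketch (not
part of the patches themselves; register choices are arbitrary) using
the BPF_ATOMIC_OP() macro from include/linux/filter.h, the same way the
selftests below build them:

  struct bpf_insn insns[] = {
          /* r0 = load_acquire((u64 *)(r1 + 0)) */
          BPF_ATOMIC_OP(BPF_DW, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_1, 0),
          /* store_release((u32 *)(r2 + 8), w0) */
          BPF_ATOMIC_OP(BPF_W, BPF_STORE_REL, BPF_REG_2, BPF_REG_0, 8),
  };

Both map to smp_load_acquire()/smp_store_release() semantics in the
interpreter and the JITs.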

The first 3 patches from v4 have already been applied:

  - [bpf-next,v4,01/10] bpf/verifier: Factor out atomic_ptr_type_ok()
    https://git.kernel.org/bpf/bpf-next/c/b2d9ef71d4c9
  - [bpf-next,v4,02/10] bpf/verifier: Factor out check_atomic_rmw()
    https://git.kernel.org/bpf/bpf-next/c/d430c46c7580
  - [bpf-next,v4,03/10] bpf/verifier: Factor out check_load_mem() and check_store_reg()
    https://git.kernel.org/bpf/bpf-next/c/d38ad248fb7a

Please refer to the LLVM PR and individual kernel patches for details.
Thanks!

v5: https://lore.kernel.org/all/cover.1741046028.git.yepeilin@google.com/
v5..v6 change:

  o (Alexei) avoid using #ifndef in verifier.c

v4: https://lore.kernel.org/bpf/cover.1740978603.git.yepeilin@google.com/
v4..v5 notable changes:

  o (kernel test robot) for 32-bit arches: make the verifier reject
                        64-bit load-acquires/store-releases, and fix a
                        build error in the interpreter changes
    * tested the ARCH=arc build following instructions from the kernel
      test robot
  o (Alexei) drop Documentation/ patch (v4 10/10) for now

v3: https://lore.kernel.org/bpf/cover.1740009184.git.yepeilin@google.com/
v3..v4 notable changes:

  o (Alexei) add x86-64 JIT support (including arena)
  o add Acked-by: tags from Xu

v2: https://lore.kernel.org/bpf/cover.1738888641.git.yepeilin@google.com/
v2..v3 notable changes:

  o (Alexei) change encoding to BPF_LOAD_ACQ=0x100, BPF_STORE_REL=0x110
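    * see the field-level encoding sketch after this list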
  o add Acked-by: tags from Ilya and Eduard
  o make new selftests depend on:
    * __clang_major__ >= 18, and
    * ENABLE_ATOMICS_TESTS is defined (currently this means -mcpu=v3 or
      v4), and
    * JIT supports load_acq/store_rel (currently only arm64)
  o work around an llvm-17 CI job failure by conditionally defining
    __arena_global variables as 64-bit if __clang_major__ < 18, to make
    sure .addr_space.1 has no holes
  o add Google copyright notice in new files
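
As a field-level sketch of the new encoding (illustrative only;
register choices are arbitrary), a 64-bit load-acquire is a
BPF_STX | BPF_ATOMIC | BPF_DW instruction whose imm field carries the
new opcode:

  struct bpf_insn insn = {
          .code    = BPF_STX | BPF_ATOMIC | BPF_DW,
          .dst_reg = BPF_REG_0,    /* receives the loaded value */
          .src_reg = BPF_REG_1,    /* holds the source address  */
          .off     = 0,
          .imm     = BPF_LOAD_ACQ, /* 0x100; store-release uses BPF_STORE_REL, 0x110 */
  };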

v1: https://lore.kernel.org/all/cover.1737763916.git.yepeilin@google.com/
v1..v2 notable changes:

  o (Eduard) for x86 and s390, make
             bpf_jit_supports_insn(..., /*in_arena=*/true) return false
             for load_acq/store_rel
  o add Eduard's Acked-by: tag
  o (Eduard) extract LDX and non-ATOMIC STX handling into helpers, see
             PATCH v2 3/9
  o allow unpriv programs to store-release pointers to stack
  o (Alexei) make it clearer in the interpreter code (PATCH v2 4/9) that
             only W and DW are supported for atomic RMW
  o test misaligned load_acq/store_rel
  o (Eduard) other selftests/ changes:
    * test load_acq/store_rel with !atomic_ptr_type_ok() pointers:
      - PTR_TO_CTX, for is_ctx_reg()
      - PTR_TO_PACKET, for is_pkt_reg()
      - PTR_TO_FLOW_KEYS, for is_flow_key_reg()
      - PTR_TO_SOCKET, for is_sk_reg()
    * drop atomics/ tests
    * delete unnecessary 'pid' checks from arena_atomics/ tests
    * avoid depending on __BPF_FEATURE_LOAD_ACQ_STORE_REL, use
      __imm_insn() and inline asm macros instead

RFC v1: https://lore.kernel.org/all/cover.1734742802.git.yepeilin@google.com
RFC v1..v1 notable changes:

  o 1-2/8: minor verifier.c refactoring patches
  o   3/8: core/verifier changes
         * (Eduard) handle load-acquire properly in backtrack_insn()
         * (Eduard) avoid skipping checks (e.g.,
                    bpf_jit_supports_insn()) for load-acquires
         * track the value stored by store-releases, just like how
           non-atomic STX instructions are handled
         * (Eduard) add missing link in commit message
         * (Eduard) always print 'r' for disasm.c changes
  o   4/8: arm64/insn: avoid treating load_acq/store_rel as
           load_ex/store_ex
  o   5/8: arm64/insn: add load_acq/store_rel
         * (Xu) include Should-Be-One (SBO) bits in "mask" and "value",
                to avoid setting fixed bits during runtime (JIT-compile
                time)
  o   6/8: arm64 JIT compiler changes
         * (Xu) use emit_a64_add_i() for "pointer + offset" to optimize
                code emission
  o   7/8: selftests
         * (Eduard) avoid adding new tests to the 'test_verifier' runner
         * add more tests, e.g., checking mark_precise logic
  o   8/8: instruction-set.rst changes

[1] https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/

Thanks,
====================

Link: https://patch.msgid.link/cover.1741049567.git.yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

View File

@@ -188,8 +188,10 @@ enum aarch64_insn_ldst_type {
AARCH64_INSN_LDST_STORE_PAIR_PRE_INDEX,
AARCH64_INSN_LDST_LOAD_PAIR_POST_INDEX,
AARCH64_INSN_LDST_STORE_PAIR_POST_INDEX,
AARCH64_INSN_LDST_LOAD_ACQ,
AARCH64_INSN_LDST_LOAD_EX,
AARCH64_INSN_LDST_LOAD_ACQ_EX,
AARCH64_INSN_LDST_STORE_REL,
AARCH64_INSN_LDST_STORE_EX,
AARCH64_INSN_LDST_STORE_REL_EX,
AARCH64_INSN_LDST_SIGNED_LOAD_IMM_OFFSET,
@@ -351,8 +353,10 @@ __AARCH64_INSN_FUNCS(ldr_imm, 0x3FC00000, 0x39400000)
__AARCH64_INSN_FUNCS(ldr_lit, 0xBF000000, 0x18000000)
__AARCH64_INSN_FUNCS(ldrsw_lit, 0xFF000000, 0x98000000)
__AARCH64_INSN_FUNCS(exclusive, 0x3F800000, 0x08000000)
__AARCH64_INSN_FUNCS(load_ex, 0x3F400000, 0x08400000)
__AARCH64_INSN_FUNCS(store_ex, 0x3F400000, 0x08000000)
__AARCH64_INSN_FUNCS(load_acq, 0x3FDFFC00, 0x08DFFC00)
__AARCH64_INSN_FUNCS(store_rel, 0x3FDFFC00, 0x089FFC00)
__AARCH64_INSN_FUNCS(load_ex, 0x3FC00000, 0x08400000)
__AARCH64_INSN_FUNCS(store_ex, 0x3FC00000, 0x08000000)
__AARCH64_INSN_FUNCS(mops, 0x3B200C00, 0x19000400)
__AARCH64_INSN_FUNCS(stp, 0x7FC00000, 0x29000000)
__AARCH64_INSN_FUNCS(ldp, 0x7FC00000, 0x29400000)
@@ -602,6 +606,10 @@ u32 aarch64_insn_gen_load_store_pair(enum aarch64_insn_register reg1,
int offset,
enum aarch64_insn_variant variant,
enum aarch64_insn_ldst_type type);
u32 aarch64_insn_gen_load_acq_store_rel(enum aarch64_insn_register reg,
enum aarch64_insn_register base,
enum aarch64_insn_size_type size,
enum aarch64_insn_ldst_type type);
u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
enum aarch64_insn_register base,
enum aarch64_insn_register state,

View File

@@ -540,6 +540,35 @@ u32 aarch64_insn_gen_load_store_pair(enum aarch64_insn_register reg1,
offset >> shift);
}
u32 aarch64_insn_gen_load_acq_store_rel(enum aarch64_insn_register reg,
enum aarch64_insn_register base,
enum aarch64_insn_size_type size,
enum aarch64_insn_ldst_type type)
{
u32 insn;
switch (type) {
case AARCH64_INSN_LDST_LOAD_ACQ:
insn = aarch64_insn_get_load_acq_value();
break;
case AARCH64_INSN_LDST_STORE_REL:
insn = aarch64_insn_get_store_rel_value();
break;
default:
pr_err("%s: unknown load-acquire/store-release encoding %d\n",
__func__, type);
return AARCH64_BREAK_FAULT;
}
insn = aarch64_insn_encode_ldst_size(size, insn);
insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RT, insn,
reg);
return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn,
base);
}
u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
enum aarch64_insn_register base,
enum aarch64_insn_register state,

View File

@@ -119,6 +119,26 @@
aarch64_insn_gen_load_store_ex(Rt, Rn, Rs, A64_SIZE(sf), \
AARCH64_INSN_LDST_STORE_REL_EX)
/* Load-acquire & store-release */
#define A64_LDAR(Rt, Rn, size) \
aarch64_insn_gen_load_acq_store_rel(Rt, Rn, AARCH64_INSN_SIZE_##size, \
AARCH64_INSN_LDST_LOAD_ACQ)
#define A64_STLR(Rt, Rn, size) \
aarch64_insn_gen_load_acq_store_rel(Rt, Rn, AARCH64_INSN_SIZE_##size, \
AARCH64_INSN_LDST_STORE_REL)
/* Rt = [Rn] (load acquire) */
#define A64_LDARB(Wt, Xn) A64_LDAR(Wt, Xn, 8)
#define A64_LDARH(Wt, Xn) A64_LDAR(Wt, Xn, 16)
#define A64_LDAR32(Wt, Xn) A64_LDAR(Wt, Xn, 32)
#define A64_LDAR64(Xt, Xn) A64_LDAR(Xt, Xn, 64)
/* [Rn] = Rt (store release) */
#define A64_STLRB(Wt, Xn) A64_STLR(Wt, Xn, 8)
#define A64_STLRH(Wt, Xn) A64_STLR(Wt, Xn, 16)
#define A64_STLR32(Wt, Xn) A64_STLR(Wt, Xn, 32)
#define A64_STLR64(Xt, Xn) A64_STLR(Xt, Xn, 64)
/*
* LSE atomics
*

View File

@@ -647,6 +647,81 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
return 0;
}
static int emit_atomic_ld_st(const struct bpf_insn *insn, struct jit_ctx *ctx)
{
const s32 imm = insn->imm;
const s16 off = insn->off;
const u8 code = insn->code;
const bool arena = BPF_MODE(code) == BPF_PROBE_ATOMIC;
const u8 arena_vm_base = bpf2a64[ARENA_VM_START];
const u8 dst = bpf2a64[insn->dst_reg];
const u8 src = bpf2a64[insn->src_reg];
const u8 tmp = bpf2a64[TMP_REG_1];
u8 reg;
switch (imm) {
case BPF_LOAD_ACQ:
reg = src;
break;
case BPF_STORE_REL:
reg = dst;
break;
default:
pr_err_once("unknown atomic load/store op code %02x\n", imm);
return -EINVAL;
}
if (off) {
emit_a64_add_i(1, tmp, reg, tmp, off, ctx);
reg = tmp;
}
if (arena) {
emit(A64_ADD(1, tmp, reg, arena_vm_base), ctx);
reg = tmp;
}
switch (imm) {
case BPF_LOAD_ACQ:
switch (BPF_SIZE(code)) {
case BPF_B:
emit(A64_LDARB(dst, reg), ctx);
break;
case BPF_H:
emit(A64_LDARH(dst, reg), ctx);
break;
case BPF_W:
emit(A64_LDAR32(dst, reg), ctx);
break;
case BPF_DW:
emit(A64_LDAR64(dst, reg), ctx);
break;
}
break;
case BPF_STORE_REL:
switch (BPF_SIZE(code)) {
case BPF_B:
emit(A64_STLRB(src, reg), ctx);
break;
case BPF_H:
emit(A64_STLRH(src, reg), ctx);
break;
case BPF_W:
emit(A64_STLR32(src, reg), ctx);
break;
case BPF_DW:
emit(A64_STLR64(src, reg), ctx);
break;
}
break;
default:
pr_err_once("unexpected atomic load/store op code %02x\n",
imm);
return -EINVAL;
}
return 0;
}
#ifdef CONFIG_ARM64_LSE_ATOMICS
static int emit_lse_atomic(const struct bpf_insn *insn, struct jit_ctx *ctx)
{
@@ -1641,11 +1716,17 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
return ret;
break;
case BPF_STX | BPF_ATOMIC | BPF_B:
case BPF_STX | BPF_ATOMIC | BPF_H:
case BPF_STX | BPF_ATOMIC | BPF_W:
case BPF_STX | BPF_ATOMIC | BPF_DW:
case BPF_STX | BPF_PROBE_ATOMIC | BPF_B:
case BPF_STX | BPF_PROBE_ATOMIC | BPF_H:
case BPF_STX | BPF_PROBE_ATOMIC | BPF_W:
case BPF_STX | BPF_PROBE_ATOMIC | BPF_DW:
if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
if (bpf_atomic_is_load_store(insn))
ret = emit_atomic_ld_st(insn, ctx);
else if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
ret = emit_lse_atomic(insn, ctx);
else
ret = emit_ll_sc_atomic(insn, ctx);
@@ -2669,7 +2750,8 @@ bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena)
switch (insn->code) {
case BPF_STX | BPF_ATOMIC | BPF_W:
case BPF_STX | BPF_ATOMIC | BPF_DW:
if (!cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
if (!bpf_atomic_is_load_store(insn) &&
!cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
return false;
}
return true;

View File

@@ -2919,10 +2919,16 @@ bool bpf_jit_supports_arena(void)
bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena)
{
/*
* Currently the verifier uses this function only to check which
* atomic stores to arena are supported, and they all are.
*/
if (!in_arena)
return true;
switch (insn->code) {
case BPF_STX | BPF_ATOMIC | BPF_B:
case BPF_STX | BPF_ATOMIC | BPF_H:
case BPF_STX | BPF_ATOMIC | BPF_W:
case BPF_STX | BPF_ATOMIC | BPF_DW:
if (bpf_atomic_is_load_store(insn))
return false;
}
return true;
}

View File

@@ -1242,8 +1242,8 @@ static void emit_st_r12(u8 **pprog, u32 size, u32 dst_reg, int off, int imm)
emit_st_index(pprog, size, dst_reg, X86_REG_R12, off, imm);
}
static int emit_atomic(u8 **pprog, u8 atomic_op,
u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size)
static int emit_atomic_rmw(u8 **pprog, u32 atomic_op,
u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size)
{
u8 *prog = *pprog;
@@ -1283,8 +1283,9 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
return 0;
}
static int emit_atomic_index(u8 **pprog, u8 atomic_op, u32 size,
u32 dst_reg, u32 src_reg, u32 index_reg, int off)
static int emit_atomic_rmw_index(u8 **pprog, u32 atomic_op, u32 size,
u32 dst_reg, u32 src_reg, u32 index_reg,
int off)
{
u8 *prog = *pprog;
@@ -1297,7 +1298,7 @@ static int emit_atomic_index(u8 **pprog, u8 atomic_op, u32 size,
EMIT1(add_3mod(0x48, dst_reg, src_reg, index_reg));
break;
default:
pr_err("bpf_jit: 1 and 2 byte atomics are not supported\n");
pr_err("bpf_jit: 1- and 2-byte RMW atomics are not supported\n");
return -EFAULT;
}
@@ -1331,6 +1332,49 @@ static int emit_atomic_index(u8 **pprog, u8 atomic_op, u32 size,
return 0;
}
static int emit_atomic_ld_st(u8 **pprog, u32 atomic_op, u32 dst_reg,
u32 src_reg, s16 off, u8 bpf_size)
{
switch (atomic_op) {
case BPF_LOAD_ACQ:
/* dst_reg = smp_load_acquire(src_reg + off16) */
emit_ldx(pprog, bpf_size, dst_reg, src_reg, off);
break;
case BPF_STORE_REL:
/* smp_store_release(dst_reg + off16, src_reg) */
emit_stx(pprog, bpf_size, dst_reg, src_reg, off);
break;
default:
pr_err("bpf_jit: unknown atomic load/store opcode %02x\n",
atomic_op);
return -EFAULT;
}
return 0;
}
static int emit_atomic_ld_st_index(u8 **pprog, u32 atomic_op, u32 size,
u32 dst_reg, u32 src_reg, u32 index_reg,
int off)
{
switch (atomic_op) {
case BPF_LOAD_ACQ:
/* dst_reg = smp_load_acquire(src_reg + idx_reg + off16) */
emit_ldx_index(pprog, size, dst_reg, src_reg, index_reg, off);
break;
case BPF_STORE_REL:
/* smp_store_release(dst_reg + idx_reg + off16, src_reg) */
emit_stx_index(pprog, size, dst_reg, src_reg, index_reg, off);
break;
default:
pr_err("bpf_jit: unknown atomic load/store opcode %02x\n",
atomic_op);
return -EFAULT;
}
return 0;
}
#define DONT_CLEAR 1
bool ex_handler_bpf(const struct exception_table_entry *x, struct pt_regs *regs)
@@ -2113,6 +2157,13 @@ st: if (is_imm8(insn->off))
}
break;
case BPF_STX | BPF_ATOMIC | BPF_B:
case BPF_STX | BPF_ATOMIC | BPF_H:
if (!bpf_atomic_is_load_store(insn)) {
pr_err("bpf_jit: 1- and 2-byte RMW atomics are not supported\n");
return -EFAULT;
}
fallthrough;
case BPF_STX | BPF_ATOMIC | BPF_W:
case BPF_STX | BPF_ATOMIC | BPF_DW:
if (insn->imm == (BPF_AND | BPF_FETCH) ||
@@ -2148,10 +2199,10 @@ st: if (is_imm8(insn->off))
EMIT2(simple_alu_opcodes[BPF_OP(insn->imm)],
add_2reg(0xC0, AUX_REG, real_src_reg));
/* Attempt to swap in new value */
err = emit_atomic(&prog, BPF_CMPXCHG,
real_dst_reg, AUX_REG,
insn->off,
BPF_SIZE(insn->code));
err = emit_atomic_rmw(&prog, BPF_CMPXCHG,
real_dst_reg, AUX_REG,
insn->off,
BPF_SIZE(insn->code));
if (WARN_ON(err))
return err;
/*
@@ -2166,17 +2217,35 @@ st: if (is_imm8(insn->off))
break;
}
err = emit_atomic(&prog, insn->imm, dst_reg, src_reg,
insn->off, BPF_SIZE(insn->code));
if (bpf_atomic_is_load_store(insn))
err = emit_atomic_ld_st(&prog, insn->imm, dst_reg, src_reg,
insn->off, BPF_SIZE(insn->code));
else
err = emit_atomic_rmw(&prog, insn->imm, dst_reg, src_reg,
insn->off, BPF_SIZE(insn->code));
if (err)
return err;
break;
case BPF_STX | BPF_PROBE_ATOMIC | BPF_B:
case BPF_STX | BPF_PROBE_ATOMIC | BPF_H:
if (!bpf_atomic_is_load_store(insn)) {
pr_err("bpf_jit: 1- and 2-byte RMW atomics are not supported\n");
return -EFAULT;
}
fallthrough;
case BPF_STX | BPF_PROBE_ATOMIC | BPF_W:
case BPF_STX | BPF_PROBE_ATOMIC | BPF_DW:
start_of_ldx = prog;
err = emit_atomic_index(&prog, insn->imm, BPF_SIZE(insn->code),
dst_reg, src_reg, X86_REG_R12, insn->off);
if (bpf_atomic_is_load_store(insn))
err = emit_atomic_ld_st_index(&prog, insn->imm,
BPF_SIZE(insn->code), dst_reg,
src_reg, X86_REG_R12, insn->off);
else
err = emit_atomic_rmw_index(&prog, insn->imm, BPF_SIZE(insn->code),
dst_reg, src_reg, X86_REG_R12,
insn->off);
if (err)
return err;
goto populate_extable;

View File

@@ -991,6 +991,21 @@ static inline bool bpf_pseudo_func(const struct bpf_insn *insn)
return bpf_is_ldimm64(insn) && insn->src_reg == BPF_PSEUDO_FUNC;
}
/* Given a BPF_ATOMIC instruction @atomic_insn, return true if it is an
* atomic load or store, and false if it is a read-modify-write instruction.
*/
static inline bool
bpf_atomic_is_load_store(const struct bpf_insn *atomic_insn)
{
switch (atomic_insn->imm) {
case BPF_LOAD_ACQ:
case BPF_STORE_REL:
return true;
default:
return false;
}
}
struct bpf_prog_ops {
int (*test_run)(struct bpf_prog *prog, const union bpf_attr *kattr,
union bpf_attr __user *uattr);

View File

@@ -364,6 +364,8 @@ static inline bool insn_is_cast_user(const struct bpf_insn *insn)
* BPF_XOR | BPF_FETCH src_reg = atomic_fetch_xor(dst_reg + off16, src_reg);
* BPF_XCHG src_reg = atomic_xchg(dst_reg + off16, src_reg)
* BPF_CMPXCHG r0 = atomic_cmpxchg(dst_reg + off16, r0, src_reg)
* BPF_LOAD_ACQ dst_reg = smp_load_acquire(src_reg + off16)
* BPF_STORE_REL smp_store_release(dst_reg + off16, src_reg)
*/
#define BPF_ATOMIC_OP(SIZE, OP, DST, SRC, OFF) \

View File

@@ -51,6 +51,9 @@
#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */
#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */
#define BPF_LOAD_ACQ 0x100 /* load-acquire */
#define BPF_STORE_REL 0x110 /* store-release */
enum bpf_cond_pseudo_jmp {
BPF_MAY_GOTO = 0,
};

View File

@@ -1663,14 +1663,17 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
INSN_3(JMP, JSET, K), \
INSN_2(JMP, JA), \
INSN_2(JMP32, JA), \
/* Atomic operations. */ \
INSN_3(STX, ATOMIC, B), \
INSN_3(STX, ATOMIC, H), \
INSN_3(STX, ATOMIC, W), \
INSN_3(STX, ATOMIC, DW), \
/* Store instructions. */ \
/* Register based. */ \
INSN_3(STX, MEM, B), \
INSN_3(STX, MEM, H), \
INSN_3(STX, MEM, W), \
INSN_3(STX, MEM, DW), \
INSN_3(STX, ATOMIC, W), \
INSN_3(STX, ATOMIC, DW), \
/* Immediate based. */ \
INSN_3(ST, MEM, B), \
INSN_3(ST, MEM, H), \
@@ -2152,24 +2155,33 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
if (BPF_SIZE(insn->code) == BPF_W) \
atomic_##KOP((u32) SRC, (atomic_t *)(unsigned long) \
(DST + insn->off)); \
else \
else if (BPF_SIZE(insn->code) == BPF_DW) \
atomic64_##KOP((u64) SRC, (atomic64_t *)(unsigned long) \
(DST + insn->off)); \
else \
goto default_label; \
break; \
case BOP | BPF_FETCH: \
if (BPF_SIZE(insn->code) == BPF_W) \
SRC = (u32) atomic_fetch_##KOP( \
(u32) SRC, \
(atomic_t *)(unsigned long) (DST + insn->off)); \
else \
else if (BPF_SIZE(insn->code) == BPF_DW) \
SRC = (u64) atomic64_fetch_##KOP( \
(u64) SRC, \
(atomic64_t *)(unsigned long) (DST + insn->off)); \
else \
goto default_label; \
break;
STX_ATOMIC_DW:
STX_ATOMIC_W:
STX_ATOMIC_H:
STX_ATOMIC_B:
switch (IMM) {
/* Atomic read-modify-write instructions support only W and DW
* size modifiers.
*/
ATOMIC_ALU_OP(BPF_ADD, add)
ATOMIC_ALU_OP(BPF_AND, and)
ATOMIC_ALU_OP(BPF_OR, or)
@@ -2181,20 +2193,63 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
SRC = (u32) atomic_xchg(
(atomic_t *)(unsigned long) (DST + insn->off),
(u32) SRC);
else
else if (BPF_SIZE(insn->code) == BPF_DW)
SRC = (u64) atomic64_xchg(
(atomic64_t *)(unsigned long) (DST + insn->off),
(u64) SRC);
else
goto default_label;
break;
case BPF_CMPXCHG:
if (BPF_SIZE(insn->code) == BPF_W)
BPF_R0 = (u32) atomic_cmpxchg(
(atomic_t *)(unsigned long) (DST + insn->off),
(u32) BPF_R0, (u32) SRC);
else
else if (BPF_SIZE(insn->code) == BPF_DW)
BPF_R0 = (u64) atomic64_cmpxchg(
(atomic64_t *)(unsigned long) (DST + insn->off),
(u64) BPF_R0, (u64) SRC);
else
goto default_label;
break;
/* Atomic load and store instructions support all size
* modifiers.
*/
case BPF_LOAD_ACQ:
switch (BPF_SIZE(insn->code)) {
#define LOAD_ACQUIRE(SIZEOP, SIZE) \
case BPF_##SIZEOP: \
DST = (SIZE)smp_load_acquire( \
(SIZE *)(unsigned long)(SRC + insn->off)); \
break;
LOAD_ACQUIRE(B, u8)
LOAD_ACQUIRE(H, u16)
LOAD_ACQUIRE(W, u32)
#ifdef CONFIG_64BIT
LOAD_ACQUIRE(DW, u64)
#endif
#undef LOAD_ACQUIRE
default:
goto default_label;
}
break;
case BPF_STORE_REL:
switch (BPF_SIZE(insn->code)) {
#define STORE_RELEASE(SIZEOP, SIZE) \
case BPF_##SIZEOP: \
smp_store_release( \
(SIZE *)(unsigned long)(DST + insn->off), (SIZE)SRC); \
break;
STORE_RELEASE(B, u8)
STORE_RELEASE(H, u16)
STORE_RELEASE(W, u32)
#ifdef CONFIG_64BIT
STORE_RELEASE(DW, u64)
#endif
#undef STORE_RELEASE
default:
goto default_label;
}
break;
default:

View File

@@ -267,6 +267,18 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg, insn->off, insn->src_reg);
} else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
insn->imm == BPF_LOAD_ACQ) {
verbose(cbs->private_data, "(%02x) r%d = load_acquire((%s *)(r%d %+d))\n",
insn->code, insn->dst_reg,
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->src_reg, insn->off);
} else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
insn->imm == BPF_STORE_REL) {
verbose(cbs->private_data, "(%02x) store_release((%s *)(r%d %+d), r%d)\n",
insn->code,
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
insn->dst_reg, insn->off, insn->src_reg);
} else {
verbose(cbs->private_data, "BUG_%02x\n", insn->code);
}

View File

@@ -579,6 +579,13 @@ static bool is_cmpxchg_insn(const struct bpf_insn *insn)
insn->imm == BPF_CMPXCHG;
}
static bool is_atomic_load_insn(const struct bpf_insn *insn)
{
return BPF_CLASS(insn->code) == BPF_STX &&
BPF_MODE(insn->code) == BPF_ATOMIC &&
insn->imm == BPF_LOAD_ACQ;
}
static int __get_spi(s32 off)
{
return (-off - 1) / BPF_REG_SIZE;
@@ -3567,7 +3574,7 @@ static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
if (class == BPF_STX) {
/* BPF_STX (including atomic variants) has multiple source
/* BPF_STX (including atomic variants) has one or more source
* operands, one of which is a ptr. Check whether the caller is
* asking about it.
*/
@@ -4181,7 +4188,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
* dreg still needs precision before this insn
*/
}
} else if (class == BPF_LDX) {
} else if (class == BPF_LDX || is_atomic_load_insn(insn)) {
if (!bt_is_reg_set(bt, dreg))
return 0;
bt_clear_reg(bt, dreg);
@@ -7766,6 +7773,32 @@ static int check_atomic_rmw(struct bpf_verifier_env *env,
return 0;
}
static int check_atomic_load(struct bpf_verifier_env *env,
struct bpf_insn *insn)
{
if (!atomic_ptr_type_ok(env, insn->src_reg, insn)) {
verbose(env, "BPF_ATOMIC loads from R%d %s is not allowed\n",
insn->src_reg,
reg_type_str(env, reg_state(env, insn->src_reg)->type));
return -EACCES;
}
return check_load_mem(env, insn, true, false, false, "atomic_load");
}
static int check_atomic_store(struct bpf_verifier_env *env,
struct bpf_insn *insn)
{
if (!atomic_ptr_type_ok(env, insn->dst_reg, insn)) {
verbose(env, "BPF_ATOMIC stores into R%d %s is not allowed\n",
insn->dst_reg,
reg_type_str(env, reg_state(env, insn->dst_reg)->type));
return -EACCES;
}
return check_store_reg(env, insn, true);
}
static int check_atomic(struct bpf_verifier_env *env, struct bpf_insn *insn)
{
switch (insn->imm) {
@@ -7780,6 +7813,20 @@ static int check_atomic(struct bpf_verifier_env *env, struct bpf_insn *insn)
case BPF_XCHG:
case BPF_CMPXCHG:
return check_atomic_rmw(env, insn);
case BPF_LOAD_ACQ:
if (BPF_SIZE(insn->code) == BPF_DW && BITS_PER_LONG != 64) {
verbose(env,
"64-bit load-acquires are only supported on 64-bit arches\n");
return -EOPNOTSUPP;
}
return check_atomic_load(env, insn);
case BPF_STORE_REL:
if (BPF_SIZE(insn->code) == BPF_DW && BITS_PER_LONG != 64) {
verbose(env,
"64-bit store-releases are only supported on 64-bit arches\n");
return -EOPNOTSUPP;
}
return check_atomic_store(env, insn);
default:
verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n",
insn->imm);
@@ -20605,7 +20652,9 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
insn->code == (BPF_ST | BPF_MEM | BPF_W) ||
insn->code == (BPF_ST | BPF_MEM | BPF_DW)) {
type = BPF_WRITE;
} else if ((insn->code == (BPF_STX | BPF_ATOMIC | BPF_W) ||
} else if ((insn->code == (BPF_STX | BPF_ATOMIC | BPF_B) ||
insn->code == (BPF_STX | BPF_ATOMIC | BPF_H) ||
insn->code == (BPF_STX | BPF_ATOMIC | BPF_W) ||
insn->code == (BPF_STX | BPF_ATOMIC | BPF_DW)) &&
env->insn_aux_data[i + delta].ptr_type == PTR_TO_ARENA) {
insn->code = BPF_STX | BPF_PROBE_ATOMIC | BPF_SIZE(insn->code);

View File

@@ -51,6 +51,9 @@
#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */
#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */
#define BPF_LOAD_ACQ 0x100 /* load-acquire */
#define BPF_STORE_REL 0x110 /* store-release */
enum bpf_cond_pseudo_jmp {
BPF_MAY_GOTO = 0,
};

View File

@@ -162,6 +162,66 @@ static void test_uaf(struct arena_atomics *skel)
ASSERT_EQ(skel->arena->uaf_recovery_fails, 0, "uaf_recovery_fails");
}
static void test_load_acquire(struct arena_atomics *skel)
{
LIBBPF_OPTS(bpf_test_run_opts, topts);
int err, prog_fd;
if (skel->data->skip_lacq_srel_tests) {
printf("%s:SKIP: ENABLE_ATOMICS_TESTS not defined, Clang doesn't support addr_space_cast, and/or JIT doesn't support load-acquire\n",
__func__);
test__skip();
return;
}
/* No need to attach it, just run it directly */
prog_fd = bpf_program__fd(skel->progs.load_acquire);
err = bpf_prog_test_run_opts(prog_fd, &topts);
if (!ASSERT_OK(err, "test_run_opts err"))
return;
if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
return;
ASSERT_EQ(skel->arena->load_acquire8_result, 0x12,
"load_acquire8_result");
ASSERT_EQ(skel->arena->load_acquire16_result, 0x1234,
"load_acquire16_result");
ASSERT_EQ(skel->arena->load_acquire32_result, 0x12345678,
"load_acquire32_result");
ASSERT_EQ(skel->arena->load_acquire64_result, 0x1234567890abcdef,
"load_acquire64_result");
}
static void test_store_release(struct arena_atomics *skel)
{
LIBBPF_OPTS(bpf_test_run_opts, topts);
int err, prog_fd;
if (skel->data->skip_lacq_srel_tests) {
printf("%s:SKIP: ENABLE_ATOMICS_TESTS not defined, Clang doesn't support addr_space_cast, and/or JIT doesn't support store-release\n",
__func__);
test__skip();
return;
}
/* No need to attach it, just run it directly */
prog_fd = bpf_program__fd(skel->progs.store_release);
err = bpf_prog_test_run_opts(prog_fd, &topts);
if (!ASSERT_OK(err, "test_run_opts err"))
return;
if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
return;
ASSERT_EQ(skel->arena->store_release8_result, 0x12,
"store_release8_result");
ASSERT_EQ(skel->arena->store_release16_result, 0x1234,
"store_release16_result");
ASSERT_EQ(skel->arena->store_release32_result, 0x12345678,
"store_release32_result");
ASSERT_EQ(skel->arena->store_release64_result, 0x1234567890abcdef,
"store_release64_result");
}
void test_arena_atomics(void)
{
struct arena_atomics *skel;
@@ -171,7 +231,7 @@ void test_arena_atomics(void)
if (!ASSERT_OK_PTR(skel, "arena atomics skeleton open"))
return;
if (skel->data->skip_tests) {
if (skel->data->skip_all_tests) {
printf("%s:SKIP:no ENABLE_ATOMICS_TESTS or no addr_space_cast support in clang",
__func__);
test__skip();
@@ -198,6 +258,10 @@ void test_arena_atomics(void)
test_xchg(skel);
if (test__start_subtest("uaf"))
test_uaf(skel);
if (test__start_subtest("load_acquire"))
test_load_acquire(skel);
if (test__start_subtest("store_release"))
test_store_release(skel);
cleanup:
arena_atomics__destroy(skel);

View File

@@ -45,6 +45,7 @@
#include "verifier_ldsx.skel.h"
#include "verifier_leak_ptr.skel.h"
#include "verifier_linked_scalars.skel.h"
#include "verifier_load_acquire.skel.h"
#include "verifier_loops1.skel.h"
#include "verifier_lwt.skel.h"
#include "verifier_map_in_map.skel.h"
@@ -80,6 +81,7 @@
#include "verifier_spill_fill.skel.h"
#include "verifier_spin_lock.skel.h"
#include "verifier_stack_ptr.skel.h"
#include "verifier_store_release.skel.h"
#include "verifier_subprog_precision.skel.h"
#include "verifier_subreg.skel.h"
#include "verifier_tailcall_jit.skel.h"
@@ -173,6 +175,7 @@ void test_verifier_int_ptr(void) { RUN(verifier_int_ptr); }
void test_verifier_iterating_callbacks(void) { RUN(verifier_iterating_callbacks); }
void test_verifier_jeq_infer_not_null(void) { RUN(verifier_jeq_infer_not_null); }
void test_verifier_jit_convergence(void) { RUN(verifier_jit_convergence); }
void test_verifier_load_acquire(void) { RUN(verifier_load_acquire); }
void test_verifier_ld_ind(void) { RUN(verifier_ld_ind); }
void test_verifier_ldsx(void) { RUN(verifier_ldsx); }
void test_verifier_leak_ptr(void) { RUN(verifier_leak_ptr); }
@@ -211,6 +214,7 @@ void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); }
void test_verifier_store_release(void) { RUN(verifier_store_release); }
void test_verifier_subprog_precision(void) { RUN(verifier_subprog_precision); }
void test_verifier_subreg(void) { RUN(verifier_subreg); }
void test_verifier_tailcall_jit(void) { RUN(verifier_tailcall_jit); }

View File

@@ -6,6 +6,8 @@
#include <stdbool.h>
#include <stdatomic.h>
#include "bpf_arena_common.h"
#include "../../../include/linux/filter.h"
#include "bpf_misc.h"
struct {
__uint(type, BPF_MAP_TYPE_ARENA);
@@ -19,9 +21,17 @@ struct {
} arena SEC(".maps");
#if defined(ENABLE_ATOMICS_TESTS) && defined(__BPF_FEATURE_ADDR_SPACE_CAST)
bool skip_tests __attribute((__section__(".data"))) = false;
bool skip_all_tests __attribute((__section__(".data"))) = false;
#else
bool skip_tests = true;
bool skip_all_tests = true;
#endif
#if defined(ENABLE_ATOMICS_TESTS) && \
defined(__BPF_FEATURE_ADDR_SPACE_CAST) && \
(defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86))
bool skip_lacq_srel_tests __attribute((__section__(".data"))) = false;
#else
bool skip_lacq_srel_tests = true;
#endif
__u32 pid = 0;
@@ -274,4 +284,111 @@ int uaf(const void *ctx)
return 0;
}
#if __clang_major__ >= 18
__u8 __arena_global load_acquire8_value = 0x12;
__u16 __arena_global load_acquire16_value = 0x1234;
__u32 __arena_global load_acquire32_value = 0x12345678;
__u64 __arena_global load_acquire64_value = 0x1234567890abcdef;
__u8 __arena_global load_acquire8_result = 0;
__u16 __arena_global load_acquire16_result = 0;
__u32 __arena_global load_acquire32_result = 0;
__u64 __arena_global load_acquire64_result = 0;
#else
/* clang-17 crashes if the .addr_space.1 ELF section has holes. Work around
* this issue by defining the below variables as 64-bit.
*/
__u64 __arena_global load_acquire8_value;
__u64 __arena_global load_acquire16_value;
__u64 __arena_global load_acquire32_value;
__u64 __arena_global load_acquire64_value;
__u64 __arena_global load_acquire8_result;
__u64 __arena_global load_acquire16_result;
__u64 __arena_global load_acquire32_result;
__u64 __arena_global load_acquire64_result;
#endif
SEC("raw_tp/sys_enter")
int load_acquire(const void *ctx)
{
#if defined(ENABLE_ATOMICS_TESTS) && \
defined(__BPF_FEATURE_ADDR_SPACE_CAST) && \
(defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86))
#define LOAD_ACQUIRE_ARENA(SIZEOP, SIZE, SRC, DST) \
{ asm volatile ( \
"r1 = %[" #SRC "] ll;" \
"r1 = addr_space_cast(r1, 0x0, 0x1);" \
".8byte %[load_acquire_insn];" \
"r3 = %[" #DST "] ll;" \
"r3 = addr_space_cast(r3, 0x0, 0x1);" \
"*(" #SIZE " *)(r3 + 0) = r2;" \
: \
: __imm_addr(SRC), \
__imm_insn(load_acquire_insn, \
BPF_ATOMIC_OP(BPF_##SIZEOP, BPF_LOAD_ACQ, \
BPF_REG_2, BPF_REG_1, 0)), \
__imm_addr(DST) \
: __clobber_all); } \
LOAD_ACQUIRE_ARENA(B, u8, load_acquire8_value, load_acquire8_result)
LOAD_ACQUIRE_ARENA(H, u16, load_acquire16_value,
load_acquire16_result)
LOAD_ACQUIRE_ARENA(W, u32, load_acquire32_value,
load_acquire32_result)
LOAD_ACQUIRE_ARENA(DW, u64, load_acquire64_value,
load_acquire64_result)
#undef LOAD_ACQUIRE_ARENA
#endif
return 0;
}
#if __clang_major__ >= 18
__u8 __arena_global store_release8_result = 0;
__u16 __arena_global store_release16_result = 0;
__u32 __arena_global store_release32_result = 0;
__u64 __arena_global store_release64_result = 0;
#else
/* clang-17 crashes if the .addr_space.1 ELF section has holes. Work around
* this issue by defining the below variables as 64-bit.
*/
__u64 __arena_global store_release8_result;
__u64 __arena_global store_release16_result;
__u64 __arena_global store_release32_result;
__u64 __arena_global store_release64_result;
#endif
SEC("raw_tp/sys_enter")
int store_release(const void *ctx)
{
#if defined(ENABLE_ATOMICS_TESTS) && \
defined(__BPF_FEATURE_ADDR_SPACE_CAST) && \
(defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86))
#define STORE_RELEASE_ARENA(SIZEOP, DST, VAL) \
{ asm volatile ( \
"r1 = " VAL ";" \
"r2 = %[" #DST "] ll;" \
"r2 = addr_space_cast(r2, 0x0, 0x1);" \
".8byte %[store_release_insn];" \
: \
: __imm_addr(DST), \
__imm_insn(store_release_insn, \
BPF_ATOMIC_OP(BPF_##SIZEOP, BPF_STORE_REL, \
BPF_REG_2, BPF_REG_1, 0)) \
: __clobber_all); } \
STORE_RELEASE_ARENA(B, store_release8_result, "0x12")
STORE_RELEASE_ARENA(H, store_release16_result, "0x1234")
STORE_RELEASE_ARENA(W, store_release32_result, "0x12345678")
STORE_RELEASE_ARENA(DW, store_release64_result,
"0x1234567890abcdef ll")
#undef STORE_RELEASE_ARENA
#endif
return 0;
}
char _license[] SEC("license") = "GPL";

View File

@@ -0,0 +1,197 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2025 Google LLC. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "../../../include/linux/filter.h"
#include "bpf_misc.h"
#if __clang_major__ >= 18 && defined(ENABLE_ATOMICS_TESTS) && \
(defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86))
SEC("socket")
__description("load-acquire, 8-bit")
__success __success_unpriv __retval(0x12)
__naked void load_acquire_8(void)
{
asm volatile (
"w1 = 0x12;"
"*(u8 *)(r10 - 1) = w1;"
".8byte %[load_acquire_insn];" // w0 = load_acquire((u8 *)(r10 - 1));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_B, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_10, -1))
: __clobber_all);
}
SEC("socket")
__description("load-acquire, 16-bit")
__success __success_unpriv __retval(0x1234)
__naked void load_acquire_16(void)
{
asm volatile (
"w1 = 0x1234;"
"*(u16 *)(r10 - 2) = w1;"
".8byte %[load_acquire_insn];" // w0 = load_acquire((u16 *)(r10 - 2));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_H, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_10, -2))
: __clobber_all);
}
SEC("socket")
__description("load-acquire, 32-bit")
__success __success_unpriv __retval(0x12345678)
__naked void load_acquire_32(void)
{
asm volatile (
"w1 = 0x12345678;"
"*(u32 *)(r10 - 4) = w1;"
".8byte %[load_acquire_insn];" // w0 = load_acquire((u32 *)(r10 - 4));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_W, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_10, -4))
: __clobber_all);
}
SEC("socket")
__description("load-acquire, 64-bit")
__success __success_unpriv __retval(0x1234567890abcdef)
__naked void load_acquire_64(void)
{
asm volatile (
"r1 = 0x1234567890abcdef ll;"
"*(u64 *)(r10 - 8) = r1;"
".8byte %[load_acquire_insn];" // r0 = load_acquire((u64 *)(r10 - 8));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_10, -8))
: __clobber_all);
}
SEC("socket")
__description("load-acquire with uninitialized src_reg")
__failure __failure_unpriv __msg("R2 !read_ok")
__naked void load_acquire_with_uninitialized_src_reg(void)
{
asm volatile (
".8byte %[load_acquire_insn];" // r0 = load_acquire((u64 *)(r2 + 0));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_2, 0))
: __clobber_all);
}
SEC("socket")
__description("load-acquire with non-pointer src_reg")
__failure __failure_unpriv __msg("R1 invalid mem access 'scalar'")
__naked void load_acquire_with_non_pointer_src_reg(void)
{
asm volatile (
"r1 = 0;"
".8byte %[load_acquire_insn];" // r0 = load_acquire((u64 *)(r1 + 0));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_1, 0))
: __clobber_all);
}
SEC("socket")
__description("misaligned load-acquire")
__failure __failure_unpriv __msg("misaligned stack access off")
__flag(BPF_F_ANY_ALIGNMENT)
__naked void load_acquire_misaligned(void)
{
asm volatile (
"r1 = 0;"
"*(u64 *)(r10 - 8) = r1;"
".8byte %[load_acquire_insn];" // w0 = load_acquire((u32 *)(r10 - 5));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_W, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_10, -5))
: __clobber_all);
}
SEC("socket")
__description("load-acquire from ctx pointer")
__failure __failure_unpriv __msg("BPF_ATOMIC loads from R1 ctx is not allowed")
__naked void load_acquire_from_ctx_pointer(void)
{
asm volatile (
".8byte %[load_acquire_insn];" // w0 = load_acquire((u8 *)(r1 + 0));
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_B, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_1, 0))
: __clobber_all);
}
SEC("xdp")
__description("load-acquire from pkt pointer")
__failure __msg("BPF_ATOMIC loads from R2 pkt is not allowed")
__naked void load_acquire_from_pkt_pointer(void)
{
asm volatile (
"r2 = *(u32 *)(r1 + %[xdp_md_data]);"
".8byte %[load_acquire_insn];" // w0 = load_acquire((u8 *)(r2 + 0));
"exit;"
:
: __imm_const(xdp_md_data, offsetof(struct xdp_md, data)),
__imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_B, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_2, 0))
: __clobber_all);
}
SEC("flow_dissector")
__description("load-acquire from flow_keys pointer")
__failure __msg("BPF_ATOMIC loads from R2 flow_keys is not allowed")
__naked void load_acquire_from_flow_keys_pointer(void)
{
asm volatile (
"r2 = *(u64 *)(r1 + %[__sk_buff_flow_keys]);"
".8byte %[load_acquire_insn];" // w0 = load_acquire((u8 *)(r2 + 0));
"exit;"
:
: __imm_const(__sk_buff_flow_keys,
offsetof(struct __sk_buff, flow_keys)),
__imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_B, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_2, 0))
: __clobber_all);
}
SEC("sk_reuseport")
__description("load-acquire from sock pointer")
__failure __msg("BPF_ATOMIC loads from R2 sock is not allowed")
__naked void load_acquire_from_sock_pointer(void)
{
asm volatile (
"r2 = *(u64 *)(r1 + %[sk_reuseport_md_sk]);"
".8byte %[load_acquire_insn];" // w0 = load_acquire((u8 *)(r2 + 0));
"exit;"
:
: __imm_const(sk_reuseport_md_sk, offsetof(struct sk_reuseport_md, sk)),
__imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_B, BPF_LOAD_ACQ, BPF_REG_0, BPF_REG_2, 0))
: __clobber_all);
}
#else
SEC("socket")
__description("Clang version < 18, ENABLE_ATOMICS_TESTS not defined, and/or JIT doesn't support load-acquire, use a dummy test")
__success
int dummy_test(void)
{
return 0;
}
#endif
char _license[] SEC("license") = "GPL";

View File

@@ -2,6 +2,7 @@
/* Copyright (C) 2023 SUSE LLC */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "../../../include/linux/filter.h"
#include "bpf_misc.h"
SEC("?raw_tp")
@@ -90,6 +91,54 @@ __naked int bpf_end_bswap(void)
::: __clobber_all);
}
#if defined(ENABLE_ATOMICS_TESTS) && \
(defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86))
SEC("?raw_tp")
__success __log_level(2)
__msg("mark_precise: frame0: regs=r2 stack= before 3: (bf) r3 = r10")
__msg("mark_precise: frame0: regs=r2 stack= before 2: (db) r2 = load_acquire((u64 *)(r10 -8))")
__msg("mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1")
__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8")
__naked int bpf_load_acquire(void)
{
asm volatile (
"r1 = 8;"
"*(u64 *)(r10 - 8) = r1;"
".8byte %[load_acquire_insn];" /* r2 = load_acquire((u64 *)(r10 - 8)); */
"r3 = r10;"
"r3 += r2;" /* mark_precise */
"r0 = 0;"
"exit;"
:
: __imm_insn(load_acquire_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_LOAD_ACQ, BPF_REG_2, BPF_REG_10, -8))
: __clobber_all);
}
SEC("?raw_tp")
__success __log_level(2)
__msg("mark_precise: frame0: regs=r1 stack= before 3: (bf) r2 = r10")
__msg("mark_precise: frame0: regs=r1 stack= before 2: (79) r1 = *(u64 *)(r10 -8)")
__msg("mark_precise: frame0: regs= stack=-8 before 1: (db) store_release((u64 *)(r10 -8), r1)")
__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8")
__naked int bpf_store_release(void)
{
asm volatile (
"r1 = 8;"
".8byte %[store_release_insn];" /* store_release((u64 *)(r10 - 8), r1); */
"r1 = *(u64 *)(r10 - 8);"
"r2 = r10;"
"r2 += r1;" /* mark_precise */
"r0 = 0;"
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_STORE_REL, BPF_REG_10, BPF_REG_1, -8))
: __clobber_all);
}
#endif /* load-acquire, store-release */
#endif /* v4 instruction */
SEC("?raw_tp")

View File

@@ -0,0 +1,264 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2025 Google LLC. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "../../../include/linux/filter.h"
#include "bpf_misc.h"
#if __clang_major__ >= 18 && defined(ENABLE_ATOMICS_TESTS) && \
(defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86))
SEC("socket")
__description("store-release, 8-bit")
__success __success_unpriv __retval(0x12)
__naked void store_release_8(void)
{
asm volatile (
"w1 = 0x12;"
".8byte %[store_release_insn];" // store_release((u8 *)(r10 - 1), w1);
"w0 = *(u8 *)(r10 - 1);"
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_B, BPF_STORE_REL, BPF_REG_10, BPF_REG_1, -1))
: __clobber_all);
}
SEC("socket")
__description("store-release, 16-bit")
__success __success_unpriv __retval(0x1234)
__naked void store_release_16(void)
{
asm volatile (
"w1 = 0x1234;"
".8byte %[store_release_insn];" // store_release((u16 *)(r10 - 2), w1);
"w0 = *(u16 *)(r10 - 2);"
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_H, BPF_STORE_REL, BPF_REG_10, BPF_REG_1, -2))
: __clobber_all);
}
SEC("socket")
__description("store-release, 32-bit")
__success __success_unpriv __retval(0x12345678)
__naked void store_release_32(void)
{
asm volatile (
"w1 = 0x12345678;"
".8byte %[store_release_insn];" // store_release((u32 *)(r10 - 4), w1);
"w0 = *(u32 *)(r10 - 4);"
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_W, BPF_STORE_REL, BPF_REG_10, BPF_REG_1, -4))
: __clobber_all);
}
SEC("socket")
__description("store-release, 64-bit")
__success __success_unpriv __retval(0x1234567890abcdef)
__naked void store_release_64(void)
{
asm volatile (
"r1 = 0x1234567890abcdef ll;"
".8byte %[store_release_insn];" // store_release((u64 *)(r10 - 8), r1);
"r0 = *(u64 *)(r10 - 8);"
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_STORE_REL, BPF_REG_10, BPF_REG_1, -8))
: __clobber_all);
}
SEC("socket")
__description("store-release with uninitialized src_reg")
__failure __failure_unpriv __msg("R2 !read_ok")
__naked void store_release_with_uninitialized_src_reg(void)
{
asm volatile (
".8byte %[store_release_insn];" // store_release((u64 *)(r10 - 8), r2);
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_STORE_REL, BPF_REG_10, BPF_REG_2, -8))
: __clobber_all);
}
SEC("socket")
__description("store-release with uninitialized dst_reg")
__failure __failure_unpriv __msg("R2 !read_ok")
__naked void store_release_with_uninitialized_dst_reg(void)
{
asm volatile (
"r1 = 0;"
".8byte %[store_release_insn];" // store_release((u64 *)(r2 - 8), r1);
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_STORE_REL, BPF_REG_2, BPF_REG_1, -8))
: __clobber_all);
}
SEC("socket")
__description("store-release with non-pointer dst_reg")
__failure __failure_unpriv __msg("R1 invalid mem access 'scalar'")
__naked void store_release_with_non_pointer_dst_reg(void)
{
asm volatile (
"r1 = 0;"
".8byte %[store_release_insn];" // store_release((u64 *)(r1 + 0), r1);
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_STORE_REL, BPF_REG_1, BPF_REG_1, 0))
: __clobber_all);
}
SEC("socket")
__description("misaligned store-release")
__failure __failure_unpriv __msg("misaligned stack access off")
__flag(BPF_F_ANY_ALIGNMENT)
__naked void store_release_misaligned(void)
{
asm volatile (
"w0 = 0;"
".8byte %[store_release_insn];" // store_release((u32 *)(r10 - 5), w0);
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_W, BPF_STORE_REL, BPF_REG_10, BPF_REG_0, -5))
: __clobber_all);
}
SEC("socket")
__description("store-release to ctx pointer")
__failure __failure_unpriv __msg("BPF_ATOMIC stores into R1 ctx is not allowed")
__naked void store_release_to_ctx_pointer(void)
{
asm volatile (
"w0 = 0;"
".8byte %[store_release_insn];" // store_release((u8 *)(r1 + 0), w0);
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_B, BPF_STORE_REL, BPF_REG_1, BPF_REG_0, 0))
: __clobber_all);
}
SEC("xdp")
__description("store-release to pkt pointer")
__failure __msg("BPF_ATOMIC stores into R2 pkt is not allowed")
__naked void store_release_to_pkt_pointer(void)
{
asm volatile (
"w0 = 0;"
"r2 = *(u32 *)(r1 + %[xdp_md_data]);"
".8byte %[store_release_insn];" // store_release((u8 *)(r2 + 0), w0);
"exit;"
:
: __imm_const(xdp_md_data, offsetof(struct xdp_md, data)),
__imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_B, BPF_STORE_REL, BPF_REG_2, BPF_REG_0, 0))
: __clobber_all);
}
SEC("flow_dissector")
__description("store-release to flow_keys pointer")
__failure __msg("BPF_ATOMIC stores into R2 flow_keys is not allowed")
__naked void store_release_to_flow_keys_pointer(void)
{
asm volatile (
"w0 = 0;"
"r2 = *(u64 *)(r1 + %[__sk_buff_flow_keys]);"
".8byte %[store_release_insn];" // store_release((u8 *)(r2 + 0), w0);
"exit;"
:
: __imm_const(__sk_buff_flow_keys,
offsetof(struct __sk_buff, flow_keys)),
__imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_B, BPF_STORE_REL, BPF_REG_2, BPF_REG_0, 0))
: __clobber_all);
}
SEC("sk_reuseport")
__description("store-release to sock pointer")
__failure __msg("BPF_ATOMIC stores into R2 sock is not allowed")
__naked void store_release_to_sock_pointer(void)
{
asm volatile (
"w0 = 0;"
"r2 = *(u64 *)(r1 + %[sk_reuseport_md_sk]);"
".8byte %[store_release_insn];" // store_release((u8 *)(r2 + 0), w0);
"exit;"
:
: __imm_const(sk_reuseport_md_sk, offsetof(struct sk_reuseport_md, sk)),
__imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_B, BPF_STORE_REL, BPF_REG_2, BPF_REG_0, 0))
: __clobber_all);
}
SEC("socket")
__description("store-release, leak pointer to stack")
__success __success_unpriv __retval(0)
__naked void store_release_leak_pointer_to_stack(void)
{
asm volatile (
".8byte %[store_release_insn];" // store_release((u64 *)(r10 - 8), r1);
"r0 = 0;"
"exit;"
:
: __imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_STORE_REL, BPF_REG_10, BPF_REG_1, -8))
: __clobber_all);
}
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__uint(max_entries, 1);
__type(key, long long);
__type(value, long long);
} map_hash_8b SEC(".maps");
SEC("socket")
__description("store-release, leak pointer to map")
__success __retval(0)
__failure_unpriv __msg_unpriv("R6 leaks addr into map")
__naked void store_release_leak_pointer_to_map(void)
{
asm volatile (
"r6 = r1;"
"r1 = %[map_hash_8b] ll;"
"r2 = 0;"
"*(u64 *)(r10 - 8) = r2;"
"r2 = r10;"
"r2 += -8;"
"call %[bpf_map_lookup_elem];"
"if r0 == 0 goto l0_%=;"
".8byte %[store_release_insn];" // store_release((u64 *)(r0 + 0), r6);
"l0_%=:"
"r0 = 0;"
"exit;"
:
: __imm_addr(map_hash_8b),
__imm(bpf_map_lookup_elem),
__imm_insn(store_release_insn,
BPF_ATOMIC_OP(BPF_DW, BPF_STORE_REL, BPF_REG_0, BPF_REG_6, 0))
: __clobber_all);
}
#else
SEC("socket")
__description("Clang version < 18, ENABLE_ATOMICS_TESTS not defined, and/or JIT doesn't support store-release, use a dummy test")
__success
int dummy_test(void)
{
return 0;
}
#endif
char _license[] SEC("license") = "GPL";