Commit Graph

1336430 Commits

Author SHA1 Message Date
Andrii Nakryiko
2941e21537 Merge branch 'introduce-bpf_object__prepare'
Mykyta Yatsenko says:

====================
Introduce bpf_object__prepare

From: Mykyta Yatsenko <yatsenko@meta.com>

We are introducing a new function in the libbpf API, bpf_object__prepare,
which provides more granular control over the process of loading a
bpf_object. bpf_object__prepare performs ELF processing, relocations,
prepares final state of BPF program instructions (accessible with
bpf_program__insns()), creates and potentially pins maps, and stops short
of loading BPF programs.

There are couple of anticipated usecases for this API:
* Use BPF token for freplace programs that might need to lookup BTF of
other programs (BPF token creation can't be moved to open step, as open
step is "no privilege assumption" step so that tools like bpftool can
generate skeleton, discover the structure of BPF object, etc).
* Stopping at prepare gives users finalized BPF program
instructions (with subprogs appended, everything relocated and
finalized, etc). And that property can be taken advantage of by
veristat (and similar tools) that might want to process one program at
a time, but would like to avoid relatively slow ELF parsing and
processing; and even BPF selftests itself (RUN_TESTS part of it at
least) would benefit from this by eliminating waste of re-processing
ELF many times.
====================

Link: https://patch.msgid.link/20250303135752.158343-1-mykyta.yatsenko5@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:28 -07:00
Mykyta Yatsenko
6419d08b6c selftests/bpf: Add tests for bpf_object__prepare
Add selftests, checking that running bpf_object__prepare successfully
creates maps before load step.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250303135752.158343-5-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:28 -07:00
Mykyta Yatsenko
1315c28ed8 libbpf: Split bpf object load into prepare/load
Introduce bpf_object__prepare API: additional intermediate preparation
step that performs ELF processing, relocations, prepares final state of
BPF program instructions (accessible with bpf_program__insns()), creates
and (potentially) pins maps, and stops short of loading BPF programs.

We anticipate few use cases for this API, such as:
* Use prepare to initialize bpf_token, without loading freplace
programs, unlocking possibility to lookup BTF of other programs.
* Execute prepare to obtain finalized BPF program instructions without
loading programs, enabling tools like veristat to process one program at
a time, without incurring cost of ELF parsing and processing.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250303135752.158343-4-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:28 -07:00
Mykyta Yatsenko
9a9e347835 libbpf: Introduce more granular state for bpf_object
We are going to split bpf_object loading into 2 stages: preparation and
loading. This will increase flexibility when working with bpf_object
and unlock some optimizations and use cases.
This patch substitutes a boolean flag (loaded) by more finely-grained
state for bpf_object.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250303135752.158343-3-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:28 -07:00
Breno Leitao
19856a5247 net: filter: Avoid shadowing variable in bpf_convert_ctx_access()
Rename the local variable 'off' to 'offset' to avoid shadowing the existing
'off' variable that is declared as an `int` in the outer scope of
bpf_convert_ctx_access().

This fixes a compiler warning:

 net/core/filter.c:9679:8: warning: declaration shadows a local variable [-Wshadow]

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://patch.msgid.link/20250228-fix_filter-v1-1-ce13eae66fe9@debian.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Mykyta Yatsenko
6ef78c4191 libbpf: Use map_is_created helper in map setters
Refactoring: use map_is_created helper in map setters that need to check
the state of the map. This helps to reduce the number of the places that
depend explicitly on the loaded flag, simplifying refactoring in the
next patch of this set.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250303135752.158343-2-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Martin KaFai Lau
282f092271 Merge branch 'selftests-bpf-migrate-test_tunnel-sh-to-test_progs'
Bastien Curutchet says:

====================
selftests/bpf: Migrate test_tunnel.sh to test_progs

Hi all,

This patch series continues the work to migrate the *.sh tests into
prog_tests framework.

The test_tunnel.sh script has already been partly migrated to
test_progs in prog_tests/test_tunnel.c so I add my work to it.

PATCH 1 & 2 create some helpers to avoid code duplication and ease the
migration in the following patches.
PATCH 3 to 9 migrate the tests of gre, ip6gre, erspan, ip6erspan,
geneve, ip6geneve and ip6tnl tunnels.
PATCH 10 removes test_tunnel.sh
====================

Link: https://patch.msgid.link/20250303-tunnels-v2-0-8329f38f0678@bootlin.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
a54e700696 selftests/bpf: test_tunnel: Remove test_tunnel.sh
All tests from test_tunnel.sh have been migrated into test test_progs.
The last test remaining in the script is the test_ipip() that is already
covered in the test_prog framework by the NONE case of test_ipip_tunnel().

Remove the test_tunnel.sh script and its Makefile entry

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-10-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
05cd60ab57 selftests/bpf: test_tunnel: Move ip6tnl tunnel tests to test_progs
ip6tnl tunnels are tested in the test_tunnel.sh but not in the test_progs
framework.

Add a new test in test_progs to test ip6tnl tunnels. It uses the same
network topology and the same BPF programs than the script.
Remove test_ipip6() and test_ip6ip6() from the script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-9-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
260f2da62d selftests/bpf: test_tunnel: Move ip6geneve tunnel test to test_progs
ip6geneve tunnels are tested in the test_tunnel.sh but not in the
test_progs framework.

Add a new test in test_progs to test ip6geneve tunnels. It uses the same
network topology and the same BPF programs than the script.
Remove test_ip6geneve() from the script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-8-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
bd477738e6 selftests/bpf: test_tunnel: Move geneve tunnel test to test_progs
geneve tunnels are tested in the test_tunnel.sh but not in the test_progs
framework.

Add a new test in test_progs to test geneve tunnels. It uses the same
network topology and the same BPF programs than the script.
Remove test_geneve() from the script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-7-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
ea60b6a524 selftests/bpf: test_tunnel: Move ip6erspan tunnel test to test_progs
ip6erspan tunnels are tested in the test_tunnel.sh but not in the
test_progs framework.

Add a new test in test_progs to test ip6erspan tunnels. It uses the same
network topology and the same BPF programs than the script.
Remove test_ip6erspan() from the script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-6-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
cadb08a4d3 selftests/bpf: test_tunnel: Move erspan tunnel tests to test_progs
erspan tunnels are tested in the test_tunnel.sh but not in the test_progs
framework.

Add a new test in test_progs to test erspan tunnels. It uses the same
network topology and the same BPF programs than the script.
Remove test_erspan() from the script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-5-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
856818b28f selftests/bpf: test_tunnel: Move ip6gre tunnel test to test_progs
ip6gre tunnels are tested in the test_tunnel.sh but not in the test_progs
framework.

Add a new test in test_progs to test ip6gre tunnels. It uses the same
network topology and the same BPF programs than the script. Disable the
IPv6 DAD feature because it can take lot of time and cause some tests to
fail depending on the environment they're run on.
Remove test_ip6gre() and test_ip6gretap() from the script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-4-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)
257dfd1c6b selftests/bpf: test_tunnel: Move gre tunnel test to test_progs
gre tunnels are tested in the test_tunnel.sh but not in the test_progs
framework.

Add a new test in test_progs to test gre tunnels. It uses the same
network topology and the same BPF programs than the script.
Remove test_gre() and test_gre_no_tunnel_key() from the script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-3-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:27 -07:00
Andrii Nakryiko
191598fc4d Merge branch 'veristat-files-list-txt-notation-for-object-files-list'
Eduard Zingerman says:

====================
veristat: @files-list.txt notation for object files list

A few small veristat improvements:
- It is possible to hit command line parameters number limit,
  e.g. when running veristat for all object files generated for
  test_progs. This patch-set adds an option to read objects files list
  from a file.
- Correct usage of strerror() function.
- Avoid printing log lines to CSV output.

Changelog:
- v1 -> v2:
  - replace strerror(errno) with strerror(-err) in patch #2 (Andrii)

v1: https://lore.kernel.org/bpf/3ee39a16-bc54-4820-984a-0add2b5b5f86@gmail.com/T/
====================

Link: https://patch.msgid.link/20250301000147.1583999-1-eddyz87@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Bastien Curutchet (eBPF Foundation)
fcb39996a2 selftests/bpf: test_tunnel: Add ping helpers
All tests use more or less the same ping commands as final validation.
Also test_ping()'s return value is checked with ASSERT_OK() while this
check is already done by the SYS() macro inside test_ping().

Create helpers around test_ping() and use them in the tests to avoid code
duplication.
Remove the unnecessary ASSERT_OK() from the tests.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-2-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Peilin Ye
a752ba4332 bpf: Factor out check_load_mem() and check_store_reg()
Extract BPF_LDX and most non-ATOMIC BPF_STX instruction handling logic
in do_check() into helper functions to be used later.  While we are
here, make that comment about "reserved fields" more specific.

Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Peilin Ye <yepeilin@google.com>
Link: https://lore.kernel.org/r/8b39c94eac2bb7389ff12392ca666f939124ec4f.1740978603.git.yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Eduard Zingerman
346e4ca462 veristat: Report program type guess results to sdterr
In order not to pollute CSV output, e.g.:

  $ ./veristat -o csv exceptions_ext.bpf.o > test.csv
  Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/extension...
  Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/throwing_extension...

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
Link: https://lore.kernel.org/bpf/20250301000147.1583999-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Bastien Curutchet (eBPF Foundation)
5d6aa606c1 selftests/bpf: test_tunnel: Add generic_attach* helpers
A fair amount of code duplication is present among tests to attach BPF
programs.

Create generic_attach* helpers that attach BPF programs to a given
interface.
Use ASSERT_OK_FD() instead of ASSERT_GE() to check fd's validity.
Use these helpers in all the available tests.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250303-tunnels-v2-1-8329f38f0678@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Peilin Ye
2626ffe9f3 bpf: Factor out check_atomic_rmw()
Currently, check_atomic() only handles atomic read-modify-write (RMW)
instructions.  Since we are planning to introduce other types of atomic
instructions (i.e., atomic load/store), extract the existing RMW
handling logic into its own function named check_atomic_rmw().

Remove the @insn_idx parameter as it is not really necessary.  Use
'env->insn_idx' instead, as in other places in verifier.c.

Signed-off-by: Peilin Ye <yepeilin@google.com>
Link: https://lore.kernel.org/r/6323ac8e73a10a1c8ee547c77ed68cf8eb6b90e1.1740978603.git.yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Eduard Zingerman
2d95b3f582 veristat: Strerror expects positive number (errno)
Before:

  ./veristat -G @foobar iters.bpf.o
  Failed to open presets in 'foobar': Unknown error -2
  ...

After:

  ./veristat -G @foobar iters.bpf.o
  Failed to open presets in 'foobar': No such file or directory
  ...

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
Link: https://lore.kernel.org/bpf/20250301000147.1583999-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Peilin Ye
66faaea94e bpf: Factor out atomic_ptr_type_ok()
Factor out atomic_ptr_type_ok() as a helper function to be used later.

Signed-off-by: Peilin Ye <yepeilin@google.com>
Link: https://lore.kernel.org/r/e5ef8b3116f3fffce78117a14060ddce05eba52a.1740978603.git.yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Eduard Zingerman
c0d078da7a veristat: @files-list.txt notation for object files list
Allow reading object file list from file.
E.g. the following command:

  ./veristat @list.txt

Is equivalent to the following invocation:

  ./veristat line-1 line-2 ... line-N

Where line-i corresponds to lines from list.txt.
Lines starting with '#' are ignored.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
Link: https://lore.kernel.org/bpf/20250301000147.1583999-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Eric Dumazet
f8ac5a4e1a bpf: no longer acquire map_idr_lock in bpf_map_inc_not_zero()
bpf_sk_storage_clone() is the only caller of bpf_map_inc_not_zero()
and is holding rcu_read_lock().

map_idr_lock does not add any protection, just remove the cost
for passive TCP flows.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Kui-Feng Lee <kuifeng@meta.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/r/20250301191315.1532629-1-edumazet@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:26 -07:00
Alexei Starovoitov
dd4d3ef3c5 Merge branch 'global-subprogs-in-rcu-preempt-irq-disabled-sections'
Kumar Kartikeya Dwivedi says:

====================
Global subprogs in RCU/{preempt,irq}-disabled sections

Small change to allow non-sleepable global subprogs in
RCU, preempt-disabled, and irq-disabled sections. For
now, we don't lift the limitation for locks as it requires
more analysis, and will do this one resilient spin locks
land.

This surfaced a bug where sleepable global subprogs were
allowed in RCU read sections, that has been fixed. Tests
have been added to cover various cases.

Changelog:
----------
v2 -> v3
v2: https://lore.kernel.org/bpf/20250301030205.1221223-1-memxor@gmail.com

  * Fix broken to_be_replaced argument in the selftest.
  * Adjust selftest program type.

v1 -> v2
v1: https://lore.kernel.org/bpf/20250228162858.1073529-1-memxor@gmail.com

  * Rename subprog_info[i].sleepable to might_sleep, which more
    accurately reflects the nature of the bit. 'sleepable' means whether
    a given context is allowed to, while might_sleep captures if it
    does.
  * Disallow extensions that might sleep to attach to targets that don't
    sleep, since they'd be permitted to be called in atomic contexts. (Eduard)
  * Add tests for mixing non-sleepable and sleepable global function
    calls, and extensions attaching to non-sleepable global functions. (Eduard)
  * Rename changes_pkt_data -> summarization
====================

Link: https://patch.msgid.link/20250301151846.1552362-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Alexis Lothoré (eBPF Foundation)
93cf4e537e bpf/selftests: test_select_reuseport_kern: Remove unused header
test_select_reuseport_kern.c is currently including <stdlib.h>, but it
does not use any definition from there.

Remove stdlib.h inclusion from test_select_reuseport_kern.c

Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250227-remove_wrong_header-v1-1-bc94eb4e2f73@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Kumar Kartikeya Dwivedi
72ed076abf selftests/bpf: Add tests for extending sleepable global subprogs
Add tests for freplace behavior with the combination of sleepable
and non-sleepable global subprogs. The changes_pkt_data selftest
did all the hardwork, so simply rename it and include new support
for more summarization tests for might_sleep bit.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250301151846.1552362-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Yonghong Song
2222aa1c89 selftests/bpf: Add selftests allowing cgroup prog pre-ordering
Add a few selftests with cgroup prog pre-ordering.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250224230121.283601-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Kumar Kartikeya Dwivedi
b2bb703434 selftests/bpf: Test sleepable global subprogs in atomic contexts
Add tests for rejecting sleepable and accepting non-sleepable global
function calls in atomic contexts. For spin locks, we still reject
all global function calls. Once resilient spin locks land, we will
carefully lift in cases where we deem it safe.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250301151846.1552362-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Yonghong Song
4b82b181a2 bpf: Allow pre-ordering for bpf cgroup progs
Currently for bpf progs in a cgroup hierarchy, the effective prog array
is computed from bottom cgroup to upper cgroups (post-ordering). For
example, the following cgroup hierarchy
    root cgroup: p1, p2
        subcgroup: p3, p4
have BPF_F_ALLOW_MULTI for both cgroup levels.
The effective cgroup array ordering looks like
    p3 p4 p1 p2
and at run time, progs will execute based on that order.

But in some cases, it is desirable to have root prog executes earlier than
children progs (pre-ordering). For example,
  - prog p1 intends to collect original pkt dest addresses.
  - prog p3 will modify original pkt dest addresses to a proxy address for
    security reason.
The end result is that prog p1 gets proxy address which is not what it
wants. Putting p1 to every child cgroup is not desirable either as it
will duplicate itself in many child cgroups. And this is exactly a use case
we are encountering in Meta.

To fix this issue, let us introduce a flag BPF_F_PREORDER. If the flag
is specified at attachment time, the prog has higher priority and the
ordering with that flag will be from top to bottom (pre-ordering).
For example, in the above example,
    root cgroup: p1, p2
        subcgroup: p3, p4
Let us say p2 and p4 are marked with BPF_F_PREORDER. The final
effective array ordering will be
    p2 p4 p3 p1

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250224230116.283071-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Kumar Kartikeya Dwivedi
e2d8f560d1 bpf: Summarize sleepable global subprogs
The verifier currently does not permit global subprog calls when a lock
is held, preemption is disabled, or when IRQs are disabled. This is
because we don't know whether the global subprog calls sleepable
functions or not.

In case of locks, there's an additional reason: functions called by the
global subprog may hold additional locks etc. The verifier won't know
while verifying the global subprog whether it was called in context
where a spin lock is already held by the program.

Perform summarization of the sleepable nature of a global subprog just
like changes_pkt_data and then allow calls to global subprogs for
non-sleepable ones from atomic context.

While making this change, I noticed that RCU read sections had no
protection against sleepable global subprog calls, include it in the
checks and fix this while we're at it.

Care needs to be taken to not allow global subprog calls when regular
bpf_spin_lock is held. When resilient spin locks is held, we want to
potentially have this check relaxed, but not for now.

Also make sure extensions freplacing global functions cannot do so
in case the target is non-sleepable, but the extension is. The other
combination is ok.

Tests are included in the next patch to handle all special conditions.

Fixes: 9bb00b2895 ("bpf: Add kfunc bpf_rcu_read_lock/unlock()")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250301151846.1552362-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Alexei Starovoitov
7c2f207a52 Merge branch 'optimize-bpf-selftest-to-increase-ci-success-rate'
Jiayuan Chen says:

====================
Optimize bpf selftest to increase CI success rate

1. Optimized some static bound port selftests to avoid port occupation
when running test_progs -j.
2. Optimized the retry logic for test_maps.

Some Failed CI:
https://github.com/kernel-patches/bpf/actions/runs/13275542359/job/37064974076
https://github.com/kernel-patches/bpf/actions/runs/13549227497/job/37868926343
https://github.com/kernel-patches/bpf/actions/runs/13548089029/job/37865812030
https://github.com/kernel-patches/bpf/actions/runs/13553536268/job/37883329296
(Perhaps it's due to the large number of pull requests requiring CI runs?)
====================

Link: https://patch.msgid.link/20250227142646.59711-1-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Jiayuan Chen
7f260af1f2 selftests/bpf: Fixes for test_maps test
BPF CI has failed 3 times in the last 24 hours. Add retry for ENOMEM.
It's similar to the optimization plan:
commit 2f553b032c ("selftsets/bpf: Retry map update for non-preallocated per-cpu map")

Failed CI:
https://github.com/kernel-patches/bpf/actions/runs/13549227497/job/37868926343
https://github.com/kernel-patches/bpf/actions/runs/13548089029/job/37865812030
https://github.com/kernel-patches/bpf/actions/runs/13553536268/job/37883329296

selftests/bpf: Fixes for test_maps test
Fork 100 tasks to 'test_update_delete'
Fork 100 tasks to 'test_update_delete'
Fork 100 tasks to 'test_update_delete'
Fork 100 tasks to 'test_update_delete'
......
test_task_storage_map_stress_lookup:PASS
test_maps: OK, 0 SKIPPED

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://lore.kernel.org/r/20250227142646.59711-4-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:25 -07:00
Andrii Nakryiko
ee2fdd020f Merge branch 'introduce-bpf_dynptr_copy-kfunc'
Mykyta Yatsenko says:

====================
introduce bpf_dynptr_copy kfunc

From: Mykyta Yatsenko <yatsenko@meta.com>

Introduce a new kfunc, bpf_dynptr_copy, which enables copying of
data from one dynptr to another. This functionality may be useful in
scenarios such as capturing XDP data to a ring buffer.
The patch set is split into 3 patches:
1. Refactor bpf_dynptr_read and bpf_dynptr_write by extracting code into
static functions, that allows calling them with no compiler warnings
2. Introduce bpf_dynptr_copy
3. Add tests for bpf_dynptr_copy

v2->v3:
  * Implemented bpf_memcmp in dynptr_success.c test, as __builtin_memcmp
  was not inlined on GCC-BPF.
====================

Link: https://patch.msgid.link/20250226183201.332713-1-mykyta.yatsenko5@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:24 -07:00
Jiayuan Chen
93a279b65a selftests/bpf: Allow auto port binding for bpf nf
Allow auto port binding for bpf nf test to avoid binding conflict.

./test_progs -a bpf_nf
24/1    bpf_nf/xdp-ct:OK
24/2    bpf_nf/tc-bpf-ct:OK
24/3    bpf_nf/alloc_release:OK
24/4    bpf_nf/insert_insert:OK
24/5    bpf_nf/lookup_insert:OK
24/6    bpf_nf/set_timeout_after_insert:OK
24/7    bpf_nf/set_status_after_insert:OK
24/8    bpf_nf/change_timeout_after_alloc:OK
24/9    bpf_nf/change_status_after_alloc:OK
24/10   bpf_nf/write_not_allowlisted_field:OK
24      bpf_nf:OK
Summary: 1/10 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://lore.kernel.org/r/20250227142646.59711-3-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:24 -07:00
Jiayuan Chen
acf0d6f681 selftests/bpf: Allow auto port binding for cgroup connect
Allow auto port binding for cgroup connect test to avoid binding conflict.

Result:
./test_progs -a cgroup_v1v2
59      cgroup_v1v2:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://lore.kernel.org/r/20250227142646.59711-2-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:24 -07:00
Mykyta Yatsenko
064e9aacfd selftests/bpf: Add tests for bpf_dynptr_copy
Add XDP setup type for dynptr tests, enabling testing for
non-contiguous buffer.
Add 2 tests:
 - test_dynptr_copy - verify correctness for the fast (contiguous
 buffer) code path.
 - test_dynptr_copy_xdp - verifies code paths that handle
 non-contiguous buffer.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250226183201.332713-4-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:20 -07:00
Mykyta Yatsenko
daec295a70 bpf/helpers: Introduce bpf_dynptr_copy kfunc
Introducing bpf_dynptr_copy kfunc allowing copying data from one dynptr to
another. This functionality is useful in scenarios such as capturing XDP
data to a ring buffer.
The implementation consists of 4 branches:
  * A fast branch for contiguous buffer capacity in both source and
destination dynptrs
  * 3 branches utilizing __bpf_dynptr_read and __bpf_dynptr_write to copy
data to/from non-contiguous buffer

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250226183201.332713-3-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:16 -07:00
Mykyta Yatsenko
09206af69c bpf/helpers: Refactor bpf_dynptr_read and bpf_dynptr_write
Refactor bpf_dynptr_read and bpf_dynptr_write helpers: extract code
into the static functions namely __bpf_dynptr_read and
__bpf_dynptr_write, this allows calling these without compiler warnings.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250226183201.332713-2-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:47:51 -07:00
Andrii Nakryiko
b0f2ee6077 Merge branch 'selftests-bpf-implement-setting-global-variables-in-veristat'
Mykyta Yatsenko says:

====================
selftests/bpf: implement setting global variables in veristat

From: Mykyta Yatsenko <yatsenko@meta.com>

To better verify some complex BPF programs by veristat, it would be useful
to preset global variables. This patch set implements this functionality
and introduces tests for veristat.

v4->v5
  * Rework parsing to use sscanf for integers
  * Addressing nits

v3->v4:
  * Fixing bug in set_global_var introduced by refactoring in previous patch set
  * Addressed nits from Eduard

v2->v3:
  * Reworked parsing of the presets, using sscanf to split into variable and
  value, but still use strtoll/strtoull to support range checks when parsing
  integers
  * Fix test failures for no_alu32 & cpuv4 by checking if veristat binary is in
  parent folder
  * Introduce __CHECK_STR macro for simplifying checks in test
  * Modify tests into sub-tests
====================

Link: https://patch.msgid.link/20250225163101.121043-1-mykyta.yatsenko5@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:35:24 -07:00
Mykyta Yatsenko
3d1033caf0 selftests/bpf: Introduce veristat test
Introducing test for veristat, part of test_progs.
Test cases cover functionality of setting global variables in BPF
program.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20250225163101.121043-3-mykyta.yatsenko5@gmail.com
2025-02-26 10:45:01 -08:00
Mykyta Yatsenko
e3c9abd0d1 selftests/bpf: Implement setting global variables in veristat
To better verify some complex BPF programs we'd like to preset global
variables.
This patch introduces CLI argument `--set-global-vars` or `-G` to
veristat, that allows presetting values to global variables defined
in BPF program. For example:

prog.c:
```
enum Enum { ELEMENT1 = 0, ELEMENT2 = 5 };
const volatile __s64 a = 5;
const volatile __u8 b = 5;
const volatile enum Enum c = ELEMENT2;
const volatile bool d = false;

char arr[4] = {0};

SEC("tp_btf/sched_switch")
int BPF_PROG(...)
{
	bpf_printk("%c\n", arr[a]);
	bpf_printk("%c\n", arr[b]);
	bpf_printk("%c\n", arr[c]);
	bpf_printk("%c\n", arr[d]);
	return 0;
}
```
By default verification of the program fails:
```
./veristat prog.bpf.o
```
By presetting global variables, we can make verification pass:
```
./veristat wq.bpf.o  -G "a = 0" -G "b = 1" -G "c = 2" -G "d = 3"
```

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20250225163101.121043-2-mykyta.yatsenko5@gmail.com
2025-02-26 10:45:00 -08:00
Ihor Solodrai
0ba0ef012e selftests/bpf: Test bpf_usdt_arg_size() function
Update usdt tests to also check for correct behavior of
bpf_usdt_arg_size().

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250224235756.2612606-2-ihor.solodrai@linux.dev
2025-02-26 08:59:44 -08:00
Ihor Solodrai
b62dff1440 libbpf: Implement bpf_usdt_arg_size BPF function
Information about USDT argument size is implicitly stored in
__bpf_usdt_arg_spec, but currently it's not accessbile to BPF programs
that use USDT.

Implement bpf_sdt_arg_size() that returns the size of an USDT argument
in bytes.

v1->v2:
  * do not add __bpf_usdt_arg_spec() helper

v1: https://lore.kernel.org/bpf/20250220215904.3362709-1-ihor.solodrai@linux.dev/

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250224235756.2612606-1-ihor.solodrai@linux.dev
2025-02-26 08:59:44 -08:00
Alexei Starovoitov
4580f4e0eb bpf: Fix deadlock between rcu_tasks_trace and event_mutex.
Fix the following deadlock:
CPU A
_free_event()
  perf_kprobe_destroy()
    mutex_lock(&event_mutex)
      perf_trace_event_unreg()
        synchronize_rcu_tasks_trace()

There are several paths where _free_event() grabs event_mutex
and calls sync_rcu_tasks_trace. Above is one such case.

CPU B
bpf_prog_test_run_syscall()
  rcu_read_lock_trace()
    bpf_prog_run_pin_on_cpu()
      bpf_prog_load()
        bpf_tracing_func_proto()
          trace_set_clr_event()
            mutex_lock(&event_mutex)

Delegate trace_set_clr_event() to workqueue to avoid
such lock dependency.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250224221637.4780-1-alexei.starovoitov@gmail.com
2025-02-26 08:48:40 -08:00
Yonghong Song
b123480eec docs/bpf: Document some special sdiv/smod operations
Patch [1] fixed possible kernel crash due to specific sdiv/smod operations
in bpf program. The following are related operations and the expected results
of those operations:
  - LLONG_MIN/-1 = LLONG_MIN
  - INT_MIN/-1 = INT_MIN
  - LLONG_MIN%-1 = 0
  - INT_MIN%-1 = 0

Those operations are replaced with codes which won't cause
kernel crash. This patch documents what operations may cause exception and
what replacement operations are.

  [1] https://lore.kernel.org/all/20240913150326.1187788-1-yonghong.song@linux.dev/

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20241107170924.2944681-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-26 07:54:49 -08:00
Mahe Tardy
9138048bb5 selftests/bpf: add cgroup_skb netns cookie tests
Add netns cookie test that verifies the helper is now supported and work
in the context of cgroup_skb programs.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Link: https://lore.kernel.org/r/20250225125031.258740-2-mahe.tardy@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-26 07:35:51 -08:00
Mahe Tardy
c221d3744a bpf: add get_netns_cookie helper to cgroup_skb programs
This is needed in the context of Cilium and Tetragon to retrieve netns
cookie from hostns when traffic leaves Pod, so that we can correlate
skb->sk's netns cookie.

Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Link: https://lore.kernel.org/r/20250225125031.258740-1-mahe.tardy@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-26 07:35:51 -08:00
Amery Hung
4e4136c644 selftests/bpf: Test gen_pro/epilogue that generate kfuncs
Test gen_prologue and gen_epilogue that generate kfuncs that have not
been seen in the main program.

The main bpf program and return value checks are identical to
pro_epilogue.c introduced in commit 47e69431b5 ("selftests/bpf: Test
gen_prologue and gen_epilogue"). However, now when bpf_testmod_st_ops
detects a program name with prefix "test_kfunc_", it generates slightly
different prologue and epilogue: They still add 1000 to args->a in
prologue, add 10000 to args->a and set r0 to 2 * args->a in epilogue,
but involve kfuncs.

At high level, the alternative version of prologue and epilogue look
like this:

  cgrp = bpf_cgroup_from_id(0);
  if (cgrp)
          bpf_cgroup_release(cgrp);
  else
          /* Perform what original bpf_testmod_st_ops prologue or
           * epilogue does
           */

Since 0 is never a valid cgroup id, the original prologue or epilogue
logic will be performed. As a result, the __retval check should expect
the exact same return value.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250225233545.285481-2-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-25 19:04:43 -08:00