Commit Graph

1267621 Commits

Author SHA1 Message Date
Jose E. Marchesi
6a2f786e69 bpf: ignore expected GCC warning in test_global_func10.c
The BPF selftest global_func10 in progs/test_global_func10.c contains:

  struct Small {
  	long x;
  };

  struct Big {
  	long x;
  	long y;
  };

  [...]

  __noinline int foo(const struct Big *big)
  {
	if (!big)
		return 0;

	return bpf_get_prandom_u32() < big->y;
  }

  [...]

  SEC("cgroup_skb/ingress")
  __failure __msg("invalid indirect access to stack")
  int global_func10(struct __sk_buff *skb)
  {
	const struct Small small = {.x = skb->len };

	return foo((struct Big *)&small) ? 1 : 0;
  }

GCC emits a "maybe uninitialized" warning for the code above, because
it knows `foo' accesses `big->y'.

Since the purpose of this selftest is to check that the verifier will
fail on this sort of invalid memory access, this patch just silences
the compiler warning.

Tested in bpf-next master.
No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240511212349.23549-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:31:24 -07:00
Jose E. Marchesi
73868988c9 bpf: disable strict aliasing in test_global_func9.c
The BPF selftest test_global_func9.c performs type punning and breaks
srict-aliasing rules.

In particular, given:

  int global_func9(struct __sk_buff *skb)
  {
	int result = 0;

	[...]
	{
		const struct C c = {.x = skb->len, .y = skb->family };

		result |= foo((const struct S *)&c);
	}
  }

When building with strict-aliasing enabled (the default) the
initialization of `c' gets optimized away in its entirely:

	[... no initialization of `c' ...]
	r1 = r10
	r1 += -40
	call	foo
	w0 |= w6

Since GCC knows that `foo' accesses s->x, we get a "maybe
uninitialized" warning.

On the other hand, when strict-aliasing is disabled GCC only optimizes
away the store to `.y':

	r1 = *(u32 *) (r6+0)
	*(u32 *) (r10+-40) = r1  ; This is .x = skb->len in `c'
	r1 = r10
	r1 += -40
	call	foo
	w0 |= w6

In this case the warning is not emitted, because s-> is initialized.

This patch disables strict aliasing in this test when building with
GCC.  clang seems to not optimize this particular code even when
strict aliasing is enabled.

Tested in bpf-next master.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240511212213.23418-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:30:15 -07:00
Geliang Tang
a3c1c95538 selftests/bpf: Free strdup memory in xdp_hw_metadata
The strdup() function returns a pointer to a new string which is a
duplicate of the string "ifname". Memory for the new string is obtained
with malloc(), and need to be freed with free().

This patch adds this missing "free(saved_hwtstamp_ifname)" in cleanup()
to avoid a potential memory leak in xdp_hw_metadata.c.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/af9bcccb96655e82de5ce2b4510b88c9c8ed5ed0.1715417367.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:28:42 -07:00
Cupertino Miranda
5ddafcc377 selftests/bpf: Fix a few tests for GCC related warnings.
This patch corrects a few warnings to allow selftests to compile for
GCC.

-- progs/cpumask_failure.c --

progs/bpf_misc.h:136:22: error: ‘cpumask’ is used uninitialized
[-Werror=uninitialized]
  136 | #define __sink(expr) asm volatile("" : "+g"(expr))
      |                      ^~~
progs/cpumask_failure.c:68:9: note: in expansion of macro ‘__sink’
   68 |         __sink(cpumask);

The macro __sink(cpumask) with the '+' contraint modifier forces the
the compiler to expect a read and write from cpumask. GCC detects
that cpumask is never initialized and reports an error.
This patch removes the spurious non required definitions of cpumask.

-- progs/dynptr_fail.c --

progs/dynptr_fail.c:1444:9: error: ‘ptr1’ may be used uninitialized
[-Werror=maybe-uninitialized]
 1444 |         bpf_dynptr_clone(&ptr1, &ptr2);

Many of the tests in the file are related to the detection of
uninitialized pointers by the verifier. GCC is able to detect possible
uninitialized values, and reports this as an error.
The patch initializes all of the previous uninitialized structs.

-- progs/test_tunnel_kern.c --

progs/test_tunnel_kern.c:590:9: error: array subscript 1 is outside
array bounds of ‘struct geneve_opt[1]’ [-Werror=array-bounds=]
  590 |         *(int *) &gopt.opt_data = bpf_htonl(0xdeadbeef);
      |         ^~~~~~~~~~~~~~~~~~~~~~~
progs/test_tunnel_kern.c:575:27: note: at offset 4 into object ‘gopt’ of
size 4
  575 |         struct geneve_opt gopt;

This tests accesses beyond the defined data for the struct geneve_opt
which contains as last field "u8 opt_data[0]" which clearly does not get
reserved space (in stack) in the function header. This pattern is
repeated in ip6geneve_set_tunnel and geneve_set_tunnel functions.
GCC is able to see this and emits a warning.
The patch introduces a local struct that allocates enough space to
safely allow the write to opt_data field.

-- progs/jeq_infer_not_null_fail.c --

progs/jeq_infer_not_null_fail.c:21:40: error: array subscript ‘struct
bpf_map[0]’ is partly outside array bounds of ‘struct <anonymous>[1]’
[-Werror=array-bounds=]
   21 |         struct bpf_map *inner_map = map->inner_map_meta;
      |                                        ^~
progs/jeq_infer_not_null_fail.c:14:3: note: object ‘m_hash’ of size 32
   14 | } m_hash SEC(".maps");

This example defines m_hash in the context of the compilation unit and
casts it to struct bpf_map which is much smaller than the size of struct
bpf_map. It errors out in GCC when it attempts to access an element that
would be defined in struct bpf_map outsize of the defined limits for
m_hash.
This patch disables the warning through a GCC pragma.

This changes were tested in bpf-next master selftests without any
regressions.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: jose.marchesi@oracle.com
Cc: david.faust@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Link: https://lore.kernel.org/r/20240510183850.286661-2-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:25:14 -07:00
David Faust
792a04bed4 bpf: avoid gcc overflow warning in test_xdp_vlan.c
This patch fixes an integer overflow warning raised by GCC in
xdp_prognum1 of progs/test_xdp_vlan.c:

  GCC-BPF  [test_maps] test_xdp_vlan.bpf.o
progs/test_xdp_vlan.c: In function 'xdp_prognum1':
progs/test_xdp_vlan.c:163:25: error: integer overflow in expression
 '(short int)(((__builtin_constant_p((int)vlan_hdr->h_vlan_TCI)) != 0
   ? (int)(short unsigned int)((short int)((int)vlan_hdr->h_vlan_TCI
   << 8 >> 8) << 8 | (short int)((int)vlan_hdr->h_vlan_TCI << 0 >> 8
   << 0)) & 61440 : (int)__builtin_bswap16(vlan_hdr->h_vlan_TCI)
   & 61440) << 8 >> 8) << 8' of type 'short int' results in '0' [-Werror=overflow]
  163 |                         bpf_htons((bpf_ntohs(vlan_hdr->h_vlan_TCI) & 0xf000)
      |                         ^~~~~~~~~

The problem lies with the expansion of the bpf_htons macro and the
expression passed into it.  The bpf_htons macro (and similarly the
bpf_ntohs macro) expand to a ternary operation using either
__builtin_bswap16 or ___bpf_swab16 to swap the bytes, depending on
whether the expression is constant.

For an expression, with 'value' as a u16, like:

  bpf_htons (value & 0xf000)

The entire (value & 0xf000) is 'x' in the expansion of ___bpf_swab16
and we get as one part of the expanded swab16:

  ((__u16)(value & 0xf000) << 8 >> 8 << 8

This will always evaluate to 0, which is intentional since this
subexpression deals with the byte guaranteed to be 0 by the mask.

However, GCC warns because the precise reason this always evaluates to 0
is an overflow.  Specifically, the plain 0xf000 in the expression is a
signed 32-bit integer, which causes 'value' to also be promoted to a
signed 32-bit integer, and the combination of the 8-bit left shift and
down-cast back to __u16 results in a signed overflow (really a 'warning:
overflow in conversion from int to __u16' which is propegated up through
the rest of the expression leading to the ultimate overflow warning
above), which is a valid warning despite being the intended result of
this code.

Clang does not warn on this case, likely because it performs constant
folding later in the compilation process relative to GCC.  It seems that
by the time clang does constant folding for this expression, the side of
the ternary with this overflow has already been discarded.

Fortunately, this warning is easily silenced by simply making the 0xf000
mask explicitly unsigned.  This has no impact on the result.

Signed-off-by: David Faust <david.faust@oracle.com>
Cc: jose.marchesi@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240508193512.152759-1-david.faust@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:19:32 -07:00
Tushar Vyavahare
bbe91a9f68 tools: remove redundant ethtool.h from tooling infra
Remove the redundant ethtool.h header file from tools/include/uapi/linux.
The file is unnecessary as the system uses the kernel's
include/uapi/linux/ethtool.h directly.

Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240508104123.434769-1-tushar.vyavahare@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:17:55 -07:00
Alexei Starovoitov
e9dd2290f1 Merge branch 'retire-progs-test_sock_addr'
Jordan Rife says:

====================
Retire progs/test_sock_addr.c

This patch series migrates remaining tests from bpf/test_sock_addr.c to
prog_tests/sock_addr.c and progs/verifier_sock_addr.c in order to fully
retire the old-style test program and expands test coverage to test
previously untested scenarios related to sockaddr hooks.

This is a continuation of the work started recently during the expansion
of prog_tests/sock_addr.c.

Link: https://lore.kernel.org/bpf/20240429214529.2644801-1-jrife@google.com/T/#u

=======
Patches
=======
* Patch 1 moves tests that check valid return values for recvmsg hooks
  into progs/verifier_sock_addr.c, a new addition to the verifier test
  suite.
* Patches 2-5 lay the groundwork for test migration, enabling
  prog_tests/sock_addr.c to handle more test dimensions.
* Patches 6-11 move existing tests to prog_tests/sock_addr.c.
* Patch 12 removes some redundant test cases.
* Patches 14-17 expand on existing test coverage.
====================

Link: https://lore.kernel.org/r/20240510190246.3247730-1-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:53 -07:00
Jordan Rife
a3d3eb957d selftests/bpf: Expand ATTACH_REJECT tests
This expands coverage for ATTACH_REJECT tests to include connect_unix,
sendmsg_unix, recvmsg*, getsockname*, and getpeername*.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-18-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
bc467e953e selftests/bpf: Expand getsockname and getpeername tests
This expands coverage for getsockname and getpeername hooks to include
getsockname4, getsockname6, getpeername4, and getpeername6.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-17-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
dfb7539b47 sefltests/bpf: Expand sockaddr hook deny tests
This patch expands test coverage for EPERM tests to include connect and
bind calls and rounds out the coverage for sendmsg by adding tests for
sendmsg_unix.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-16-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
1e0a8367c8 selftests/bpf: Expand sockaddr program return value tests
This patch expands verifier coverage for program return values to cover
bind, connect, sendmsg, getsockname, and getpeername hooks. It also
rounds out the recvmsg coverage by adding test cases for recvmsg_unix
hooks.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-15-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
61ecfdfce2 selftests/bpf: Retire test_sock_addr.(c|sh)
Fully remove test_sock_addr.c and test_sock_addr.sh, as test coverage
has been fully moved to prog_tests/sock_addr.c.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-14-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
9c3f17862f selftests/bpf: Remove redundant sendmsg test cases
Remove these test cases completely, as the same behavior is already
covered by other sendmsg* test cases in prog_tests/sock_addr.c. This
just rewrites the destination address similar to sendmsg_v4_prog and
sendmsg_v6_prog.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-13-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
cded71f595 selftests/bpf: Migrate ATTACH_REJECT test cases
Migrate test case from bpf/test_sock_addr.c ensuring that program
attachment fails when using an inappropriate attach type.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-12-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
b0f3af0bff selftests/bpf: Migrate expected_attach_type tests
Migrates tests from progs/test_sock_addr.c ensuring that programs fail
to load when the expected attach type does not match.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-11-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:42 -07:00
Jordan Rife
8eaf8056a4 selftests/bpf: Migrate wildcard destination rewrite test
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg
respects when sendmsg6 hooks rewrite the destination IP with the IPv6
wildcard IP, [::].

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-10-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:41 -07:00
Jordan Rife
54462e8452 selftests/bpf: Migrate sendmsg6 v4 mapped address tests
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg
returns -ENOTSUPP when sending to an IPv4-mapped IPv6 address to
prog_tests/sock_addr.c.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-9-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:41 -07:00
Jordan Rife
f46a10483b selftests/bpf: Migrate sendmsg deny test cases
This set of tests checks that sendmsg calls are rejected (return -EPERM)
when the sendmsg* hook returns 0. Replace those in bpf/test_sock_addr.c
with corresponding tests in prog_tests/sock_addr.c.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-8-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:41 -07:00
Jordan Rife
d1b24fcf1c selftests/bpf: Migrate WILDCARD_IP test
Move wildcard IP sendmsg test case out of bpf/test_sock_addr.c into
prog_tests/sock_addr.c.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-7-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:41 -07:00
Jordan Rife
a2618c0d85 selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases
In preparation to move test cases from bpf/test_sock_addr.c that expect
system calls to return ENOTSUPP or EPERM, this patch propagates errno
from relevant system calls up to test_sock_addr() where the result can
be checked.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-6-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:41 -07:00
Jordan Rife
5a047b2226 selftests/bpf: Handle ATTACH_REJECT test cases
In preparation to move test cases from bpf/test_sock_addr.c that expect
ATTACH_REJECT, this patch adds BPF_SKEL_FUNCS_RAW to generate load and
destroy functions that use bpf_prog_attach() to control the attach_type.

The normal load functions use bpf_program__attach_cgroup which does not
have the same degree of control over the attach type, as
bpf_program_attach_fd() calls bpf_link_create() with the attach type
extracted from prog using bpf_program__expected_attach_type(). It is
currently not possible to modify the attach type before
bpf_program__attach_cgroup() is called, since
bpf_program__set_expected_attach_type() has no effect after the program
is loaded.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-5-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:41 -07:00
Jordan Rife
5eff48f33f selftests/bpf: Handle LOAD_REJECT test cases
In preparation to move test cases from bpf/test_sock_addr.c that expect
LOAD_REJECT, this patch adds expected_attach_type and extends load_fn to
accept an expected attach type and a flag indicating whether or not
rejection is expected.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-4-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:41 -07:00
Jordan Rife
86b65c6db0 selftests/bpf: Use program name for skel load/destroy functions
In preparation to migrate tests from bpf/test_sock_addr.c to
sock_addr.c, update BPF_SKEL_FUNCS so that it generates functions
based on prog_name instead of skel_name. This allows us to differentiate
between programs in the same skeleton.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-3-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:40 -07:00
Jordan Rife
73964e9085 selftests/bpf: Migrate recvmsg* return code tests to verifier_sock_addr.c
This set of tests check that the BPF verifier rejects programs with
invalid return codes (recvmsg4 and recvmsg6 hooks can only return 1).
This patch replaces the tests in test_sock_addr.c with
verifier_sock_addr.c, a new verifier prog_tests for sockaddr hooks, in a
step towards fully retiring test_sock_addr.c.

Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-2-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:10:40 -07:00
Puranjay Mohan
20a759df3b riscv, bpf: make some atomic operations fully ordered
The BPF atomic operations with the BPF_FETCH modifier along with
BPF_XCHG and BPF_CMPXCHG are fully ordered but the RISC-V JIT implements
all atomic operations except BPF_CMPXCHG with relaxed ordering.

Section 8.1 of the "The RISC-V Instruction Set Manual Volume I:
Unprivileged ISA" [1], titled, "Specifying Ordering of Atomic
Instructions" says:

| To provide more efficient support for release consistency [5], each
| atomic instruction has two bits, aq and rl, used to specify additional
| memory ordering constraints as viewed by other RISC-V harts.

and

| If only the aq bit is set, the atomic memory operation is treated as
| an acquire access.
| If only the rl bit is set, the atomic memory operation is treated as a
| release access.
|
| If both the aq and rl bits are set, the atomic memory operation is
| sequentially consistent.

Fix this by setting both aq and rl bits as 1 for operations with
BPF_FETCH and BPF_XCHG.

[1] https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf

Fixes: dd642ccb45 ("riscv, bpf: Implement more atomic operations for RV64")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240505201633.123115-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:00:06 -07:00
Xiao Wang
80c5a07ae6 riscv, bpf: Fix typo in comment
We can use either "instruction" or "insn" in the comment.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240507111618.437121-1-xiao.w.wang@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:56:43 -07:00
Ilya Leoshkevich
68378982f0 s390/bpf: Emit a barrier for BPF_FETCH instructions
BPF_ATOMIC_OP() macro documentation states that "BPF_ADD | BPF_FETCH"
should be the same as atomic_fetch_add(), which is currently not the
case on s390x: the serialization instruction "bcr 14,0" is missing.
This applies to "and", "or" and "xor" variants too.

s390x is allowed to reorder stores with subsequent fetches from
different addresses, so code relying on BPF_FETCH acting as a barrier,
for example:

  stw [%r0], 1
  afadd [%r1], %r2
  ldxw %r3, [%r4]

may be broken. Fix it by emitting "bcr 14,0".

Note that a separate serialization instruction is not needed for
BPF_XCHG and BPF_CMPXCHG, because COMPARE AND SWAP performs
serialization itself.

Fixes: ba3b86b9ce ("s390/bpf: Implement new atomic ops")
Reported-by: Puranjay Mohan <puranjay12@gmail.com>
Closes: https://lore.kernel.org/bpf/mb61p34qvq3wf.fsf@kernel.org/
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20240507000557.12048-1-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:55:57 -07:00
Alexei Starovoitov
55302bc1ca Merge branch 'bpf-inline-helpers-in-arm64-and-riscv-jits'
Puranjay Mohan says:

====================
bpf: Inline helpers in arm64 and riscv JITs

Changes in v5 -> v6:
arm64 v5: https://lore.kernel.org/all/20240430234739.79185-1-puranjay@kernel.org/
riscv v2: https://lore.kernel.org/all/20240430175834.33152-1-puranjay@kernel.org/
- Combine riscv and arm64 changes in single series
- Some coding style fixes

Changes in v4 -> v5:
v4: https://lore.kernel.org/all/20240429131647.50165-1-puranjay@kernel.org/
- Implement the inlining of the bpf_get_smp_processor_id() in the JIT.

NOTE: This needs to be based on:
https://lore.kernel.org/all/20240430175834.33152-1-puranjay@kernel.org/
to be built.

Manual run of bpf-ci with this series rebased on above:
https://github.com/kernel-patches/bpf/pull/6929

Changes in v3 -> v4:
v3: https://lore.kernel.org/all/20240426121349.97651-1-puranjay@kernel.org/
- Fix coding style issue related to C89 standards.

Changes in v2 -> v3:
v2: https://lore.kernel.org/all/20240424173550.16359-1-puranjay@kernel.org/
- Fixed the xlated dump of percpu mov to "r0 = &(void __percpu *)(r0)"
- Made ARM64 and x86-64 use the same code for inlining. The only difference
  that remains is the per-cpu address of the cpu_number.

Changes in v1 -> v2:
v1: https://lore.kernel.org/all/20240405091707.66675-1-puranjay12@gmail.com/
- Add a patch to inline bpf_get_smp_processor_id()
- Fix an issue in MRS instruction encoding as pointed out by Will
- Remove CONFIG_SMP check because arm64 kernel always compiles with CONFIG_SMP

This series adds the support of internal only per-CPU instructions and inlines
the bpf_get_smp_processor_id() helper call for ARM64 and RISC-V BPF JITs.

Here is an example of calls to bpf_get_smp_processor_id() and
percpu_array_map_lookup_elem() before and after this series on ARM64.

                                         BPF
                                        =====
              BEFORE                                       AFTER
             --------                                     -------

int cpu = bpf_get_smp_processor_id();           int cpu = bpf_get_smp_processor_id();
(85) call bpf_get_smp_processor_id#229032       (85) call bpf_get_smp_processor_id#8

p = bpf_map_lookup_elem(map, &zero);            p = bpf_map_lookup_elem(map, &zero);
(18) r1 = map[id:78]                            (18) r1 = map[id:153]
(18) r2 = map[id:82][0]+65536                   (18) r2 = map[id:157][0]+65536
(85) call percpu_array_map_lookup_elem#313512   (07) r1 += 496
                                                (61) r0 = *(u32 *)(r2 +0)
                                                (35) if r0 >= 0x1 goto pc+5
                                                (67) r0 <<= 3
                                                (0f) r0 += r1
                                                (79) r0 = *(u64 *)(r0 +0)
                                                (bf) r0 = &(void __percpu *)(r0)
                                                (05) goto pc+1
                                                (b7) r0 = 0

                                      ARM64 JIT
                                     ===========

              BEFORE                                       AFTER
             --------                                     -------

int cpu = bpf_get_smp_processor_id();           int cpu = bpf_get_smp_processor_id();
mov     x10, #0xfffffffffffff4d0                mrs     x10, sp_el0
movk    x10, #0x802b, lsl #16                   ldr     w7, [x10, #24]
movk    x10, #0x8000, lsl #32
blr     x10
add     x7, x0, #0x0

p = bpf_map_lookup_elem(map, &zero);            p = bpf_map_lookup_elem(map, &zero);
mov     x0, #0xffff0003ffffffff                 mov     x0, #0xffff0003ffffffff
movk    x0, #0xce5c, lsl #16                    movk    x0, #0xe0f3, lsl #16
movk    x0, #0xca00                             movk    x0, #0x7c00
mov     x1, #0xffff8000ffffffff                 mov     x1, #0xffff8000ffffffff
movk    x1, #0x8bdb, lsl #16                    movk    x1, #0xb0c7, lsl #16
movk    x1, #0x6000                             movk    x1, #0xe000
mov     x10, #0xffffffffffff3ed0                add     x0, x0, #0x1f0
movk    x10, #0x802d, lsl #16                   ldr     w7, [x1]
movk    x10, #0x8000, lsl #32                   cmp     x7, #0x1
blr     x10                                     b.cs    0x0000000000000090
add     x7, x0, #0x0                            lsl     x7, x7, #3
                                                add     x7, x7, x0
                                                ldr     x7, [x7]
                                                mrs     x10, tpidr_el1
                                                add     x7, x7, x10
                                                b       0x0000000000000094
                                                mov     x7, #0x0

              Performance improvement found using benchmark[1]

./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc

  +---------------+-------------------+-------------------+--------------+
  |      Name     |      Before       |        After      |   % change   |
  |---------------+-------------------+-------------------+--------------|
  | glob-arr-inc  | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s |   + 10.74%   |
  | arr-inc       | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s |   + 5.37%    |
  | hash-inc      | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s |   + 2.08%    |
  +---------------+-------------------+-------------------+--------------+

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

             RISCV64 JIT output for `call bpf_get_smp_processor_id`
            =======================================================

                  Before                           After
                 --------                         -------

           auipc   t1,0x848c                  ld    a5,32(tp)
           jalr    604(t1)
           mv      a5,a0

  Benchmark using [1] on Qemu.

  ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc

  +---------------+------------------+------------------+--------------+
  |      Name     |     Before       |       After      |   % change   |
  |---------------+------------------+------------------+--------------|
  | glob-arr-inc  | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s |   + 24.04%   |
  | arr-inc       | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s |   + 23.56%   |
  | hash-inc      | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s |   + 32.18%   |
  +---------------+------------------+------------------+--------------+
====================

Link: https://lore.kernel.org/r/20240502151854.9810-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:35 -07:00
Puranjay Mohan
75fe4c0b3e bpf, arm64: inline bpf_get_smp_processor_id() helper
Inline calls to bpf_get_smp_processor_id() helper in the JIT by emitting
a read from struct thread_info. The SP_EL0 system register holds the
pointer to the task_struct and thread_info is the first member of this
struct. We can read the cpu number from the thread_info.

Here is how the ARM64 JITed assembly changes after this commit:

                                      ARM64 JIT
                                     ===========

              BEFORE                                    AFTER
             --------                                  -------

int cpu = bpf_get_smp_processor_id();        int cpu = bpf_get_smp_processor_id();

mov     x10, #0xfffffffffffff4d0             mrs     x10, sp_el0
movk    x10, #0x802b, lsl #16                ldr     w7, [x10, #24]
movk    x10, #0x8000, lsl #32
blr     x10
add     x7, x0, #0x0

               Performance improvement using benchmark[1]

./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc

+---------------+-------------------+-------------------+--------------+
|      Name     |      Before       |        After      |   % change   |
|---------------+-------------------+-------------------+--------------|
| glob-arr-inc  | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s |   + 10.74%   |
| arr-inc       | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s |   + 5.37%    |
| hash-inc      | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s |   + 2.08%    |
+---------------+-------------------+-------------------+--------------+

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-5-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:34 -07:00
Puranjay Mohan
7a4c32222b arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrs
Support an instruction for resolving absolute addresses of per-CPU
data from their per-CPU offsets. This instruction is internal-only and
users are not allowed to use them directly. They will only be used for
internal inlining optimizations for now between BPF verifier and BPF
JITs.

Since commit 7158627686 ("arm64: percpu: implement optimised pcpu
access using tpidr_el1"), the per-cpu offset for the CPU is stored in
the tpidr_el1/2 register of that CPU.

To support this BPF instruction in the ARM64 JIT, the following ARM64
instructions are emitted:

mov dst, src		// Move src to dst, if src != dst
mrs tmp, tpidr_el1/2	// Move per-cpu offset of the current cpu in tmp.
add dst, dst, tmp	// Add the per cpu offset to the dst.

To measure the performance improvement provided by this change, the
benchmark in [1] was used:

Before:
glob-arr-inc   :   23.597 ± 0.012M/s
arr-inc        :   23.173 ± 0.019M/s
hash-inc       :   12.186 ± 0.028M/s

After:
glob-arr-inc   :   23.819 ± 0.034M/s
arr-inc        :   23.285 ± 0.017M/s
hash-inc       :   12.419 ± 0.011M/s

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-4-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:34 -07:00
Puranjay Mohan
2ddec2c80b riscv, bpf: inline bpf_get_smp_processor_id()
Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit.

RISCV saves the pointer to the CPU's task_struct in the TP (thread
pointer) register. This makes it trivial to get the CPU's processor id.
As thread_info is the first member of task_struct, we can read the
processor id from TP + offsetof(struct thread_info, cpu).

          RISCV64 JIT output for `call bpf_get_smp_processor_id`
	  ======================================================

                Before                           After
               --------                         -------

         auipc   t1,0x848c                  ld    a5,32(tp)
         jalr    604(t1)
         mv      a5,a0

Benchmark using [1] on Qemu.

./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc

+---------------+------------------+------------------+--------------+
|      Name     |     Before       |       After      |   % change   |
|---------------+------------------+------------------+--------------|
| glob-arr-inc  | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s |   + 24.04%   |
| arr-inc       | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s |   + 23.56%   |
| hash-inc      | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s |   + 32.18%   |
+---------------+------------------+------------------+--------------+

NOTE: This benchmark includes changes from this patch and the previous
      patch that implemented the per-cpu insn.

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:34 -07:00
Puranjay Mohan
19c56d4e5b riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrs
Support an instruction for resolving absolute addresses of per-CPU
data from their per-CPU offsets. This instruction is internal-only and
users are not allowed to use them directly. They will only be used for
internal inlining optimizations for now between BPF verifier and BPF
JITs.

RISC-V uses generic per-cpu implementation where the offsets for CPUs
are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores
the address of the task_struct in TP register. The first element in
task_struct is struct thread_info, and we can get the cpu number by
reading from the TP register + offsetof(struct thread_info, cpu).

Once we have the cpu number in a register we read the offset for that
cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this
offset to the destination register.

To measure the improvement from this change, the benchmark in [1] was
used on Qemu:

Before:
glob-arr-inc   :    1.127 ± 0.013M/s
arr-inc        :    1.121 ± 0.004M/s
hash-inc       :    0.681 ± 0.052M/s

After:
glob-arr-inc   :    1.138 ± 0.011M/s
arr-inc        :    1.366 ± 0.006M/s
hash-inc       :    0.676 ± 0.001M/s

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:34 -07:00
Shahab Vahedi
f122668ddc ARC: Add eBPF JIT support
This will add eBPF JIT support to the 32-bit ARCv2 processors. The
implementation is qualified by running the BPF tests on a Synopsys HSDK
board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU.

The test_bpf.ko reports 2-10 fold improvements in execution time of its
tests. For instance:

test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS
test_bpf: #33 tcpdump port 22 jited:1 120  224  260 PASS

test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS
test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1  23 PASS

test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS
test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS

Deployment and structure
------------------------
The related codes are added to "arch/arc/net":

- bpf_jit.h       -- The interface that a back-end translator must provide
- bpf_jit_core.c  -- Knows how to handle the input eBPF byte stream
- bpf_jit_arcv2.c -- The back-end code that knows the translation logic

The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance
to the whole process. Normally, the translation is done in one pass,
namely the "normal pass". In case some relocations are not known during
this pass, some data (arc_jit_data) is allocated for the next pass to
come. This possible next (and last) pass is called the "extra pass".

1. Normal pass       # The necessary pass
     1a. Dry run       # Get the whole JIT length, epilogue offset, etc.
     1b. Emit phase    # Allocate memory and start emitting instructions
2. Extra pass        # Only needed if there are relocations to be fixed
     2a. Patch relocations

Support status
--------------
The JIT compiler supports BPF instructions up to "cpu=v4". However, it
does not yet provide support for:

- Tail calls
- Atomic operations
- 64-bit division/remainder
- BPF_PROBE_MEM* (exception table)

The result of "test_bpf" test suite on an HSDK board is:

hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf

  test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]

All the failing test cases are due to the ones that were not JIT'ed.
Categorically, they can be represented as:

  .-----------.------------.-------------.
  | test type |   opcodes  | # of cases  |
  |-----------+------------+-------------|
  | atomic    | 0xC3, 0xDB |         149 |
  | div64     | 0x37, 0x3F |          22 |
  | mod64     | 0x97, 0x9F |          15 |
  `-----------^------------+-------------|
                           | (total) 186 |
                           `-------------'

Setup: build config
-------------------
The following configs must be set to have a working JIT test:

  CONFIG_BPF_JIT=y
  CONFIG_BPF_JIT_ALWAYS_ON=y
  CONFIG_TEST_BPF=m

The following options are not necessary for the tests module,
but are good to have:

  CONFIG_DEBUG_INFO=y             # prerequisite for below
  CONFIG_DEBUG_INFO_BTF=y         # so bpftool can generate vmlinux.h

  CONFIG_FTRACE=y                 #
  CONFIG_BPF_SYSCALL=y            # all these options lead to
  CONFIG_KPROBE_EVENTS=y          # having CONFIG_BPF_EVENTS=y
  CONFIG_PERF_EVENTS=y            #

Some BPF programs provide data through /sys/kernel/debug:
  CONFIG_DEBUG_FS=y
arc# mount -t debugfs debugfs /sys/kernel/debug

Setup: elfutils
---------------
The libdw.{so,a} library that is used by pahole for processing
the final binary must come from elfutils 0.189 or newer. The
support for ARCv2 [1] has been added since that version.

[1]
https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7

Setup: pahole
-------------
The line below in linux/scripts/Makefile.btf must be commented out:

pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats

Or else, the build will fail:

$ make V=1
  ...
  BTF     .btf.vmlinux.bin.o
pahole -J --btf_gen_floats                    \
       -j --lang_exclude=rust                 \
       --skip_encoding_btf_inconsistent_proto \
       --btf_gen_optimized .tmp_vmlinux.btf
Complex, interval and imaginary float types are not supported
Encountered error while encoding BTF.
  ...
  BTFIDS  vmlinux
./tools/bpf/resolve_btfids/resolve_btfids vmlinux
libbpf: failed to find '.BTF' ELF section in vmlinux
FAILED: load BTF from vmlinux: No data available

This is due to the fact that the ARC toolchains generate
"complex float" DIE entries in libgcc and at the moment, pahole
can't handle such entries.

Running the tests
-----------------
host$ scp /bld/linux/lib/test_bpf.ko arc:
arc # sysctl net.core.bpf_jit_enable=1
arc # insmod test_bpf.ko test_suite=test_bpf
      ...
      test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS
      test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]

Acknowledgments
---------------
- Claudiu Zissulescu for his unwavering support
- Yuriy Kolerov for testing and troubleshooting
- Vladimir Isaev for the pahole workaround
- Sergey Matyukevich for paving the road by adding the interpreter support

Signed-off-by: Shahab Vahedi <shahab@synopsys.com>
Link: https://lore.kernel.org/r/20240430145604.38592-1-list+bpf@vahedi.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:51:36 -07:00
Alan Maguire
fcd1ed89a0 kbuild,bpf: Switch to using --btf_features for pahole v1.26 and later
The btf_features list can be used for pahole v1.26 and later -
it is useful because if a feature is not yet implemented it will
not exit with a failure message.  This will allow us to add feature
requests to the pahole options without having to check pahole versions
in future; if the version of pahole supports the feature it will be
added.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240507135514.490467-1-alan.maguire@oracle.com
2024-05-09 14:44:35 -07:00
Martin KaFai Lau
0d03a4d24b Merge branch 'use network helpers, part 4'
Geliang Tang says:

====================
From: Geliang Tang <tanggeliang@kylinos.cn>

This patchset adds post_socket_cb pointer into
struct network_helper_opts to make start_server_addr() helper
more flexible. With these modifications, many duplicate codes
can be dropped.

Patches 1-3 address Martin's comments in the previous series.
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-09 13:40:39 -07:00
Geliang Tang
7abbf38cd8 selftests/bpf: Drop get_port in test_tcp_check_syncookie
The arguments "addr" and "len" of run_test() have dropped. This makes
function get_port() useless. Drop it from test_tcp_check_syncookie_user.c.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/a9b5c8064ab4cbf0f68886fe0e4706428b8d0d47.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-09 13:40:38 -07:00
Geliang Tang
65a3f0df44 selftests/bpf: Use connect_to_fd in test_tcp_check_syncookie
This patch uses public helper connect_to_fd() exported in network_helpers.h
instead of the local defined function connect_to_server() in
test_tcp_check_syncookie_user.c. This can avoid duplicate code.

Then the arguments "addr" and "len" of run_test() become useless, drop them
too.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/e0ae6b790ac0abc7193aadfb2660c8c9eb0fe1f0.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-09 13:40:38 -07:00
Geliang Tang
5059c73eca selftests/bpf: Use connect_to_fd in sockopt_inherit
This patch uses public helper connect_to_fd() exported in network_helpers.h
instead of the local defined function connect_to_server() in
prog_tests/sockopt_inherit.c. This can avoid duplicate code.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/71db79127cc160b0643fd9a12c70ae019ae076a1.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-09 13:40:38 -07:00
Geliang Tang
49e1fa8dbd selftests/bpf: Use start_server_addr in test_tcp_check_syncookie
Include network_helpers.h in test_tcp_check_syncookie_user.c, use
public helper start_server_addr() in it instead of the local defined
function start_server(). This can avoid duplicate code.

Add two helpers v6only_true() and v6only_false() to set IPV6_V6ONLY
sockopt to true or false, set them to post_socket_cb pointer of struct
network_helper_opts, and pass it to start_server_setsockopt().

In order to use functions defined in network_helpers.c, Makefile needs
to be updated too.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/e0c5324f5da84f453f47543536e70f126eaa8678.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-09 13:40:38 -07:00
Geliang Tang
5166b3e3e3 selftests/bpf: Use start_server_addr in sockopt_inherit
Include network_helpers.h in prog_tests/sockopt_inherit.c, use public
helper start_server_addr() instead of the local defined function
start_server(). This can avoid duplicate code.

Add a helper custom_cb() to set SOL_CUSTOM sockopt looply, set it to
post_socket_cb pointer of struct network_helper_opts, and pass it to
start_server_addr().

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/687af66f743a0bf15cdba372c5f71fe64863219e.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-09 13:40:38 -07:00
Geliang Tang
20434d2d89 selftests/bpf: Add post_socket_cb for network_helper_opts
__start_server() sets SO_REUSPORT through setsockopt() when the parameter
'reuseport' is set. This patch makes it more flexible by adding a function
pointer post_socket_cb into struct network_helper_opts. The
'const struct post_socket_opts *cb_opts' args in the post_socket_cb is
for the future extension.

The 'reuseport' parameter can be dropped.
Now the original start_reuseport_server() can be implemented by setting a
newly defined reuseport_cb() function pointer to post_socket_cb filed of
struct network_helper_opts.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/470cb82f209f055fc7fb39c66c6b090b5b7ed2b2.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-09 13:40:29 -07:00
Alexei Starovoitov
cbe35adf69 Merge branch 'selftests-bpf-retire-bpf_tcp_helpers-h'
Martin KaFai Lau says:

====================
selftests/bpf: Retire bpf_tcp_helpers.h

From: Martin KaFai Lau <martin.lau@kernel.org>

The earlier commit 8e6d9ae2e0 ("selftests/bpf: Use bpf_tracing.h instead of bpf_tcp_helpers.h")
removed the bpf_tcp_helpers.h usages from the non networking tests.

This patch set is a continuation of this effort to retire
the bpf_tcp_helpers.h from the networking tests (mostly tcp-cc related).

The main usage of the bpf_tcp_helpers.h is the partial kernel
socket definitions (e.g. sock, tcp_sock). New fields are kept adding
back to those partial socket definitions while everything is available
in the vmlinux.h. The recent bpf_cc_cubic.c test tried to extend
bpf_tcp_helpers.c but eventually used the vmlinux.h instead. To avoid
this unnecessary detour for new tests and have one consistent way
of using the kernel sockets, this patch set retires the bpf_tcp_helpers.h
usages and consolidates the tests to use vmlinux.h instead.
====================

Link: https://lore.kernel.org/r/20240509175026.3423614-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:13 -07:00
Martin KaFai Lau
6a650816b0 selftests/bpf: Retire bpf_tcp_helpers.h
The previous patches have consolidated the tests to use
bpf_tracing_net.h (i.e. vmlinux.h) instead of bpf_tcp_helpers.h.

This patch can finally retire the bpf_tcp_helpers.h from
the repository.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-11-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:12 -07:00
Martin KaFai Lau
c075c9c4af selftests/bpf: Remove the bpf_tcp_helpers.h usages from other non tcp-cc tests
The patch removes the remaining bpf_tcp_helpers.h usages in the
non tcp-cc networking tests. It either replaces it with bpf_tracing_net.h
or just removed it because the test is not actually using any
kernel sockets. For the later, the missing macro (mainly SOL_TCP) is
defined locally.

An exception is the test_sock_fields which is testing
the "struct bpf_sock" type instead of the kernel sock type.
Whenever "vmlinux.h" is used instead, it hits a verifier
error on doing arithmetic on the sock_common pointer:

; return !a6[0] && !a6[1] && !a6[2] && a6[3] == bpf_htonl(1); @ test_sock_fields.c:54
21: (61) r2 = *(u32 *)(r1 +28)        ; R1_w=sock_common() R2_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
22: (56) if w2 != 0x0 goto pc-6       ; R2_w=0
23: (b7) r3 = 28                      ; R3_w=28
24: (bf) r2 = r1                      ; R1_w=sock_common() R2_w=sock_common()
25: (0f) r2 += r3
R2 pointer arithmetic on sock_common prohibited

Hence, instead of including bpf_tracing_net.h, the test_sock_fields test
defines a tcp_sock with one lsndtime field in it.

Another highlight is, in sockopt_qos_to_cc.c, the tcp_cc_eq()
is replaced by bpf_strncmp(). tcp_cc_eq() was a workaround
in bpf_tcp_helpers.h before bpf_strncmp had been added.

The SOL_IPV6 addition to bpf_tracing_net.h is needed by the
test_tcpbpf_kern test.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-10-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:12 -07:00
Martin KaFai Lau
6eee55aa76 selftests/bpf: Remove bpf_tcp_helpers.h usages from other misc bpf tcp-cc tests
This patch removed the final few bpf_tcp_helpers.h usages
in some misc bpf tcp-cc tests and replace it with
bpf_tracing_net.h (i.e. vmlinux.h)

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-9-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:12 -07:00
Martin KaFai Lau
6ad4e6e946 selftests/bpf: Use bpf_tracing_net.h in bpf_dctcp
This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_dctcp.
This will allow to retire the bpf_tcp_helpers.h and consolidate
tcp-cc tests to vmlinux.h.

It will have a dup on min/max macros with the bpf_cubic. It could
be further refactored in the future.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-8-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:12 -07:00
Martin KaFai Lau
a824c9a8a4 selftests/bpf: Use bpf_tracing_net.h in bpf_cubic
This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_cubic.
This will allow to retire the bpf_tcp_helpers.h and consolidate
tcp-cc tests to vmlinux.h.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-7-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:12 -07:00
Martin KaFai Lau
b1d87ae9b0 selftests/bpf: Rename tcp-cc private struct in bpf_cubic and bpf_dctcp
The "struct bictcp" and "struct dctcp" are private to the bpf prog
and they are stored in the private buffer in inet_csk(sk)->icsk_ca_priv.
Hence, there is no bpf CO-RE required.

The same struct name exists in the vmlinux.h. To reuse vmlinux.h,
they need to be renamed such that the bpf prog logic will be
immuned from the kernel tcp-cc changes.

This patch adds a "bpf_" prefix to them.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-6-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:12 -07:00
Martin KaFai Lau
7d3851a318 selftests/bpf: Sanitize the SEC and inline usages in the bpf-tcp-cc tests
It is needed to remove the BPF_STRUCT_OPS usages from the tcp-cc tests
because it is defined in bpf_tcp_helpers.h which is going to be retired.
While at it, this patch consolidates all tcp-cc struct_ops programs to
use the SEC("struct_ops") + BPF_PROG().

It also removes the unnecessary __always_inline usages from the
tcp-cc tests.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-5-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:11 -07:00
Martin KaFai Lau
cc5b18ce17 selftests/bpf: Reuse the tcp_sk() from the bpf_tracing_net.h
This patch removes the individual tcp_sk implementations from the
tcp-cc tests. The tcp_sk() implementation from the bpf_tracing_net.h
is reused instead.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-4-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09 11:13:11 -07:00