linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-13 17:39:23 -04:00

Author	SHA1	Message	Date
Alexei Starovoitov	db22b1382b	Merge branch 'replace-config_dmabuf_sysfs_stats-with-bpf' T.J. Mercier says: ==================== Replace CONFIG_DMABUF_SYSFS_STATS with BPF Until CONFIG_DMABUF_SYSFS_STATS was added [1] it was only possible to perform per-buffer accounting with debugfs which is not suitable for production environments. Eventually we discovered the overhead with per-buffer sysfs file creation/removal was significantly impacting allocation and free times, and exacerbated kernfs lock contention. [2] dma_buf_stats_setup() is responsible for 39% of single-page buffer creation duration, or 74% of single-page dma_buf_export() duration when stressing dmabuf allocations and frees. I prototyped a change from per-buffer to per-exporter statistics with a RCU protected list of exporter allocations that accommodates most (but not all) of our use-cases and avoids almost all of the sysfs overhead. While that adds less overhead than per-buffer sysfs, and less even than the maintenance of the dmabuf debugfs_list, it's still additional overhead on top of the debugfs_list and doesn't give us per-buffer info. This series uses the existing dmabuf debugfs_list to implement a BPF dmabuf iterator, which adds no overhead to buffer allocation/free and provides per-buffer info. The list has been moved outside of CONFIG_DEBUG_FS scope so that it is always populated. The BPF program loaded by userspace that extracts per-buffer information gets to define its own interface which avoids the lack of ABI stability with debugfs. This will allow us to replace our use of CONFIG_DMABUF_SYSFS_STATS, and the plan is to remove it from the kernel after the next longterm stable release. 
[1] https://lore.kernel.org/linux-media/20201210044400.1080308-1-hridya@google.com [2] https://lore.kernel.org/all/20220516171315.2400578-1-tjmercier@google.com v1: https://lore.kernel.org/all/20250414225227.3642618-1-tjmercier@google.com v1 -> v2: Make the DMA buffer list independent of CONFIG_DEBUG_FS per Christian König Add CONFIG_DMA_SHARED_BUFFER check to kernel/bpf/Makefile per kernel test robot Use BTF_ID_LIST_SINGLE instead of BTF_ID_LIST_GLOBAL_SINGLE per Song Liu Fixup comment style, mixing code/declarations, and use ASSERT_OK_FD in selftest per Song Liu Add BPF_ITER_RESCHED feature to bpf_dmabuf_reg_info per Alexei Starovoitov Add open-coded iterator and selftest per Alexei Starovoitov Add a second test buffer from the system dmabuf heap to selftests Use the BPF program we'll use in production for selftest per Alexei Starovoitov https://r.android.com/c/platform/system/bpfprogs/+/3616123/2/dmabufIter.c https://r.android.com/c/platform/system/memory/libmeminfo/+/3614259/1/libdmabufinfo/dmabuf_bpf_stats.cpp v2: https://lore.kernel.org/all/20250504224149.1033867-1-tjmercier@google.com v2 -> v3: Rebase onto bpf-next/master Move get_next_dmabuf() into drivers/dma-buf/dma-buf.c, along with the new get_first_dmabuf(). This avoids having to expose the dmabuf list and mutex to the rest of the kernel, and keeps the dmabuf mutex operations near each other in the same file. 
(Christian König) Add Christian's RB to dma-buf: Rename debugfs symbols Drop RFC: dma-buf: Remove DMA-BUF statistics v3: https://lore.kernel.org/all/20250507001036.2278781-1-tjmercier@google.com v3 -> v4: Fix selftest BPF program comment style (not kdoc) per Alexei Starovoitov Fix dma-buf.c kdoc comment style per Alexei Starovoitov Rename get_first_dmabuf / get_next_dmabuf to dma_buf_iter_begin / dma_buf_iter_next per Christian König Add Christian's RB to bpf: Add dmabuf iterator v4: https://lore.kernel.org/all/20250508182025.2961555-1-tjmercier@google.com v4 -> v5: Add Christian's Acks to all patches Add Song Liu's Acks Move BTF_ID_LIST_SINGLE and DEFINE_BPF_ITER_FUNC closer to usage per Song Liu Fix open-coded iterator comment style per Song Liu Move iterator termination check to its own subtest per Song Liu Rework selftest buffer creation per Song Liu Fix spacing in sanitize_string per BPF CI v5: https://lore.kernel.org/all/20250512174036.266796-1-tjmercier@google.com v5 -> v6: Song Liu: Init test buffer FDs to -1 Zero-init udmabuf_create for future proofing Bail early for iterator fd/FILE creation failure Dereference char ptr to check for NUL in sanitize_string() Move map insertion from create_test_buffers() to test_dmabuf_iter() Add ACK to selftests/bpf: Add test for open coded dmabuf_iter v6: https://lore.kernel.org/all/20250513163601.812317-1-tjmercier@google.com v6 -> v7: Zero uninitialized name bytes following the end of name strings per s390x BPF CI Reorder sanitize_string bounds checks per Song Liu Add Song's Ack to: selftests/bpf: Add test for dmabuf_iter Rebase onto bpf-next/master per BPF CI ==================== Link: https://patch.msgid.link/20250522230429.941193-1-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-27 09:51:26 -07:00
T.J. Mercier	7594dcb71f	selftests/bpf: Add test for open coded dmabuf_iter Use the same test buffers as the traditional iterator and a new BPF map to verify the test buffers can be found with the open coded dmabuf iterator. Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250522230429.941193-6-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-27 09:51:26 -07:00
T.J. Mercier	ae5d2c59ec	selftests/bpf: Add test for dmabuf_iter This test creates a udmabuf, and a dmabuf from the system dmabuf heap, and uses a BPF program that prints dmabuf metadata with the new dmabuf_iter to verify they can be found. Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250522230429.941193-5-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-27 09:51:26 -07:00
T.J. Mercier	6eab7ac7c5	bpf: Add open coded dmabuf iterator This open coded iterator allows for more flexibility when creating BPF programs. It can support output in formats other than text. With an open coded iterator, a single BPF program can traverse multiple kernel data structures (now including dmabufs), allowing for more efficient analysis of kernel data compared to multiple reads from procfs, sysfs, or multiple traditional BPF iterator invocations. Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250522230429.941193-4-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-27 09:51:25 -07:00
T.J. Mercier	76ea955349	bpf: Add dmabuf iterator The dmabuf iterator traverses the list of all DMA buffers. DMA buffers are refcounted through their associated struct file. A reference is taken on each buffer as the list is iterated to ensure each buffer persists for the duration of the bpf program execution without holding the list mutex. Signed-off-by: T.J. Mercier <tjmercier@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250522230429.941193-3-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-27 09:51:25 -07:00
T.J. Mercier	89f9dba365	dma-buf: Rename debugfs symbols Rename the debugfs list and mutex so it's clear they are now usable without the need for CONFIG_DEBUG_FS. The list will always be populated to support the creation of a BPF iterator for dmabufs. Signed-off-by: T.J. Mercier <tjmercier@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250522230429.941193-2-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-27 09:51:25 -07:00
Mykyta Yatsenko	079e5c56a5	bpf: Fix error return value in bpf_copy_from_user_dynptr On error, copy_from_user() returns the number of bytes not copied to the destination, but the current implementation of copy_user_data_sleepable() does not handle that correctly and returns it as the error value, which may confuse a user expecting a meaningful negative error value. Fixes: `a498ee7576` ("bpf: Implement dynptr copy kfuncs") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250523181705.261585-1-mykyta.yatsenko5@gmail.com	2025-05-23 13:25:02 -07:00
Andrii Nakryiko	bfccacdf93	Merge branch 'allow-mmap-of-sys-kernel-btf-vmlinux' Lorenz Bauer says: ==================== Allow mmap of /sys/kernel/btf/vmlinux I'd like to cut down the memory usage of parsing vmlinux BTF in ebpf-go. With some upcoming changes the library is sitting at 5MiB for a parse. Most of that memory is simply copying the BTF blob into user space. By allowing vmlinux BTF to be mmapped read-only into user space I can cut memory usage by about 75%. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> --- Changes in v5: - Fix error return of btf_parse_raw_mmap (Andrii) - Link to v4: https://lore.kernel.org/r/20250510-vmlinux-mmap-v4-0-69e424b2a672@isovalent.com Changes in v4: - Go back to remap_pfn_range for aarch64 compat - Dropped btf_new_no_copy (Andrii) - Fixed nits in selftests (Andrii) - Clearer error handling in the mmap handler (Andrii) - Fixed build on s390 - Link to v3: https://lore.kernel.org/r/20250505-vmlinux-mmap-v3-0-5d53afa060e8@isovalent.com Changes in v3: - Remove slightly confusing calculation of trailing (Alexei) - Use vm_insert_page (Alexei) - Simplified libbpf code - Link to v2: https://lore.kernel.org/r/20250502-vmlinux-mmap-v2-0-95c271434519@isovalent.com Changes in v2: - Use btf__new in selftest - Avoid vm_iomap_memory in btf_vmlinux_mmap - Add VM_DONTDUMP - Add support to libbpf - Link to v1: https://lore.kernel.org/r/20250501-vmlinux-mmap-v1-0-aa2724572598@isovalent.com --- ==================== Link: https://patch.msgid.link/20250520-vmlinux-mmap-v5-0-e8c941acc414@isovalent.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2025-05-23 10:06:29 -07:00
Lorenz Bauer	3c0421c93c	libbpf: Use mmap to parse vmlinux BTF from sysfs Teach libbpf to use mmap when parsing vmlinux BTF from /sys. We don't apply this to fall-back paths on the regular file system because there is no way to ensure that modifications underlying the MAP_PRIVATE mapping are not visible to the process. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250520-vmlinux-mmap-v5-3-e8c941acc414@isovalent.com	2025-05-23 10:06:28 -07:00
Lorenz Bauer	828226b69f	selftests: bpf: Add a test for mmapable vmlinux BTF Add a basic test for the ability to mmap /sys/kernel/btf/vmlinux. Ensure that the data is valid BTF and that it is padded with zero. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20250520-vmlinux-mmap-v5-2-e8c941acc414@isovalent.com	2025-05-23 10:06:28 -07:00
Lorenz Bauer	a539e2a6d5	btf: Allow mmap of vmlinux btf User space needs access to kernel BTF for many modern features of BPF. Right now each process needs to read the BTF blob either in pieces or as a whole. Allow mmaping the sysfs file so that processes can directly access the memory allocated for it in the kernel. remap_pfn_range is used instead of vm_insert_page due to aarch64 compatibility issues. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Link: https://lore.kernel.org/bpf/20250520-vmlinux-mmap-v5-1-e8c941acc414@isovalent.com	2025-05-23 10:06:28 -07:00
Jiayuan Chen	8259eb0e06	bpf, sockmap: Avoid using sk_socket after free when sending The sk->sk_socket is not locked or referenced in the backlog thread, and during the call to skb_send_sock(), there is a race condition with the release of sk_socket. All types of sockets (tcp/udp/unix/vsock) are affected. Race conditions: ''' CPU0 CPU1 backlog::skb_send_sock sendmsg_unlocked sock_sendmsg sock_sendmsg_nosec close(fd): ... ops->release() -> sock_map_close() sk_socket->ops = NULL free(socket) sock->ops->sendmsg ^ panic here ''' The psock reference count becomes 0 after sock_map_close() executes. ''' void sock_map_close() { ... if (likely(psock)) { ... // !! here we remove psock and the ref of psock become 0 sock_map_remove_links(sk, psock) psock = sk_psock_get(sk); if (unlikely(!psock)) goto no_psock; <=== Control jumps here via goto ... cancel_delayed_work_sync(&psock->work); <=== not executed sk_psock_put(sk, psock); ... } ''' Based on the fact that we already wait for the workqueue to finish in sock_map_close() if psock is held, we simply increase the psock reference count to avoid race conditions. With this patch, if the backlog thread is running, sock_map_close() will wait for the backlog thread to complete and cancel all pending work. If no backlog thread is running, any pending work that hasn't started by then will fail when invoked by sk_psock_get(), as the psock reference count has been zeroed, and sk_psock_drop() will cancel all jobs via cancel_delayed_work_sync(). In summary, we require synchronization to coordinate the backlog thread and close() thread. The panic I caught: ''' Workqueue: events sk_psock_backlog RIP: 0010:sock_sendmsg+0x21d/0x440 RAX: 0000000000000000 RBX: ffffc9000521fad8 RCX: 0000000000000001 ... Call Trace: <TASK> ? die_addr+0x40/0xa0 ? exc_general_protection+0x14c/0x230 ? asm_exc_general_protection+0x26/0x30 ? sock_sendmsg+0x21d/0x440 ? sock_sendmsg+0x3e0/0x440 ? __pfx_sock_sendmsg+0x10/0x10 __skb_send_sock+0x543/0xb70 sk_psock_backlog+0x247/0xb80 ... 
''' Fixes: `4b4647add7` ("sock_map: avoid race between sock_map_close and sk_psock_put") Reported-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250516141713.291150-1-jiayuan.chen@linux.dev	2025-05-22 16:16:37 -07:00
Martin KaFai Lau	6888a036cf	Merge branch 'selftests-bpf-test-sockmap-sockhash-redirection' Michal Luczaj says: ==================== selftests/bpf: Test sockmap/sockhash redirection The idea behind this series is to comprehensively test the BPF redirection: BPF_MAP_TYPE_SOCKMAP, BPF_MAP_TYPE_SOCKHASH x sk_msg-to-egress, sk_msg-to-ingress, sk_skb-to-egress, sk_skb-to-ingress x AF_INET, SOCK_STREAM, AF_INET6, SOCK_STREAM, AF_INET, SOCK_DGRAM, AF_INET6, SOCK_DGRAM, AF_UNIX, SOCK_STREAM, AF_UNIX, SOCK_DGRAM, AF_VSOCK, SOCK_STREAM, AF_VSOCK, SOCK_SEQPACKET New module is introduced, sockmap_redir: all supported and unsupported redirect combinations are tested for success and failure respectively. Code is pretty much stolen/adapted from Jakub Sitnicki's sockmap_redir_matrix.c [1]. Usage: $ cd tools/testing/selftests/bpf $ make $ sudo ./test_progs -t sockmap_redir ... Summary: 1/576 PASSED, 0 SKIPPED, 0 FAILED [1]: https://github.com/jsitnicki/sockmap-redir-matrix/blob/main/sockmap_redir_matrix.c Changes in v3: - Drop unrelated changes; sockmap_listen, test_sockmap_listen, doc - Collect tags [Jakub, John] - Introduce BPF verdict programs especially for sockmap_redir [Jiayuan] - Link to v2: https://lore.kernel.org/r/20250411-selftests-sockmap-redir-v2-0-5f9b018d6704@rbox.co Changes in v2: - Verify that the unsupported redirect combos do fail [Jakub] - Dedup tests in sockmap_listen - Cosmetic changes and code reordering - Link to v1: https://lore.kernel.org/bpf/42939687-20f9-4a45-b7c2-342a0e11a014@rbox.co/ ==================== Link: https://patch.msgid.link/20250515-selftests-sockmap-redir-v3-0-a1ea723f7e7e@rbox.co Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2025-05-22 15:05:23 -07:00
Michal Luczaj	c04eeeb2af	selftests/bpf: sockmap_listen cleanup: Drop af_inet SOCK_DGRAM redir tests Remove tests covered by sockmap_redir. Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-8-a1ea723f7e7e@rbox.co	2025-05-22 14:26:59 -07:00
Michal Luczaj	f3de1cf621	selftests/bpf: sockmap_listen cleanup: Drop af_unix redir tests Remove tests covered by sockmap_redir. Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-7-a1ea723f7e7e@rbox.co	2025-05-22 14:26:58 -07:00
Michal Luczaj	9266e49d60	selftests/bpf: sockmap_listen cleanup: Drop af_vsock redir tests Remove tests covered by sockmap_redir. Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-6-a1ea723f7e7e@rbox.co	2025-05-22 14:26:58 -07:00
Michal Luczaj	f0709263a0	selftests/bpf: Add selftest for sockmap/hashmap redirection Test redirection logic. All supported and unsupported redirect combinations are tested for success and failure respectively. BPF_MAP_TYPE_SOCKMAP BPF_MAP_TYPE_SOCKHASH x sk_msg-to-egress sk_msg-to-ingress sk_skb-to-egress sk_skb-to-ingress x AF_INET, SOCK_STREAM AF_INET6, SOCK_STREAM AF_INET, SOCK_DGRAM AF_INET6, SOCK_DGRAM AF_UNIX, SOCK_STREAM AF_UNIX, SOCK_DGRAM AF_VSOCK, SOCK_STREAM AF_VSOCK, SOCK_SEQPACKET Suggested-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-5-a1ea723f7e7e@rbox.co	2025-05-22 14:26:58 -07:00
Michal Luczaj	f266905bb3	selftests/bpf: Introduce verdict programs for sockmap_redir Instead of piggybacking on test_sockmap_listen, introduce test_sockmap_redir especially for sockmap redirection tests. Suggested-by: Jiayuan Chen <mrpre@163.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-4-a1ea723f7e7e@rbox.co	2025-05-22 14:26:58 -07:00
Michal Luczaj	b57482b0fe	selftests/bpf: Add u32()/u64() to sockmap_helpers Add integer wrappers for convenient sockmap usage. While there, fix misaligned trailing slashes. Suggested-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-3-a1ea723f7e7e@rbox.co	2025-05-22 14:26:58 -07:00
Michal Luczaj	d87857946d	selftests/bpf: Add socket_kind_to_str() to socket_helpers Add function that returns string representation of socket's domain/type. Suggested-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-2-a1ea723f7e7e@rbox.co	2025-05-22 14:26:58 -07:00
Michal Luczaj	fb1131d5e1	selftests/bpf: Support af_unix SOCK_DGRAM socket pair creation Handle af_unix in init_addr_loopback(). For pair creation, bind() the peer socket to make SOCK_DGRAM connect() happy. Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-1-a1ea723f7e7e@rbox.co	2025-05-22 14:26:58 -07:00
Mykyta Yatsenko	5ead949920	selftests/bpf: Add SKIP_LLVM makefile variable Introduce a SKIP_LLVM makefile variable that makes it possible to build BPF selftests without llvm dependencies. This differs from the existing feature-llvm, which is the result of automatic detection and should not be set explicitly by the user. Avoiding llvm dependencies is useful in environments that do not have them, given that as of now llvm dependencies are required only by jit_disasm_helpers.c. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250522013813.125428-1-mykyta.yatsenko5@gmail.com	2025-05-22 09:31:03 -07:00
Alexei Starovoitov	d90f0bce57	Merge branch 's390-bpf-use-kernel-s-expoline-thunks' Ilya Leoshkevich says: ==================== This series simplifies the s390 JIT by replacing the generation of expolines (Spectre mitigation) with using the ones from the kernel text. This is possible thanks to the V!=R s390 kernel rework. Patch 1 is a small prerequisite for arch/s390 that I would like to get in via the BPF tree. It has Heiko's Acked-by. Patches 2 and 3 are the implementation. ==================== Link: https://patch.msgid.link/20250519223646.66382-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-22 08:41:06 -07:00
Ilya Leoshkevich	7f332f9fe9	s390/bpf: Use kernel's expoline thunks Simplify the JIT code by replacing the custom expolines with the ones defined in the kernel text. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250519223646.66382-4-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-22 08:40:56 -07:00
Ilya Leoshkevich	9053ba042f	s390/bpf: Add macros for calling external functions After the V!=R rework (commit `c98d2ecae0` ("s390/mm: Uncouple physical vs virtual address spaces")), kernel and BPF programs are allocated within a 4G region, making it possible to use relative addressing to directly use kernel functions from BPF code. Add two new macros for calling kernel functions from BPF code: EMIT6_PCREL_RILB_PTR() and EMIT6_PCREL_RILC_PTR(). Factor out parts of the existing macros that are helpful for implementing the new ones. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250519223646.66382-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-22 08:40:56 -07:00
Ilya Leoshkevich	f7562001c8	s390: always declare expoline thunks It would be convenient to use the following pattern in the BPF JIT: if (nospec_uses_trampoline()) emit_call(__s390_indirect_jump_r1); Unfortunately with CONFIG_EXPOLINE=n the compiler complains about the missing prototype of __s390_indirect_jump_r1(). One could wrap the whole "if" statement in an #ifdef, but this clutters the code. Instead, declare expoline thunk prototypes even when compiling without expolines. When using the above code structure and compiling without expolines, references to them are optimized away, and there are no linker errors. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250519223646.66382-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-22 08:40:56 -07:00
Di Shen	4e2e6841ff	bpf: Revert "bpf: remove unnecessary rcu_read_{lock,unlock}() in multi-uprobe attach logic" This reverts commit `4a8f635a60`. Although get_pid_task() internally already calls rcu_read_lock() and rcu_read_unlock(), the find_vpid() call was left unprotected. The documentation for find_vpid() clearly states: "Must be called with the tasklist_lock or rcu_read_lock() held." Add proper rcu_read_lock/unlock() to protect the find_vpid(). Fixes: `4a8f635a60` ("bpf: remove unnecessary rcu_read_{lock,unlock}() in multi-uprobe attach logic") Reported-by: Xuewen Yan <xuewen.yan@unisoc.com> Signed-off-by: Di Shen <di.shen@unisoc.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250520054943.5002-1-xuewen.yan@unisoc.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-22 07:49:32 -07:00
Andrii Nakryiko	25b6d5def6	Merge branch 'libbpf-support-multi-split-btf' Alan Maguire says: ==================== libbpf: support multi-split BTF In discussing handling of inlines in BTF [1], one area which we may need support for in the future is multiple split BTF, where split BTF sits atop another split BTF which sits atop base BTF. This two-patch series fixes one issue discovered when testing multi-split BTF and extends the split BTF test to cover multi-split BTF also. [1] https://lore.kernel.org/dwarves/20250416-btf_inline-v1-0-e4bd2f8adae5@meta.com/ ==================== Link: https://patch.msgid.link/20250519165935.261614-1-alan.maguire@oracle.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2025-05-20 16:22:31 -07:00
Alan Maguire	02f5e7c1f3	selftests/bpf: Test multi-split BTF Extend split BTF test to cover case where we create split BTF on top of existing split BTF and add info to it; ensure that such BTF can be created and handled by searching within it, dumping/comparing to expected. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250519165935.261614-3-alan.maguire@oracle.com	2025-05-20 16:22:30 -07:00
Alan Maguire	4e29128a9a	libbpf/btf: Fix string handling to support multi-split BTF libbpf handling of split BTF has been written largely with the assumption that multiple splits are possible, i.e. split BTF on top of split BTF on top of base BTF. One area where this does not quite work is string handling in split BTF; the start string offset should be the base BTF string section length + the base BTF string offset. This worked in the past because for a single split BTF with base the start string offset was always 0. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250519165935.261614-2-alan.maguire@oracle.com	2025-05-20 16:22:30 -07:00
Mykyta Yatsenko	b615ce5fbe	selftests/bpf: Remove unnecessary link dependencies Remove llvm dependencies from binaries that do not use llvm libraries. Filter out libxml2 from llvm dependencies, as it seems that it is not actually used. This patch reduces link dependencies for BPF selftests. The following line was adding llvm dependencies to every target in the makefile, while the only targets that require them are the test runners (test_progs, test_progs-no_alu32, ...): ``` $(OUTPUT)/$(TRUNNER_BINARY): LDLIBS += $$(LLVM_LDLIBS) ``` Before this change: ldd linux/tools/testing/selftests/bpf/veristat linux-vdso.so.1 (0x00007ffd2c3fd000) libelf.so.1 => /lib64/libelf.so.1 (0x00007fe1dcf89000) libz.so.1 => /lib64/libz.so.1 (0x00007fe1dcf6f000) libm.so.6 => /lib64/libm.so.6 (0x00007fe1dce94000) libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fe1dcddd000) libxml2.so.2 => /lib64/libxml2.so.2 (0x00007fe1dcc54000) libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fe1dca00000) libc.so.6 => /lib64/libc.so.6 (0x00007fe1dc600000) /lib64/ld-linux-x86-64.so.2 (0x00007fe1dcfb1000) liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fe1dc9d4000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe1dcc38000) After: ldd linux/tools/testing/selftests/bpf/veristat linux-vdso.so.1 (0x00007ffc83370000) libelf.so.1 => /lib64/libelf.so.1 (0x00007f4b87515000) libz.so.1 => /lib64/libz.so.1 (0x00007f4b874fb000) libc.so.6 => /lib64/libc.so.6 (0x00007f4b87200000) libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f4b87444000) /lib64/ld-linux-x86-64.so.2 (0x00007f4b8753d000) Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250516195522.311769-1-mykyta.yatsenko5@gmail.com	2025-05-19 09:30:38 -07:00
Paul Chaignon	1cb0f56d96	bpf: WARN_ONCE on verifier bugs Throughout the verifier's logic, there are multiple checks for inconsistent states that should never happen and would indicate a verifier bug. These bugs are typically logged in the verifier logs and sometimes preceded by a WARN_ONCE. This patch reworks these checks to consistently emit a verifier log AND a warning when CONFIG_DEBUG_KERNEL is enabled. The consistent use of WARN_ONCE should help fuzzers (ex. syzkaller) expose any situation where they are actually able to reach one of those buggy verifier states. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/aCs1nYvNNMq8dAWP@mail.gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-19 08:17:08 -07:00
Alexei Starovoitov	9325d53fe9	Merge branch 's390-bpf-remove-the-orig_call-null-check' Ilya Leoshkevich says: ==================== I've been looking at fixing the tailcall_bpf2bpf_hierarchy failures on s390. One of the challenges is that when a BPF trampoline calls a BPF prog A, the prologue of A sets the tail call count to 0. Therefore it would be useful to know whether the trampoline is attached to some other BPF prog B, in which case A should be called using an offset equal to tail_call_start, bypassing the tail call count initialization. The trampoline attachment point is passed to trampoline functions via the orig_call variable. Unfortunately in the case of calculating the size of a struct_ops trampoline it's NULL, and I could not think of a good reason to have it this way. This series makes it always non-NULL. ==================== Link: https://patch.msgid.link/20250512221911.61314-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-14 17:50:37 -07:00
Ilya Leoshkevich	8e57cf09c8	s390/bpf: Remove the orig_call NULL check Now that orig_call can never be NULL, remove the respective check. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250512221911.61314-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-14 17:48:57 -07:00
Ilya Leoshkevich	94bde253d3	bpf: Pass the same orig_call value to trampoline functions There is currently some confusion in the s390x JIT regarding whether orig_call can be NULL and what that means. Originally the NULL value was used to distinguish the struct_ops case, but this was superseded by BPF_TRAMP_F_INDIRECT (see commit `0c970ed2f8` ("s390/bpf: Fix indirect trampoline generation")). The remaining reason to have this check is that NULL can actually be passed to the arch_bpf_trampoline_size() call - but not to the respective arch_prepare_bpf_trampoline() call - by bpf_struct_ops_prepare_trampoline(). Remove this asymmetry by passing stub_func to both functions, so that JITs may rely on orig_call never being NULL. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250512221911.61314-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-14 17:48:57 -07:00
Ilya Leoshkevich	5f55f21684	s390/bpf: Store backchain even for leaf progs Currently a crash in a leaf prog (caused by a bug) produces the following call trace: [<000003ff600ebf00>] bpf_prog_6df0139e1fbf2789_fentry+0x20/0x78 [<0000000000000000>] 0x0 This is because leaf progs do not store backchain. Fix by making all progs do it. This is what GCC and Clang-generated code does as well. Now the call trace looks like this: [<000003ff600eb0f2>] bpf_prog_6df0139e1fbf2789_fentry+0x2a/0x80 [<000003ff600ed096>] bpf_trampoline_201863462940+0x96/0xf4 [<000003ff600e3a40>] bpf_prog_05f379658fdd72f2_classifier_0+0x58/0xc0 [<000003ffe0aef070>] bpf_test_run+0x210/0x390 [<000003ffe0af0dc2>] bpf_prog_test_run_skb+0x25a/0x668 [<000003ffe038a90e>] __sys_bpf+0xa46/0xdb0 [<000003ffe038ad0c>] __s390x_sys_bpf+0x44/0x50 [<000003ffe0defea8>] __do_syscall+0x150/0x280 [<000003ffe0e01d5c>] system_call+0x74/0x98 Fixes: `0546231057` ("s390/bpf: Add s390x eBPF JIT compiler backend") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250512122717.54878-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-14 17:47:41 -07:00
Kuniyuki Iwashima	4dd372de3f	selftests/bpf: Relax TCPOPT_WINDOW validation in test_tcp_custom_syncookie.c. The custom syncookie test expects TCPOPT_WINDOW to be 7 based on the kernel’s behaviour at the time, but the upcoming series [0] will bump it to 10. Let's relax the test to allow any valid TCPOPT_WINDOW value in the range 1–14. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/netdev/20250513193919.1089692-1-edumazet@google.com/ #[0] Link: https://patch.msgid.link/20250514214021.85187-1-kuniyu@amazon.com	2025-05-14 15:13:24 -07:00
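The relaxed validation described in the commit above can be pictured with the following sketch. It is not the code from test_tcp_custom_syncookie.c: the function names `wscale_strict`, `wscale_relaxed` and `wscale_example_usage` are hypothetical, and only the old expected value (7) and the accepted range (1 to 14) are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old-style check (hypothetical name): accept only the value the kernel used at the time. */
static bool wscale_strict(uint8_t wscale)
{
	return wscale == 7;
}

/* Relaxed check (hypothetical name): accept any value in the range 1..14. */
static bool wscale_relaxed(uint8_t wscale)
{
	return wscale >= 1 && wscale <= 14;
}

/* Small usage example comparing both checks. */
void wscale_example_usage(void)
{
	assert(wscale_strict(7) && wscale_relaxed(7));
	/* A bump to 10 fails the strict check but passes the relaxed one. */
	assert(!wscale_strict(10) && wscale_relaxed(10));
	/* Values outside 1..14 are still rejected. */
	assert(!wscale_relaxed(0) && !wscale_relaxed(15));
}
```

With a range check like this, the test no longer depends on the exact window scale the kernel happens to pick.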
Mykyta Yatsenko	d0445d7dd3	libbpf: Check bpf_map_skeleton link for NULL Avoid dereferencing bpf_map_skeleton's link field if it's NULL. If a BPF map skeleton is created with a size that indicates it contains the link field, but the field was not actually initialized with a valid bpf_link pointer, libbpf crashes. This may happen when using a libbpf-rs skeleton. Skeleton loading may still progress, but the user needs to attach the struct_ops map separately. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250514113220.219095-1-mykyta.yatsenko5@gmail.com	2025-05-14 09:30:06 -07:00
Kumar Kartikeya Dwivedi	bc049387b4	bpf: Add support for __prog argument suffix to pass in prog->aux Instead of hardcoding the list of kfuncs that need prog->aux passed to them with a combination of fixup_kfunc_call adjustment + __ign suffix, combine both in __prog suffix, which ignores the argument passed in, and fixes it up to the prog->aux. This allows kfuncs to have the prog->aux passed into them without having to touch the verifier. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250513142812.1021591-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-13 18:47:54 -07:00
Tao Chen	3880cdbed1	bpf: Fix WARN() in get_bpf_raw_tp_regs syzkaller reported an issue: WARNING: CPU: 3 PID: 5971 at kernel/trace/bpf_trace.c:1861 get_bpf_raw_tp_regs+0xa4/0x100 kernel/trace/bpf_trace.c:1861 Modules linked in: CPU: 3 UID: 0 PID: 5971 Comm: syz-executor205 Not tainted 6.15.0-rc5-syzkaller-00038-g707df3375124 #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:get_bpf_raw_tp_regs+0xa4/0x100 kernel/trace/bpf_trace.c:1861 RSP: 0018:ffffc90003636fa8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffffffff81c6bc4c RDX: ffff888032efc880 RSI: ffffffff81c6bc83 RDI: 0000000000000005 RBP: ffff88806a730860 R08: 0000000000000005 R09: 0000000000000003 R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000004 R13: 0000000000000001 R14: ffffc90003637008 R15: 0000000000000900 FS: 0000000000000000(0000) GS:ffff8880d6cdf000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7baee09130 CR3: 0000000029f5a000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ____bpf_get_stack_raw_tp kernel/trace/bpf_trace.c:1934 [inline] bpf_get_stack_raw_tp+0x24/0x160 kernel/trace/bpf_trace.c:1931 bpf_prog_ec3b2eefa702d8d3+0x43/0x47 bpf_dispatcher_nop_func include/linux/bpf.h:1316 [inline] __bpf_prog_run include/linux/filter.h:718 [inline] bpf_prog_run include/linux/filter.h:725 [inline] __bpf_trace_run kernel/trace/bpf_trace.c:2363 [inline] bpf_trace_run3+0x23f/0x5a0 kernel/trace/bpf_trace.c:2405 __bpf_trace_mmap_lock_acquire_returned+0xfc/0x140 include/trace/events/mmap_lock.h:47 __traceiter_mmap_lock_acquire_returned+0x79/0xc0 include/trace/events/mmap_lock.h:47 __do_trace_mmap_lock_acquire_returned include/trace/events/mmap_lock.h:47 [inline] trace_mmap_lock_acquire_returned include/trace/events/mmap_lock.h:47 [inline] 
__mmap_lock_do_trace_acquire_returned+0x138/0x1f0 mm/mmap_lock.c:35 __mmap_lock_trace_acquire_returned include/linux/mmap_lock.h:36 [inline] mmap_read_trylock include/linux/mmap_lock.h:204 [inline] stack_map_get_build_id_offset+0x535/0x6f0 kernel/bpf/stackmap.c:157 __bpf_get_stack+0x307/0xa10 kernel/bpf/stackmap.c:483 ____bpf_get_stack kernel/bpf/stackmap.c:499 [inline] bpf_get_stack+0x32/0x40 kernel/bpf/stackmap.c:496 ____bpf_get_stack_raw_tp kernel/trace/bpf_trace.c:1941 [inline] bpf_get_stack_raw_tp+0x124/0x160 kernel/trace/bpf_trace.c:1931 bpf_prog_ec3b2eefa702d8d3+0x43/0x47 A tracepoint such as trace_mmap_lock_acquire_returned may cause a nested call, as in the corner case shown above, and this triggers WARN_ON_ONCE. The nesting will be resolved with a more general method in the future. As Alexei suggested, remove the WARN_ON_ONCE first. Fixes: `9594dc3c7e` ("bpf: fix nested bpf tracepoints with per-cpu data") Reported-by: syzbot+45b0c89a0fc7ae8dbadc@syzkaller.appspotmail.com Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250513042747.757042-1-chen.dylane@linux.dev Closes: https://lore.kernel.org/bpf/8bc2554d-1052-4922-8832-e0078a033e1d@gmail.com	2025-05-13 09:32:56 -07:00
Khaled Elnaggar	79af71c5fe	docs: bpf: Fix bullet point formatting warning Fix indentation for a bullet list item in bpf_iterators.rst. According to reStructuredText rules, bullet list item bodies must be consistently indented relative to the bullet. The indentation of the first line after the bullet determines the alignment for the rest of the item body. Reported by smatch: /linux/Documentation/bpf/bpf_iterators.rst:55: WARNING: Bullet list ends without a blank line; unexpected unindent. [docutils] Fixes: `7220eabff8` ("bpf, docs: document open-coded BPF iterators") Signed-off-by: Khaled Elnaggar <khaledelnaggarlinux@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250513015901.475207-1-khaledelnaggarlinux@gmail.com	2025-05-13 08:53:02 -07:00
Alexei Starovoitov	f4efc73b1e	Merge branch 'introduce-kfuncs-for-memory-reads-into-dynptrs' Mykyta Yatsenko says: ==================== Introduce kfuncs for memory reads into dynptrs From: Mykyta Yatsenko <yatsenko@meta.com> This patch adds new kfuncs that enable reading variable-length user or kernel data directly into dynptrs. These kfuncs provide a way to perform dynamically-sized reads while maintaining memory safety. Unlike existing `bpf_probe_read_{user\|kernel}` APIs, which are limited to constant-sized reads, these new kfuncs allow for more flexible data access. v4 -> v5 * Fix pointers annotations, use __user where necessary, cast where needed v3 -> v4 * Added pid filtering in selftests v2 -> v3 * Add KF_TRUSTED_ARGS for kfuncs that take pointer to task_struct as an argument * Remove checks for non-NULL task, where it was not necessary * Added comments on constants used in selftests, etc. v1 -> v2 * Renaming helper functions to use "user_str" instead of "user_data_str" suffix ==================== Link: https://patch.msgid.link/20250512205348.191079-1-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-12 18:32:48 -07:00
Mykyta Yatsenko	c61bcd29ed	selftests/bpf: introduce tests for dynptr copy kfuncs Introduce selftests verifying the newly added dynptr copy kfuncs, covering dynptrs backed by contiguous and non-contiguous memory. Disable test_probe_read_user_str_dynptr, which triggers a bug in strncpy_from_user_nofault. A patch to fix the issue is at [1]. [1] https://patchwork.kernel.org/project/linux-mm/patch/20250422131449.57177-1-mykyta.yatsenko5@gmail.com/ Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20250512205348.191079-4-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-12 18:32:47 -07:00
Mykyta Yatsenko	a498ee7576	bpf: Implement dynptr copy kfuncs This patch introduces a new set of kfuncs for working with dynptrs in BPF programs, enabling reading variable-length user or kernel data into a dynptr directly. To ensure memory safety, the verifier allows only constant-sized reads via the existing bpf_probe_read_{user\|kernel} etc. APIs; the dynptr-based kfuncs allow dynamically-sized reads without memory safety shortcomings. The following kfuncs are introduced: * `bpf_probe_read_kernel_dynptr()`: probes kernel-space data into a dynptr * `bpf_probe_read_user_dynptr()`: probes user-space data into a dynptr * `bpf_probe_read_kernel_str_dynptr()`: probes kernel-space string into a dynptr * `bpf_probe_read_user_str_dynptr()`: probes user-space string into a dynptr * `bpf_copy_from_user_dynptr()`: sleepable, copies user-space data into a dynptr for the current task * `bpf_copy_from_user_str_dynptr()`: sleepable, copies user-space string into a dynptr for the current task * `bpf_copy_from_user_task_dynptr()`: sleepable, copies user-space data of the task into a dynptr * `bpf_copy_from_user_task_str_dynptr()`: sleepable, copies user-space string of the task into a dynptr The implementation is built on two generic functions: * __bpf_dynptr_copy * __bpf_dynptr_copy_str These functions take function pointers as arguments, enabling the copying of data from various sources, including both kernel and user space. Use __always_inline for generic functions and callbacks to make sure the compiler doesn't generate indirect calls into callbacks, which are more expensive, especially on some kernel configurations. Inlining allows the compiler to put direct calls into all the specific callback implementations (copy_user_data_sleepable, copy_user_data_nofault, and so on). 
Reviewed-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20250512205348.191079-3-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-12 18:31:51 -07:00
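The callback-based design described in the commit above can be illustrated outside the kernel with a small userspace sketch. It is not the kernel implementation: `generic_copy`, `copy_plain`, `copy_faulting`, the two wrappers and `copy_example_usage` are hypothetical names, and the GCC/Clang `always_inline` attribute stands in for the kernel's `__always_inline`. The commit also marks the callbacks themselves `__always_inline`; here they are plain `static` functions to keep the sketch simple.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Callback type: copy len bytes from src to dst, return 0 or a negative errno. */
typedef int (*copy_fn)(void *dst, const void *src, size_t len);

/* Hypothetical callback for a source that is always readable. */
static int copy_plain(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Hypothetical callback simulating a source that cannot be read. */
static int copy_faulting(void *dst, const void *src, size_t len)
{
	(void)src;
	memset(dst, 0, len);	/* zero the destination on failure */
	return -EFAULT;
}

/*
 * Generic routine shared by all variants. It is force-inlined into each
 * wrapper below, where fn is a compile-time constant, so an optimizing
 * compiler can turn the call through fn into a direct call.
 */
static inline __attribute__((always_inline))
int generic_copy(void *dst, size_t dst_len, const void *src, size_t len,
		 copy_fn fn)
{
	if (len > dst_len)
		return -E2BIG;
	return fn(dst, src, len);
}

/* Thin wrappers: each one binds the generic routine to one callback. */
int copy_from_plain(void *dst, size_t dst_len, const void *src, size_t len)
{
	return generic_copy(dst, dst_len, src, len, copy_plain);
}

int copy_from_faulting(void *dst, size_t dst_len, const void *src, size_t len)
{
	return generic_copy(dst, dst_len, src, len, copy_faulting);
}

/* Small usage example exercising both wrappers. */
void copy_example_usage(void)
{
	char buf[8] = "";

	assert(copy_from_plain(buf, sizeof(buf), "abc", 4) == 0);
	assert(strcmp(buf, "abc") == 0);
	assert(copy_from_faulting(buf, sizeof(buf), "abc", 4) == -EFAULT);
	assert(buf[0] == '\0');
	assert(copy_from_plain(buf, sizeof(buf), "0123456789", 11) == -E2BIG);
}
```

The size check lives in one place while each wrapper stays a one-liner, which is the same trade-off the commit describes for its two generic functions.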
Mykyta Yatsenko	d060b6aab0	helpers: make few bpf helpers public Make bpf_dynptr_slice_rdwr, bpf_dynptr_check_off_len and __bpf_dynptr_write available outside of helpers.c by adding their prototypes to include/linux/bpf.h. The bpf_dynptr_check_off_len() implementation is moved to the header and made inline explicitly, as small functions should typically be inlined. These functions are going to be used from bpf_trace.c in the next patch of this series. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20250512205348.191079-2-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-12 18:29:03 -07:00
Anton Protopopov	fd5fd538a1	libbpf: Use proper errno value in nlattr Return value of the validate_nla() function can be propagated all the way up to users of libbpf API. In case of error this libbpf version of validate_nla returns -1 which will be seen as -EPERM from user's point of view. Instead, return a more reasonable -EINVAL. Fixes: `bbf48c18ee` ("libbpf: add error reporting in XDP") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250510182011.2246631-1-a.s.protopopov@gmail.com	2025-05-12 15:22:54 -07:00
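The reason a bare -1 reads as -EPERM is simply that EPERM has the numeric value 1 on Linux. The sketch below shows the difference; `validate_bare`, `validate_errno`, `looks_like_eperm` and `errno_example_usage` are hypothetical stand-ins, not libbpf functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical validator that signals failure with a bare -1. */
static int validate_bare(int ok)
{
	return ok ? 0 : -1;
}

/* Same validator returning an explicit negative errno instead. */
static int validate_errno(int ok)
{
	return ok ? 0 : -EINVAL;
}

/* True when a caller would interpret ret as "operation not permitted". */
static int looks_like_eperm(int ret)
{
	return ret == -EPERM;
}

/* Small usage example contrasting the two return conventions. */
void errno_example_usage(void)
{
	/* On Linux EPERM == 1, so -1 is indistinguishable from -EPERM. */
	assert(looks_like_eperm(validate_bare(0)));
	/* An explicit -EINVAL tells the caller what actually went wrong. */
	assert(!looks_like_eperm(validate_errno(0)));
	assert(validate_errno(0) == -EINVAL);
	/* Success is 0 in both conventions. */
	assert(validate_bare(1) == 0 && validate_errno(1) == 0);
}
```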
Mykyta Yatsenko	3a320ed325	selftests/bpf: Allow skipping docs compilation Currently rst2man is required to build bpf selftests, as the tool is used by Makefile.docs. rst2man may be missing in some build environments and is not essential for selftests. It makes sense to allow user to skip building docs. This patch adds SKIP_DOCS variable into bpf selftests Makefile that when set to 1 allows skipping building docs, for example: make -C tools/testing/selftests TARGETS=bpf SKIP_DOCS=1 Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250510002450.365613-1-mykyta.yatsenko5@gmail.com	2025-05-12 15:18:46 -07:00
Alexei Starovoitov	149e0cf4c9	Merge branch 'fix-verifier-test-failures-in-verbose-mode' Gregory Bell says: ==================== Fix verifier test failures in verbose mode This patch series fixes two issues that cause false failures in the BPF verifier test suite when run with verbose output (`-v`). The following tests fail only when running test_verifier in verbose mode. This leads to inconsistent results across verbose and non-verbose runs. Patch 1 addresses an issue where the verbose flag (`-v`) unintentionally overrides the `opts.log_level`, leading to incorrect contents when checking bpf_vlog in tests with `expected_ret == VERBOSE_ACCEPT`. This occurs when running with `-v` but not `-vv`. Patch 2 increases the size of the `bpf_vlog[]` buffer to prevent truncation of large verifier logs, which was causing failures in several scale and 64-bit immediate tests. Before patches: ./test_verifier \| grep FAIL Summary: 790 PASSED, 0 SKIPPED, 0 FAILED ./test_verifier -v \| grep FAIL Summary: 782 PASSED, 0 SKIPPED, 8 FAILED ./test_verifier -vv \| grep FAIL Summary: 787 PASSED, 0 SKIPPED, 3 FAILED After patches: ./test_verifier -v \| grep FAIL Summary: 790 PASSED, 0 SKIPPED, 0 FAILED ./test_verifier -vv \| grep FAIL Summary: 790 PASSED, 0 SKIPPED, 0 FAILED These fixes improve test reliability and ensure consistent behavior across verbose and non-verbose runs. ==================== Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/cover.1747058195.git.grbell@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-12 10:44:01 -07:00
Gregory Bell	af8a5125a0	selftests/bpf: test_verifier verbose log overflows Tests: - 458/p ld_dw: xor semi-random 64-bit imms, test 5 - 501/p scale: scale test 1 - 502/p scale: scale test 2 fail in verbose mode due to bpf_vlog[] overflowing. These tests generate large verifier logs that exceed the current buffer size, causing them to fail to load. Increase the size of the bpf_vlog[] buffer to accommodate larger logs and prevent false failures during test runs with verbose output. Signed-off-by: Gregory Bell <grbell@redhat.com> Link: https://lore.kernel.org/r/e49267100f07f099a5877a3a5fc797b702bbaf0c.1747058195.git.grbell@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-12 10:43:43 -07:00
Gregory Bell	c5bcc8c781	selftests/bpf: test_verifier verbose causes erroneous failures When running test_verifier with the -v flag and a test with `expected_ret==VERBOSE_ACCEPT`, the opts.log_level is unintentionally overwritten because the verbose flag takes precedence. This leads to a mismatch in the expected and actual contents of bpf_vlog, causing tests to fail incorrectly. Reorder the conditional logic that sets opts.log_level to preserve the expected log level and prevent it from being overridden by -v. Signed-off-by: Gregory Bell <grbell@redhat.com> Link: https://lore.kernel.org/r/182bf00474f817c99f968a9edb119882f62be0f8.1747058195.git.grbell@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-12 10:43:43 -07:00


1352430 Commits