Commit Graph

1352357 Commits

Author SHA1 Message Date
YiFei Zhu
43745d11bf bpftool: Fix regression of "bpftool cgroup tree" EINVAL on older kernels
If cgroup_has_attached_progs queries an attach type not supported
by the running kernel, due to the kernel being older than the bpftool
build, it would encounter an -EINVAL from BPF_PROG_QUERY syscall.

Prior to commit 98b303c9bf ("bpftool: Query only cgroup-related
attach types"), this EINVAL would be ignored by the function, allowing
the function to only consider supported attach types. The commit
changed so that, instead of querying all attach types, only attach
types from the array `cgroup_attach_types` is queried. The assumption
is that because these are only cgroup attach types, they should all
be supported. Unfortunately this assumption may be false when the
kernel is older than the bpftool build, where the attach types queried
by bpftool is not yet implemented in the kernel. This would result in
errors such as:

  $ bpftool cgroup tree
  CgroupPath
  ID       AttachType      AttachFlags     Name
  Error: can't query bpf programs attached to /sys/fs/cgroup: Invalid argument

This patch restores the logic of ignoring EINVAL from prior to that patch.

Fixes: 98b303c9bf ("bpftool: Query only cgroup-related attach types")
Reported-by: Sagarika Sharma <sharmasagarika@google.com>
Reported-by: Minh-Anh Nguyen <minhanhdn@google.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20250428211536.1651456-1-zhuyifei@google.com
2025-05-06 13:53:55 -07:00
Alexei Starovoitov
9fd060622c Merge branch 'bpf-support-bpf-rbtree-traversal-and-list-peeking'
Martin KaFai Lau says:

====================
bpf: Support bpf rbtree traversal and list peeking

From: Martin KaFai Lau <martin.lau@kernel.org>

The RFC v1 [1] showed a fq qdisc implementation in bpf
that is much closer to the kernel sch_fq.c.

The fq example and bpf qdisc changes are separated out from this set.
This set is to focus on the kfunc and verifier changes that
enable the bpf rbtree traversal and list peeking.

v2:
- Added tests to check that the return value of
  the bpf_rbtree_{root,left,right} and bpf_list_{front,back} is
  marked as a non_own_ref node pointer. (Kumar)
- Added tests to ensure that the bpf_rbtree_{root,left,right} and
  bpf_list_{front,back} must be called after holding the spinlock.
- Squashed the selftests adjustment to the corresponding verifier
  changes to avoid bisect failure. (Kumar)
- Separated the bpf qdisc specific changes and fq selftest example
  from this set.

[1]: https://lore.kernel.org/bpf/20250418224652.105998-1-martin.lau@linux.dev/
====================

Link: https://patch.msgid.link/20250506015857.817950-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:14 -07:00
Martin KaFai Lau
29318b4d5d selftests/bpf: Add test for bpf_list_{front,back}
This patch adds the "list_peek" test to use the new
bpf_list_{front,back} kfunc.

The test_{front,back}* tests ensure that the return value
is a non_own_ref node pointer and requires the spinlock to be held.

Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> # check non_own_ref marking
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-9-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:06 -07:00
Martin KaFai Lau
fb5b480205 bpf: Add bpf_list_{front,back} kfunc
In the kernel fq qdisc implementation, it only needs to look at
the fields of the first node in a list but does not always
need to remove it from the list. It is more convenient to have
a peek kfunc for the list. It works similar to the bpf_rbtree_first().

This patch adds bpf_list_{front,back} kfunc. The verifier is changed
such that the kfunc returning "struct bpf_list_node *" will be
marked as non-owning. The exception is the KF_ACQUIRE kfunc. The
net effect is only the new bpf_list_{front,back} kfuncs will
have its return pointer marked as non-owning.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-8-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:05 -07:00
Martin KaFai Lau
3fab84f00d bpf: Simplify reg0 marking for the list kfuncs that return a bpf_list_node pointer
The next patch will add bpf_list_{front,back} kfuncs to peek the head
and tail of a list. Both of them will return a 'struct bpf_list_node *'.

Follow the earlier change for rbtree, this patch checks the
return btf type is a 'struct bpf_list_node' pointer instead
of checking each kfuncs individually to decide if
mark_reg_graph_node should be called. This will make
the bpf_list_{front,back} kfunc addition easier in
the later patch.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-7-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:05 -07:00
Martin KaFai Lau
47ada65c5c selftests/bpf: Add tests for bpf_rbtree_{root,left,right}
This patch has a much simplified rbtree usage from the
kernel sch_fq qdisc. It has a "struct node_data" which can be
added to two different rbtrees which are ordered by different keys.

The test first populates both rbtrees. Then search for a lookup_key
from the "groot0" rbtree. Once the lookup_key is found, that node
refcount is taken. The node is then removed from another "groot1"
rbtree.

While searching the lookup_key, the test will also try to remove
all rbnodes in the path leading to the lookup_key.

The test_{root,left,right}_spinlock_true tests ensure that the
return value of the bpf_rbtree functions is a non_own_ref node pointer.
This is done by forcing an verifier error by calling a helper
bpf_jiffies64() while holding the spinlock. The tests then
check for the verifier message
"call bpf_rbtree...R0=rcu_ptr_or_null_node..."

The other test_{root,left,right}_spinlock_false tests ensure that
they must be called with spinlock held.

Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> # Check non_own_ref marking
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-6-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:05 -07:00
Martin KaFai Lau
2ddef1783c bpf: Allow refcounted bpf_rb_node used in bpf_rbtree_{remove,left,right}
The bpf_rbtree_{remove,left,right} requires the root's lock to be held.
They also check the node_internal->owner is still owned by that root
before proceeding, so it is safe to allow refcounted bpf_rb_node
pointer to be used in these kfuncs.

In a bpf fq implementation which is much closer to the kernel fq,
https://lore.kernel.org/bpf/20250418224652.105998-13-martin.lau@linux.dev/,
a networking flow (allocated by bpf_obj_new) can be added to two different
rbtrees. There are cases that the flow is searched from one rbtree,
held the refcount of the flow, and then removed from another rbtree:

struct fq_flow {
	struct bpf_rb_node	fq_node;
	struct bpf_rb_node	rate_node;
	struct bpf_refcount	refcount;
	unsigned long		sk_long;
};

int bpf_fq_enqueue(...)
{
	/* ... */

	bpf_spin_lock(&root->lock);
	while (can_loop) {
		/* ... */
		if (!p)
			break;
		gc_f = bpf_rb_entry(p, struct fq_flow, fq_node);
		if (gc_f->sk_long == sk_long) {
			f = bpf_refcount_acquire(gc_f);
			break;
		}
		/* ... */
	}
	bpf_spin_unlock(&root->lock);

	if (f) {
		bpf_spin_lock(&q->lock);
		bpf_rbtree_remove(&q->delayed, &f->rate_node);
		bpf_spin_unlock(&q->lock);
	}
}

bpf_rbtree_{left,right} do not need this change but are relaxed together
with bpf_rbtree_remove instead of adding extra verifier logic
to exclude these kfuncs.

To avoid bi-sect failure, this patch also changes the selftests together.

The "rbtree_api_remove_unadded_node" is not expecting verifier's error.
The test now expects bpf_rbtree_remove(&groot, &m->node) to return NULL.
The test uses __retval(0) to ensure this NULL return value.

Some of the "only take non-owning..." failure messages are changed also.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-5-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:05 -07:00
Martin KaFai Lau
9e3e66c553 bpf: Add bpf_rbtree_{root,left,right} kfunc
In a bpf fq implementation that is much closer to the kernel fq,
it will need to traverse the rbtree:
https://lore.kernel.org/bpf/20250418224652.105998-13-martin.lau@linux.dev/

The much simplified logic that uses the bpf_rbtree_{root,left,right}
to traverse the rbtree is like:

struct fq_flow {
	struct bpf_rb_node	fq_node;
	struct bpf_rb_node	rate_node;
	struct bpf_refcount	refcount;
	unsigned long		sk_long;
};

struct fq_flow_root {
	struct bpf_spin_lock lock;
	struct bpf_rb_root root __contains(fq_flow, fq_node);
};

struct fq_flow *fq_classify(...)
{
	struct bpf_rb_node *tofree[FQ_GC_MAX];
	struct fq_flow_root *root;
	struct fq_flow *gc_f, *f;
	struct bpf_rb_node *p;
	int i, fcnt = 0;

	/* ... */

	f = NULL;
	bpf_spin_lock(&root->lock);
	p = bpf_rbtree_root(&root->root);
	while (can_loop) {
		if (!p)
			break;

		gc_f = bpf_rb_entry(p, struct fq_flow, fq_node);
		if (gc_f->sk_long == sk_long) {
			f = bpf_refcount_acquire(gc_f);
			break;
		}

		/* To be removed from the rbtree */
		if (fcnt < FQ_GC_MAX && fq_gc_candidate(gc_f, jiffies_now))
			tofree[fcnt++] = p;

		if (gc_f->sk_long > sk_long)
			p = bpf_rbtree_left(&root->root, p);
		else
			p = bpf_rbtree_right(&root->root, p);
	}

	/* remove from the rbtree */
	for (i = 0; i < fcnt; i++) {
		p = tofree[i];
		tofree[i] = bpf_rbtree_remove(&root->root, p);
	}

	bpf_spin_unlock(&root->lock);

	/* bpf_obj_drop the fq_flow(s) that have just been removed
	 * from the rbtree.
	 */
	for (i = 0; i < fcnt; i++) {
		p = tofree[i];
		if (p) {
			gc_f = bpf_rb_entry(p, struct fq_flow, fq_node);
			bpf_obj_drop(gc_f);
		}
	}

	return f;

}

The above simplified code needs to traverse the rbtree for two purposes,
1) find the flow with the desired sk_long value
2) while searching for the sk_long, collect flows that are
   the fq_gc_candidate. They will be removed from the rbtree.

This patch adds the bpf_rbtree_{root,left,right} kfunc to enable
the rbtree traversal. The returned bpf_rb_node pointer will be a
non-owning reference which is the same as the returned pointer
of the exisiting bpf_rbtree_first kfunc.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-4-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:05 -07:00
Martin KaFai Lau
7faccdf4b4 bpf: Simplify reg0 marking for the rbtree kfuncs that return a bpf_rb_node pointer
The current rbtree kfunc, bpf_rbtree_{first, remove}, returns the
bpf_rb_node pointer. The check_kfunc_call currently checks the
kfunc btf_id instead of its return pointer type to decide
if it needs to do mark_reg_graph_node(reg0) and ref_set_non_owning(reg0).

The later patch will add bpf_rbtree_{root,left,right} that will also
return a bpf_rb_node pointer. Instead of adding more kfunc btf_id
checks to the "if" case, this patch changes the test to check the
kfunc's return type. is_rbtree_node_type() function is added to
test if a pointer type is a bpf_rb_node. The callers have already
skipped the modifiers of the pointer type.

A note on the ref_set_non_owning(), although bpf_rbtree_remove()
also returns a bpf_rb_node pointer, the bpf_rbtree_remove()
has the KF_ACQUIRE flag. Thus, its reg0 will not become non-owning.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-3-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:05 -07:00
Martin KaFai Lau
b183c0123d bpf: Check KF_bpf_rbtree_add_impl for the "case KF_ARG_PTR_TO_RB_NODE"
In a later patch, two new kfuncs will take the bpf_rb_node pointer arg.

struct bpf_rb_node *bpf_rbtree_left(struct bpf_rb_root *root,
				    struct bpf_rb_node *node);
struct bpf_rb_node *bpf_rbtree_right(struct bpf_rb_root *root,
				     struct bpf_rb_node *node);

In the check_kfunc_call, there is a "case KF_ARG_PTR_TO_RB_NODE"
to check if the reg->type should be an allocated pointer or should be
a non_owning_ref.

The later patch will need to ensure that the bpf_rb_node pointer passing
to the new bpf_rbtree_{left,right} must be a non_owning_ref. This
should be the same requirement as the existing bpf_rbtree_remove.

This patch swaps the current "if else" statement. Instead of checking
the bpf_rbtree_remove, it checks the bpf_rbtree_add. Then the new
bpf_rbtree_{left,right} will fall into the "else" case to make
the later patch simpler. bpf_rbtree_add should be the only
one that needs an allocated pointer.

This should be a no-op change considering there are only two kfunc(s)
taking bpf_rb_node pointer arg, rbtree_add and rbtree_remove.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250506015857.817950-2-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-06 10:21:05 -07:00
Andrii Nakryiko
62e23f1838 libbpf: Improve BTF dedup handling of "identical" BTF types
BTF dedup has a strong assumption that compiler with deduplicate identical
types within any given compilation unit (i.e., .c file). This property
is used when establishing equilvalence of two subgraphs of types.

Unfortunately, this property doesn't always holds in practice. We've
seen cases of having truly identical structs, unions, array definitions,
and, most recently, even pointers to the same type being duplicated
within CU.

Previously, we mitigated this on a case-by-case basis, adding a few
simple heuristics for validating that two BTF types (having two
different type IDs) are structurally the same. But this approach scales
poorly, and we can have more weird cases come up in the future.

So let's take a half-step back, and implement a bit more generic
structural equivalence check, recursively. We still limit it to
reasonable depth to avoid long reference loops. Depth-wise limiting of
potentially cyclical graph isn't great, but as I mentioned below doesn't
seem to be detrimental performance-wise. We can always improve this in
the future with per-type visited markers, if necessary.

Performance-wise this doesn't seem too affect vmlinux BTF dedup, which
makes sense because this logic kicks in not so frequently and only if we
already established a canonical candidate type match, but suddenly find
a different (but probably identical) type.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20250501235231.1339822-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-05 14:51:47 -07:00
Thorsten Blum
41948afcf5 bpf: Replace offsetof() with struct_size()
Compared to offsetof(), struct_size() provides additional compile-time
checks for structs with flexible arrays (e.g., __must_be_array()).

No functional changes intended.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250503151513.343931-2-thorsten.blum@linux.dev
2025-05-05 14:21:46 -07:00
Anton Protopopov
41d4ce6df3 bpf: Fix uninitialized values in BPF_{CORE,PROBE}_READ
With the latest LLVM bpf selftests build will fail with
the following error message:

    progs/profiler.inc.h:710:31: error: default initialization of an object of type 'typeof ((parent_task)->real_cred->uid.val)' (aka 'const unsigned int') leaves the object uninitialized and is incompatible with C++ [-Werror,-Wdefault-const-init-unsafe]
      710 |         proc_exec_data->parent_uid = BPF_CORE_READ(parent_task, real_cred, uid.val);
          |                                      ^
    tools/testing/selftests/bpf/tools/include/bpf/bpf_core_read.h:520:35: note: expanded from macro 'BPF_CORE_READ'
      520 |         ___type((src), a, ##__VA_ARGS__) __r;                               \
          |                                          ^

This happens because BPF_CORE_READ (and other macro) declare the
variable __r using the ___type macro which can inherit const modifier
from intermediate types.

Fix this by using __typeof_unqual__, when supported. (And when it
is not supported, the problem shouldn't appear, as older compilers
haven't complained.)

Fixes: 792001f4f7 ("libbpf: Add user-space variants of BPF_CORE_READ() family of macros")
Fixes: a4b09a9ef9 ("libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family")
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250502193031.3522715-1-a.s.protopopov@gmail.com
2025-05-05 14:20:28 -07:00
Ihor Solodrai
a28fe31603 selftests/bpf: Remove sockmap_ktls disconnect_after_delete test
"sockmap_ktls disconnect_after_delete" is effectively moot after
disconnect has been disabled for TLS [1][2]. Remove the test
completely.

[1] https://lore.kernel.org/bpf/20250416170246.2438524-1-ihor.solodrai@linux.dev/
[2] https://lore.kernel.org/netdev/20250404180334.3224206-1-kuba@kernel.org/

Signed-off-by: Ihor Solodrai <isolodrai@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250502185221.1556192-1-isolodrai@meta.com
2025-05-05 14:20:04 -07:00
Alan Maguire
f263336a41 selftests/bpf: Add btf dedup test covering module BTF dedup
Recently issues were observed with module BTF deduplication failures
[1].  Add a dedup selftest that ensures that core kernel types are
referenced from split BTF as base BTF types.  To do this use bpf_testmod
functions which utilize core kernel types, specifically

ssize_t
bpf_testmod_test_write(struct file *file, struct kobject *kobj,
                       struct bin_attribute *bin_attr,
                       char *buf, loff_t off, size_t len);

__bpf_kfunc struct sock *bpf_kfunc_call_test3(struct sock *sk);

__bpf_kfunc void bpf_kfunc_call_test_pass_ctx(struct __sk_buff *skb);

For each of these ensure that the types they reference -
struct file, struct kobject, struct bin_attr etc - are in base BTF.
Note that because bpf_testmod.ko is built with distilled base BTF
the associated reference types - i.e. the PTR that points at a
"struct file" - will be in split BTF.  As a result the test resolves
typedef and pointer references and verifies the pointed-at or
typedef'ed type is in base BTF.  Because we use BTF from
/sys/kernel/btf/bpf_testmod relocation has occurred for the
referenced types and they will be base - not distilled base - types.

For large-scale dedup issues, we see such types appear in split BTF and
as a result this test fails.  Hence it is proposed as a test which will
fail when large-scale dedup issues have occurred.

[1] https://lore.kernel.org/dwarves/CAADnVQL+-LiJGXwxD3jEUrOonO-fX0SZC8496dVzUXvfkB7gYQ@mail.gmail.com/

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20250430134249.2451066-1-alan.maguire@oracle.com
2025-05-01 14:05:49 -07:00
Martin KaFai Lau
86870d0b8f Merge branch 'bpf-allow-xdp_redirect-for-xdp-dev-bound-programs'
Lorenzo Bianconi says:

====================
bpf: Allow XDP_REDIRECT for XDP dev-bound programs

In the current implementation if the program is dev-bound to a specific
device, it will not be possible to perform XDP_REDIRECT into a DEVMAP or
CPUMAP even if the program is running in the driver NAPI context.
Fix the issue introducing __bpf_prog_map_compatible utility routine in
order to avoid bpf_prog_is_dev_bound() during the XDP program load.
Continue forbidding to attach a dev-bound program to XDP maps.
====================

Link: https://patch.msgid.link/20250428-xdp-prog-bound-fix-v3-0-c9e9ba3300c7@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-05-01 12:54:07 -07:00
Lorenzo Bianconi
3678331ca7 selftests/bpf: xdp_metadata: Check XDP_REDIRCT support for dev-bound progs
Improve xdp_metadata bpf selftest in order to check it is possible for a
XDP dev-bound program to perform XDP_REDIRECT into a DEVMAP but it is still
not allowed to attach a XDP dev-bound program to a DEVMAP entry.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
2025-05-01 12:54:06 -07:00
Lorenzo Bianconi
714070c4cb bpf: Allow XDP dev-bound programs to perform XDP_REDIRECT into maps
In the current implementation if the program is dev-bound to a specific
device, it will not be possible to perform XDP_REDIRECT into a DEVMAP
or CPUMAP even if the program is running in the driver NAPI context and
it is not attached to any map entry. This seems in contrast with the
explanation available in bpf_prog_map_compatible routine.
Fix the issue introducing __bpf_prog_map_compatible utility routine in
order to avoid bpf_prog_is_dev_bound() check running bpf_check_tail_call()
at program load time (bpf_prog_select_runtime()).
Continue forbidding to attach a dev-bound program to XDP maps
(BPF_MAP_TYPE_PROG_ARRAY, BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_CPUMAP).

Fixes: 3d76a4d3d4 ("bpf: XDP metadata RX kfuncs")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
2025-05-01 12:54:06 -07:00
Thorsten Blum
7b05f43155 bpf: Replace offsetof() with struct_size()
Compared to offsetof(), struct_size() provides additional compile-time
checks for structs with flexible arrays (e.g., __must_be_array()).

No functional changes intended.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250428210638.30219-2-thorsten.blum@linux.dev
2025-05-01 10:37:35 -07:00
Anton Protopopov
358b1c0f56 libbpf: Use proper errno value in linker
Return values of the linker_append_sec_data() and the
linker_append_elf_relos() functions are propagated all the
way up to users of libbpf API. In some error cases these
functions return -1 which will be seen as -EPERM from user's
point of view. Instead, return a more reasonable -EINVAL.

Fixes: faf6ed321c ("libbpf: Add BPF static linker APIs")
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250430120820.2262053-1-a.s.protopopov@gmail.com
2025-04-30 09:04:20 -07:00
T.J. Mercier
38d976c32d selftests/bpf: Fix kmem_cache iterator draining
The closing parentheses around the read syscall is misplaced, causing
single byte reads from the iterator instead of buf sized reads. While
the end result is the same, many more read calls than necessary are
performed.

$ tools/testing/selftests/bpf/vmtest.sh  "./test_progs -t kmem_cache_iter"
145/1   kmem_cache_iter/check_task_struct:OK
145/2   kmem_cache_iter/check_slabinfo:OK
145/3   kmem_cache_iter/open_coded_iter:OK
145     kmem_cache_iter:OK
Summary: 1/3 PASSED, 0 SKIPPED, 0 FAILED

Fixes: a496d0cdc8 ("selftests/bpf: Add a test for kmem_cache_iter")
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://patch.msgid.link/20250428180256.1482899-1-tjmercier@google.com
2025-04-29 13:21:48 -07:00
Alan Maguire
8e64c387c9 libbpf: Add identical pointer detection to btf_dedup_is_equiv()
Recently as a side-effect of

commit ac053946f5 ("compiler.h: introduce TYPEOF_UNQUAL() macro")

issues were observed in deduplication between modules and kernel BTF
such that a large number of kernel types were not deduplicated so
were found in module BTF (task_struct, bpf_prog etc).  The root cause
appeared to be a failure to dedup struct types, specifically those
with members that were pointers with __percpu annotations.

The issue in dedup is at the point that we are deduplicating structures,
we have not yet deduplicated reference types like pointers.  If multiple
copies of a pointer point at the same (deduplicated) integer as in this
case, we do not see them as identical.  Special handling already exists
to deal with structures and arrays, so add pointer handling here too.

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250429161042.2069678-1-alan.maguire@oracle.com
2025-04-29 10:16:23 -07:00
Alexei Starovoitov
224ee86639 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4
Cross-merge bpf and other fixes after downstream PRs.

No conflicts.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-28 08:40:45 -07:00
Linus Torvalds
b4432656b3 Linux 6.15-rc4 v6.15-rc4 2025-04-27 15:19:23 -07:00
Linus Torvalds
5bc1018675 Merge tag 'pci-v6.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI fixes from Bjorn Helgaas:

 - When releasing a start-aligned resource, e.g., a bridge window, save
   start/end/flags for the next assignment attempt; fixes a v6.15-rc1
   regression (Ilpo Järvinen)

 - Move set_pcie_speed.sh from TEST_PROGS to TEST_FILE; fixes a bwctrl
   selftest v6.15-rc1 regression (Ilpo Järvinen)

 - Add Manivannan Sadhasivam as maintainer of native host bridge and
   endpoint drivers (Manivannan Sadhasivam)

 - In endpoint test driver, defer IRQ allocation from .probe() until
   ioctl() to fix a regression on platforms where the Vendor/Device ID
   match doesn't include driver_data (Niklas Cassel)

* tag 'pci-v6.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  misc: pci_endpoint_test: Defer IRQ allocation until ioctl(PCITEST_SET_IRQTYPE)
  MAINTAINERS: Move Manivannan Sadhasivam as PCI Native host bridge and endpoint maintainer
  selftests/pcie_bwctrl: Fix test progs list
  PCI: Restore assigned resources fully after release
2025-04-26 13:02:36 -07:00
Linus Torvalds
d22aad29de Merge tag 'nfsd-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fix from Chuck Lever:

 - Revert a v6.15 patch due to a report of SELinux test failures

* tag 'nfsd-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  Revert "sunrpc: clean cache_detail immediately when flush is written frequently"
2025-04-26 10:43:03 -07:00
Linus Torvalds
06b31bdbf8 Merge tag 'x86-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:

 - Fix 32-bit kernel boot crash if passed physical memory with more than
   32 address bits

 - Fix Xen PV crash

 - Work around build bug in certain limited build environments

 - Fix CTEST instruction decoding in insn_decoder_test

* tag 'x86-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn: Fix CTEST instruction decoding
  x86/boot: Work around broken busybox 'truncate' tool
  x86/mm: Fix _pgd_alloc() for Xen PV mode
  x86/e820: Discard high memory that can't be addressed by 32-bit systems
2025-04-26 09:45:54 -07:00
Linus Torvalds
3d23ef05c3 Merge tag 'sched-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "Fix sporadic crashes in dequeue_entities() due to ... bad math.

  [ Arguably if pick_eevdf()/pick_next_entity() was less trusting of
    complex math being correct it could have de-escalated a crash into
    a warning, but that's for a different patch ]"

* tag 'sched-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/eevdf: Fix se->slice being set to U64_MAX and resulting crash
2025-04-26 09:23:20 -07:00
Linus Torvalds
86baa5499c Merge tag 'perf-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc perf events fixes from Ingo Molnar:

 - Use POLLERR for events in error state, instead of the ambiguous
   POLLHUP error value

 - Fix non-sampling (counting) events on certain x86 platforms

* tag 'perf-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix non-sampling (counting) events on certain x86 platforms
  perf/core: Change to POLLERR for pinned events with error
2025-04-26 09:13:09 -07:00
Linus Torvalds
a226e6540b Merge tag 'irq-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
 "Fix crashes in the gic-v2m irqchip driver, caused by an incorrect
  __init annotation"

* tag 'irq-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()
2025-04-26 09:08:45 -07:00
Linus Torvalds
e742bd1990 Merge tag 'loongarch-fixes-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
 "Add a missing Kconfig option, fix some bugs in exception handlers,
  memory management and KVM"

* tag 'loongarch-fixes-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Fix PMU pass-through issue if VM exits to host finally
  LoongArch: KVM: Fully clear some CSRs when VM reboot
  LoongArch: KVM: Fix multiple typos of KVM code
  LoongArch: Return NULL from huge_pte_offset() for invalid PMD
  LoongArch: Remove a bogus reference to ZONE_DMA
  LoongArch: Handle fp, lsx, lasx and lbt assembly symbols
  LoongArch: Make do_xyz() exception handlers more robust
  LoongArch: Make regs_irqs_disabled() more clear
  LoongArch: Select ARCH_USE_MEMTEST
2025-04-26 09:02:41 -07:00
Linus Torvalds
ec0c2d5359 Merge tag 'for-linus' of https://github.com/openrisc/linux
Pull OpenRISC updates from Stafford Horne:

 - Support for cacheinfo API to expose OpenRISC cache info via sysfs,
   this also translated to some cleanups to OpenRISC cache flush and
   invalidate API's

 - Documentation updates for new mailing list and toolchain binaries

* tag 'for-linus' of https://github.com/openrisc/linux:
  Documentation: openrisc: Update toolchain binaries URL
  Documentation: openrisc: Update mailing list
  openrisc: Add cacheinfo support
  openrisc: Introduce new utility functions to flush and invalidate caches
  openrisc: Refactor struct cpuinfo_or1k to reduce duplication
2025-04-26 09:01:13 -07:00
Chuck Lever
831e3f545b Revert "sunrpc: clean cache_detail immediately when flush is written frequently"
Ondrej reports that certain SELinux tests are failing after commit
fc2a169c56 ("sunrpc: clean cache_detail immediately when flush is
written frequently"), merged during the v6.15 merge window.

Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Fixes: fc2a169c56 ("sunrpc: clean cache_detail immediately when flush is written frequently")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-04-26 12:00:43 -04:00
Linus Torvalds
a16ebe51a6 Merge tag 'move-lib-kunit-v6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kunit fix from Kees Cook:
 "A single fix for the kunit lib/tests/ relocation:

   - Ensure prime numbers tests are included in KUnit test runs (Mark Brown)"

* tag 'move-lib-kunit-v6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  lib: Ensure prime numbers tests are included in KUnit test runs
2025-04-26 08:55:24 -07:00
Linus Torvalds
fa573aefdf Merge tag 'drm-fixes-2025-04-26' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Weekly drm fixes, mostly amdgpu, with some exynos cleanups and a
  couple of minor fixes, seems a bit quiet, but probably some lag from
  Easter holidays.

  amdgpu:
   - P2P DMA fixes
   - Display reset fixes
   - DCN 3.5 fixes
   - ACPI EDID fix
   - LTTPR fix
   - mode_valid() fix

  exynos:
   - fix spelling error
   - remove redundant error handling in exynos_drm_vidi.c module
   - marks struct decon_data as const in the exynos7_drm_decon driver
     since it is only read
   - Remove unnecessary checking in exynos_drm_drv.c module

  meson:
   - Fix VCLK calculation

  panel:
   - jd9365a: Fix reset polarity"

* tag 'drm-fixes-2025-04-26' of https://gitlab.freedesktop.org/drm/kernel:
  drm/exynos: Fix spelling mistake "enqueu" -> "enqueue"
  drm/exynos: exynos7_drm_decon: Consstify struct decon_data
  drm/exynos: fixed a spelling error
  drm/exynos/vidi: Remove redundant error handling in vidi_get_modes()
  drm/exynos: Remove unnecessary checking
  drm/amd/display: do not copy invalid CRTC timing info
  drm/amd/display: Default IPS to RCG_IN_ACTIVE_IPS2_IN_OFF
  drm/amd/display: Use 16ms AUX read interval for LTTPR with old sinks
  drm/amd/display: Fix ACPI edid parsing on some Lenovo systems
  drm/amdgpu: Allow P2P access through XGMI
  drm/amd/display: Enable urgent latency adjustment on DCN35
  drm/amd/display: Force full update in gpu reset
  drm/amd/display: Fix gpu reset in multidisplay config
  drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY
  drm/amdgpu: Use allowed_domains for pinning dmabufs
  drm: panel: jd9365da: fix reset signal polarity in unprepare
  drm/meson: use unsigned long long / Hz for frequency types
  Revert "drm/meson: vclk: fix calculation of 59.94 fractional rates"
2025-04-26 08:32:29 -07:00
Omar Sandoval
bbce3de72b sched/eevdf: Fix se->slice being set to U64_MAX and resulting crash
There is a code path in dequeue_entities() that can set the slice of a
sched_entity to U64_MAX, which sometimes results in a crash.

The offending case is when dequeue_entities() is called to dequeue a
delayed group entity, and then the entity's parent's dequeue is delayed.
In that case:

1. In the if (entity_is_task(se)) else block at the beginning of
   dequeue_entities(), slice is set to
   cfs_rq_min_slice(group_cfs_rq(se)). If the entity was delayed, then
   it has no queued tasks, so cfs_rq_min_slice() returns U64_MAX.
2. The first for_each_sched_entity() loop dequeues the entity.
3. If the entity was its parent's only child, then the next iteration
   tries to dequeue the parent.
4. If the parent's dequeue needs to be delayed, then it breaks from the
   first for_each_sched_entity() loop _without updating slice_.
5. The second for_each_sched_entity() loop sets the parent's ->slice to
   the saved slice, which is still U64_MAX.

This throws off subsequent calculations with potentially catastrophic
results. A manifestation we saw in production was:

6. In update_entity_lag(), se->slice is used to calculate limit, which
   ends up as a huge negative number.
7. limit is used in se->vlag = clamp(vlag, -limit, limit). Because limit
   is negative, vlag > limit, so se->vlag is set to the same huge
   negative number.
8. In place_entity(), se->vlag is scaled, which overflows and results in
   another huge (positive or negative) number.
9. The adjusted lag is subtracted from se->vruntime, which increases or
   decreases se->vruntime by a huge number.
10. pick_eevdf() calls entity_eligible()/vruntime_eligible(), which
    incorrectly returns false because the vruntime is so far from the
    other vruntimes on the queue, causing the
    (vruntime - cfs_rq->min_vruntime) * load calulation to overflow.
11. Nothing appears to be eligible, so pick_eevdf() returns NULL.
12. pick_next_entity() tries to dereference the return value of
    pick_eevdf() and crashes.

Dumping the cfs_rq states from the core dumps with drgn showed tell-tale
huge vruntime ranges and bogus vlag values, and I also traced se->slice
being set to U64_MAX on live systems (which was usually "benign" since
the rest of the runqueue needed to be in a particular state to crash).

Fix it in dequeue_entities() by always setting slice from the first
non-empty cfs_rq.

Fixes: aef6987d89 ("sched/eevdf: Propagate min_slice up the cgroup hierarchy")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/f0c2d1072be229e1bdddc73c0703919a8b00c652.1745570998.git.osandov@fb.com
2025-04-26 10:44:36 +02:00
Suzuki K Poulose
3318dc299b irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()
With ACPI in place, gicv2m_get_fwnode() is registered with the pci
subsystem as pci_msi_get_fwnode_cb(), which may get invoked at runtime
during a PCI host bridge probe. But, the call back is wrongly marked as
__init, causing it to be freed, while being registered with the PCI
subsystem and could trigger:

 Unable to handle kernel paging request at virtual address ffff8000816c0400
  gicv2m_get_fwnode+0x0/0x58 (P)
  pci_set_bus_msi_domain+0x74/0x88
  pci_register_host_bridge+0x194/0x548

This is easily reproducible on a Juno board with ACPI boot.

Retain the function for later use.

Fixes: 0644b3daca ("irqchip/gic-v2m: acpi: Introducing GICv2m ACPI support")
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
2025-04-26 10:17:24 +02:00
Bibo Mao
5add0dbbeb LoongArch: KVM: Fix PMU pass-through issue if VM exits to host finally
In function kvm_pre_enter_guest(), it prepares to enter guest and check
whether there are pending signals or events. And it will not enter guest
if there are, PMU pass-through preparation for guest should be cancelled
and host should own PMU hardware.

Cc: stable@vger.kernel.org
Fixes: f4e40ea9f7 ("LoongArch: KVM: Add PMU support for guest")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:13 +08:00
Bibo Mao
9ea86232a5 LoongArch: KVM: Fully clear some CSRs when VM reboot
Some registers such as LOONGARCH_CSR_ESTAT and LOONGARCH_CSR_GINTC are
partly cleared with function _kvm_setcsr(). This comes from the hardware
specification, some bits are read only in VM mode, and however they can
be written in host mode. So they are partly cleared in VM mode, and can
be fully cleared in host mode.

These read only bits show pending interrupt or exception status. When VM
reset, the read-only bits should be cleared, otherwise vCPU will receive
unknown interrupts in boot stage.

Here registers LOONGARCH_CSR_ESTAT/LOONGARCH_CSR_GINTC are fully cleared
in ioctl KVM_REG_LOONGARCH_VCPU_RESET vCPU reset path.

Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:13 +08:00
Yulong Han
8b2d01fec8 LoongArch: KVM: Fix multiple typos of KVM code
Fix multiple typos inside arch/loongarch/kvm.

Cc: stable@vger.kernel.org
Reviewed-by: Yuli Wang <wangyuli@uniontech.com>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Yulong Han <wheatfox17@icloud.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:13 +08:00
Ming Wang
bd51834d1c LoongArch: Return NULL from huge_pte_offset() for invalid PMD
LoongArch's huge_pte_offset() currently returns a pointer to a PMD slot
even if the underlying entry points to invalid_pte_table (indicating no
mapping). Callers like smaps_hugetlb_range() fetch this invalid entry
value (the address of invalid_pte_table) via this pointer.

The generic is_swap_pte() check then incorrectly identifies this address
as a swap entry on LoongArch, because it satisfies the "!pte_present()
&& !pte_none()" conditions. This misinterpretation, combined with a
coincidental match by is_migration_entry() on the address bits, leads to
kernel crashes in pfn_swap_entry_to_page().

Fix this at the architecture level by modifying huge_pte_offset() to
check the PMD entry's content using pmd_none() before returning. If the
entry is invalid (i.e., it points to invalid_pte_table), return NULL
instead of the pointer to the slot.

Cc: stable@vger.kernel.org
Acked-by: Peter Xu <peterx@redhat.com>
Co-developed-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Ming Wang <wangming01@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:12 +08:00
Petr Tesarik
c37325cbd9 LoongArch: Remove a bogus reference to ZONE_DMA
Remove dead code. LoongArch does not have a DMA memory zone (24bit DMA).
The architecture does not even define MAX_DMA_PFN.

Cc: stable@vger.kernel.org
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:12 +08:00
Tiezhu Yang
2ef174b133 LoongArch: Handle fp, lsx, lasx and lbt assembly symbols
Like the other relevant symbols, export some fp, lsx, lasx and lbt
assembly symbols and put the function declarations in header files
rather than source files.

While at it, use "asmlinkage" for the other existing C prototypes
of assembly functions and also do not use the "extern" keyword with
function declarations according to the document coding-style.rst.

Cc: stable@vger.kernel.org # 6.6+
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:12 +08:00
Tiezhu Yang
cc73cc6bcd LoongArch: Make do_xyz() exception handlers more robust
Currently, interrupts need to be disabled before single-step mode is
set, it requires that CSR_PRMD_PIE be cleared in save_local_irqflag()
which is called by setup_singlestep(), this is reasonable.

But in the first kprobe breakpoint exception, if the irq is enabled at
the beginning of do_bp(), it will not be disabled at the end of do_bp()
due to the CSR_PRMD_PIE has been cleared in save_local_irqflag(). So for
this case, it may corrupt exception context when restoring the exception
after do_bp() in handle_bp(), this is not reasonable.

In order to restore exception safely in handle_bp(), it needs to ensure
the irq is disabled at the end of do_bp(), so just add a local variable
to record the original interrupt status in the parent context, then use
it as the check condition to enable and disable irq in do_bp().

While at it, do the similar thing for other do_xyz() exception handlers
to make them more robust.

Fixes: 6d4cc40fb5 ("LoongArch: Add kprobes support")
Suggested-by: Jinyang He <hejinyang@loongson.cn>
Suggested-by: Huacai Chen <chenhuacai@loongson.cn>
Co-developed-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:12 +08:00
Tiezhu Yang
bb0511d59d LoongArch: Make regs_irqs_disabled() more clear
In the current code, the definition of regs_irqs_disabled() is actually
"!(regs->csr_prmd & CSR_CRMD_IE)" because arch_irqs_disabled_flags() is
defined as "!(flags & CSR_CRMD_IE)", it looks a little strange.

Define regs_irqs_disabled() as !(regs->csr_prmd & CSR_PRMD_PIE) directly
to make it more clear, no functional change.

While at it, the return value of regs_irqs_disabled() is true or false,
so change its type to reflect that and also make it always inline.

Fixes: 803b0fc5c3 ("LoongArch: Add process management")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:12 +08:00
Yuli Wang
fb8e9f59d6 LoongArch: Select ARCH_USE_MEMTEST
As of commit dce4456619 ("mm/memtest: add ARCH_USE_MEMTEST"),
architectures must select ARCH_USE_MEMTESET to enable CONFIG_MEMTEST.

Commit 628c3bb40e ("LoongArch: Add boot and setup routines") added
support for early_memtest but did not select ARCH_USE_MEMTESET.

Fixes: 628c3bb40e ("LoongArch: Add boot and setup routines")
Tested-by: Erpeng Xu <xuerpeng@uniontech.com>
Tested-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-04-26 09:58:12 +08:00
Linus Torvalds
f1a3944c86 Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:

 - Add namespace to BPF internal symbols (Alexei Starovoitov)

 - Fix possible endless loop in BPF map iteration (Brandon Kammerdiener)

 - Fix compilation failure for samples/bpf on LoongArch (Haoran Jiang)

 - Disable a part of sockmap_ktls test (Ihor Solodrai)

 - Correct typo in __clang_major__ macro (Peilin Ye)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Correct typo in __clang_major__ macro
  samples/bpf: Fix compilation failure for samples/bpf on LoongArch Fedora
  bpf: Add namespace to BPF internal symbols
  selftests/bpf: add test for softlock when modifying hashmap while iterating
  bpf: fix possible endless loop in BPF map iteration
  selftests/bpf: Mitigate sockmap_ktls disconnect_after_delete failure
2025-04-25 17:53:09 -07:00
Peilin Ye
f000791078 selftests/bpf: Correct typo in __clang_major__ macro
Make sure that CAN_USE_BPF_ST test (compute_live_registers/store) is
enabled when __clang_major__ >= 18.

Fixes: 2ea8f6a1cd ("selftests/bpf: test cases for compute_live_registers()")
Signed-off-by: Peilin Ye <yepeilin@google.com>
Link: https://lore.kernel.org/r/20250425213712.1542077-1-yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-25 16:56:10 -07:00
Linus Torvalds
1eb09e624f Merge tag 'ata-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal:

 - Fix the incorrect return type of ata_mselect_control_ata_feature()

 - Several fixes for the control of the Command Duration Limits feature
   to avoid unnecessary enable and disable actions. Avoiding the
   unnecessary enable action also avoids unwanted resets of the CDL
   statistics log page as that is implied for any enable action.

 - Fix the translation for sensing the control mode page to correctly
   return the last enable or disable action performed, as defined in
   SAT-6. This correct mode sense information is used to fix the
   behavior of the scsi layer to avoid unnecessary mode select command
   issuing.

* tag 'ata-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  scsi: Improve CDL control
  ata: libata-scsi: Improve CDL control
  ata: libata-scsi: Fix ata_msense_control_ata_feature()
  ata: libata-scsi: Fix ata_mselect_control_ata_feature() return type
2025-04-25 16:31:10 -07:00
Linus Torvalds
eb98f30442 Merge tag 'vfs-6.15-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:

 - For some reason we went from zero to three maintainers for HFS/HFS+
   in a matter of days. The lesson to learn from this might just be that
   we need to threaten code removal more often!?

 - Fix a regression introduced by enabling large folios for lage logical
   block sizes. This has caused issues for noref migration with large
   folios due to sleeping while in an atomic context.

   New sleeping variants of pagecache lookup helpers are introduced.
   These helpers take the folio lock instead of the mapping's private
   spinlock. The problematic users are converted to the sleeping
   variants and serialize against noref migration. Atomic users will
   bail on seeing the new BH_Migrate flag.

   This also shrinks the critical region of the mapping's private lock
   and the new blocking callers reduce contention on the spinlock for
   bdev mappings.

 - Fix two bugs in do_move_mount() when with MOVE_MOUNT_BENEATH. The
   first bug is using a mountpoint that is located on a mount we're not
   holding a reference to. The second bug is putting the mountpoint
   after we've called namespace_unlock() as it's no longer guaranteed
   that it does stay a mountpoint.

 - Remove a pointless call to vfs_getattr_nosec() in the devtmpfs code
   just to query i_mode instead of simply querying the inode directly.
   This also avoids lifetime issues for the dm code by an earlier bugfix
   this cycle that moved bdev_statx() handling into vfs_getattr_nosec().

 - Fix AT_FDCWD handling with getname_maybe_null() in the xattr code.

 - Fix a performance regression for files when multiple callers issue a
   close when it's not the last reference.

 - Remove a duplicate noinline annotation from pipe_clear_nowait().

* tag 'vfs-6.15-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs/xattr: Fix handling of AT_FDCWD in setxattrat(2) and getxattrat(2)
  MAINTAINERS: hfs/hfsplus: add myself as maintainer
  splice: remove duplicate noinline from pipe_clear_nowait
  devtmpfs: don't use vfs_getattr_nosec to query i_mode
  fix a couple of races in MNT_TREE_BENEATH handling by do_move_mount()
  fs: fall back to file_ref_put() for non-last reference
  mm/migrate: fix sleep in atomic for large folios and buffer heads
  fs/ext4: use sleeping version of sb_find_get_block()
  fs/jbd2: use sleeping version of __find_get_block()
  fs/ocfs2: use sleeping version of __find_get_block()
  fs/buffer: use sleeping version of __find_get_block()
  fs/buffer: introduce sleeping flavors for pagecache lookups
  MAINTAINERS: add HFS/HFS+ maintainers
  fs/buffer: split locking for pagecache lookups
2025-04-25 15:57:21 -07:00