linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-11 10:43:46 -04:00

Author	SHA1	Message	Date
Thomas Gleixner	7e04e5c6f6	genirq/manage: Rework __irq_apply_affinity_hint() Use the new guards to get and lock the interrupt descriptor and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.897188799@linutronix.de	2025-05-07 09:08:15 +02:00
Thomas Gleixner	b0561582ea	genirq/manage: Rework irq_update_affinity_desc() Use the new guards to get and lock the interrupt descriptor and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.830357569@linutronix.de	2025-05-07 09:08:15 +02:00
Thomas Gleixner	17c1953567	genirq/manage: Convert to lock guards Convert lock/unlock pairs to guards. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.771476066@linutronix.de	2025-05-07 09:08:14 +02:00
Thomas Gleixner	0c169edf36	genirq/manage: Cleanup kernel doc comments Get rid of the extra tab to make it consistent. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.710273122@linutronix.de	2025-05-07 09:08:14 +02:00
Thomas Gleixner	95a3645893	genirq/chip: Rework irq_modify_status() Use the new guards to get and lock the interrupt descriptor and tidy up the code. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.650454052@linutronix.de	2025-05-07 09:08:14 +02:00
Thomas Gleixner	5cd05f3e23	genirq/chip: Rework irq_set_handler() variants Use the new guards to get and lock the interrupt descriptor and tidy up the code. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.590753128@linutronix.de	2025-05-07 09:08:14 +02:00
Thomas Gleixner	b3801ddc68	genirq/chip: Rework irq_set_chip_data() Use the new guards to get and lock the interrupt descriptor and tidy up the code. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.532308759@linutronix.de	2025-05-07 09:08:14 +02:00
Thomas Gleixner	c836e5a70c	genirq/chip: Rework irq_set_msi_desc_off() Use the new guards to get and lock the interrupt descriptor and tidy up the code. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.473563978@linutronix.de	2025-05-07 09:08:14 +02:00
Thomas Gleixner	321a0fdf13	genirq/chip: Rework irq_set_handler_data() Use the new guards to get and lock the interrupt descriptor and tidy up the code. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.415072350@linutronix.de	2025-05-07 09:08:14 +02:00
Thomas Gleixner	fa870e0f35	genirq/chip: Rework irq_set_irq_type() Use the new guards to get and lock the interrupt descriptor and tidy up the code. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.355673840@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	46ff4d11f0	genirq/chip: Rework irq_set_chip() Use the new guards to get and lock the interrupt descriptor and tidy up the code. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.295400891@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	e7c6542557	genirq/chip: Use lock guards where applicable Convert all lock/unlock pairs to guards and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.236248749@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	f71d7c45ed	genirq/chip: Rework handle_fasteoi_mask_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Note: The mask_irq() operation in the second condition was redundant as the interrupt is already masked right at the beginning of the function. Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.175652864@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	2beb01cbb7	genirq/chip: Rework handle_fasteoi_ack_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.105015800@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	2d46aea52c	genirq/chip: Rework handle_edge_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065421.045492336@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	15d772e2ee	genirq/chip: Rework handle_eoi_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.986002418@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	2334c45521	genirq/chip: Rework handle_level_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.926362488@linutronix.de	2025-05-07 09:08:13 +02:00
Thomas Gleixner	a155777175	genirq/chip: Rework handle_untracked_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.865212916@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	1a3678675f	genirq/chip: Rework handle_simple_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.804683349@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	2ef2e13094	genirq/chip: Rework handle_nested_irq() Use the new helpers to decide whether the interrupt should be handled and switch the descriptor locking to guard(). Fixup the kernel doc comment while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.744042890@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	a6d8d0d12e	genirq/chip: Prepare for code reduction The interrupt flow handlers have similar patterns to decide whether to handle an interrupt or not. Provide common helper functions to allow removal of duplicated code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.682547546@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	ecb84a3e7e	genirq/debugfs: Convert to lock guards Convert all lock/unlock pairs to guards and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.620200108@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	88a4df117a	genirq/cpuhotplug: Convert to lock guards Convert all lock/unlock pairs to guards and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.560083665@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	113332a865	genirq/spurious: Switch to lock guards Convert all lock/unlock pairs to guards and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.497714413@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	e815ffc759	genirq/spurious: Cleanup code Clean up the coding style No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.437285102@linutronix.de	2025-05-07 09:08:12 +02:00
Thomas Gleixner	659ff9c9d7	genirq/proc: Switch to lock guards Convert all lock/unlock pairs to guards and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.373998838@linutronix.de	2025-05-07 09:08:11 +02:00
Thomas Gleixner	4bcdf07467	genirq/resend: Switch to lock guards Convert all lock/unlock pairs to guards and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.312487167@linutronix.de	2025-05-07 09:08:11 +02:00
Thomas Gleixner	19b4b14428	genirq/pm: Switch to lock guards Convert all lock/unlock pairs to guards and tidy up the code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.251299112@linutronix.de	2025-05-07 09:08:11 +02:00
Thomas Gleixner	e80618b27a	genirq/autoprobe: Switch to lock guards Convert all lock/unlock pairs to guards. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.188866381@linutronix.de	2025-05-07 09:08:11 +02:00
Thomas Gleixner	5d964a9f7c	genirq/irqdesc: Switch to lock guards Replace all lock/unlock pairs with lock guards and simplify the code flow. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiri Slaby <jirislaby@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/871ptaqhoo.ffs@tglx	2025-05-07 09:08:11 +02:00
Thomas Gleixner	0f70a49f3f	genirq: Provide conditional lock guards The interrupt core code has an ever repeating pattern: unsigned long flags; struct irq_desc *desc = irq_get_desc_[bus]lock(irq, &flags, mode); if (!desc) return -EINVAL; .... irq_put_desc_[bus]unlock(desc, flags); That requires gotos in failure paths and just creates visual clutter. Provide lock guards, which allow to simplify the code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250429065420.061659985@linutronix.de	2025-05-07 09:08:11 +02:00
Martin KaFai Lau	fb5b480205	bpf: Add bpf_list_{front,back} kfunc In the kernel fq qdisc implementation, it only needs to look at the fields of the first node in a list but does not always need to remove it from the list. It is more convenient to have a peek kfunc for the list. It works similar to the bpf_rbtree_first(). This patch adds bpf_list_{front,back} kfunc. The verifier is changed such that the kfunc returning "struct bpf_list_node *" will be marked as non-owning. The exception is the KF_ACQUIRE kfunc. The net effect is only the new bpf_list_{front,back} kfuncs will have its return pointer marked as non-owning. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250506015857.817950-8-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-06 10:21:05 -07:00
Martin KaFai Lau	3fab84f00d	bpf: Simplify reg0 marking for the list kfuncs that return a bpf_list_node pointer The next patch will add bpf_list_{front,back} kfuncs to peek the head and tail of a list. Both of them will return a 'struct bpf_list_node *'. Follow the earlier change for rbtree, this patch checks the return btf type is a 'struct bpf_list_node' pointer instead of checking each kfuncs individually to decide if mark_reg_graph_node should be called. This will make the bpf_list_{front,back} kfunc addition easier in the later patch. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250506015857.817950-7-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-06 10:21:05 -07:00
Martin KaFai Lau	2ddef1783c	bpf: Allow refcounted bpf_rb_node used in bpf_rbtree_{remove,left,right} The bpf_rbtree_{remove,left,right} requires the root's lock to be held. They also check the node_internal->owner is still owned by that root before proceeding, so it is safe to allow refcounted bpf_rb_node pointer to be used in these kfuncs. In a bpf fq implementation which is much closer to the kernel fq, https://lore.kernel.org/bpf/20250418224652.105998-13-martin.lau@linux.dev/, a networking flow (allocated by bpf_obj_new) can be added to two different rbtrees. There are cases that the flow is searched from one rbtree, held the refcount of the flow, and then removed from another rbtree: struct fq_flow { struct bpf_rb_node fq_node; struct bpf_rb_node rate_node; struct bpf_refcount refcount; unsigned long sk_long; }; int bpf_fq_enqueue(...) { /* ... / bpf_spin_lock(&root->lock); while (can_loop) { / ... / if (!p) break; gc_f = bpf_rb_entry(p, struct fq_flow, fq_node); if (gc_f->sk_long == sk_long) { f = bpf_refcount_acquire(gc_f); break; } / ... */ } bpf_spin_unlock(&root->lock); if (f) { bpf_spin_lock(&q->lock); bpf_rbtree_remove(&q->delayed, &f->rate_node); bpf_spin_unlock(&q->lock); } } bpf_rbtree_{left,right} do not need this change but are relaxed together with bpf_rbtree_remove instead of adding extra verifier logic to exclude these kfuncs. To avoid bi-sect failure, this patch also changes the selftests together. The "rbtree_api_remove_unadded_node" is not expecting verifier's error. The test now expects bpf_rbtree_remove(&groot, &m->node) to return NULL. The test uses __retval(0) to ensure this NULL return value. Some of the "only take non-owning..." failure messages are changed also. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250506015857.817950-5-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-06 10:21:05 -07:00
Martin KaFai Lau	9e3e66c553	bpf: Add bpf_rbtree_{root,left,right} kfunc In a bpf fq implementation that is much closer to the kernel fq, it will need to traverse the rbtree: https://lore.kernel.org/bpf/20250418224652.105998-13-martin.lau@linux.dev/ The much simplified logic that uses the bpf_rbtree_{root,left,right} to traverse the rbtree is like: struct fq_flow { struct bpf_rb_node fq_node; struct bpf_rb_node rate_node; struct bpf_refcount refcount; unsigned long sk_long; }; struct fq_flow_root { struct bpf_spin_lock lock; struct bpf_rb_root root __contains(fq_flow, fq_node); }; struct fq_flow fq_classify(...) { struct bpf_rb_node tofree[FQ_GC_MAX]; struct fq_flow_root root; struct fq_flow gc_f, f; struct bpf_rb_node p; int i, fcnt = 0; /* ... / f = NULL; bpf_spin_lock(&root->lock); p = bpf_rbtree_root(&root->root); while (can_loop) { if (!p) break; gc_f = bpf_rb_entry(p, struct fq_flow, fq_node); if (gc_f->sk_long == sk_long) { f = bpf_refcount_acquire(gc_f); break; } / To be removed from the rbtree / if (fcnt < FQ_GC_MAX && fq_gc_candidate(gc_f, jiffies_now)) tofree[fcnt++] = p; if (gc_f->sk_long > sk_long) p = bpf_rbtree_left(&root->root, p); else p = bpf_rbtree_right(&root->root, p); } / remove from the rbtree / for (i = 0; i < fcnt; i++) { p = tofree[i]; tofree[i] = bpf_rbtree_remove(&root->root, p); } bpf_spin_unlock(&root->lock); / bpf_obj_drop the fq_flow(s) that have just been removed * from the rbtree. */ for (i = 0; i < fcnt; i++) { p = tofree[i]; if (p) { gc_f = bpf_rb_entry(p, struct fq_flow, fq_node); bpf_obj_drop(gc_f); } } return f; } The above simplified code needs to traverse the rbtree for two purposes, 1) find the flow with the desired sk_long value 2) while searching for the sk_long, collect flows that are the fq_gc_candidate. They will be removed from the rbtree. This patch adds the bpf_rbtree_{root,left,right} kfunc to enable the rbtree traversal. The returned bpf_rb_node pointer will be a non-owning reference which is the same as the returned pointer of the exisiting bpf_rbtree_first kfunc. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250506015857.817950-4-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-06 10:21:05 -07:00
Martin KaFai Lau	7faccdf4b4	bpf: Simplify reg0 marking for the rbtree kfuncs that return a bpf_rb_node pointer The current rbtree kfunc, bpf_rbtree_{first, remove}, returns the bpf_rb_node pointer. The check_kfunc_call currently checks the kfunc btf_id instead of its return pointer type to decide if it needs to do mark_reg_graph_node(reg0) and ref_set_non_owning(reg0). The later patch will add bpf_rbtree_{root,left,right} that will also return a bpf_rb_node pointer. Instead of adding more kfunc btf_id checks to the "if" case, this patch changes the test to check the kfunc's return type. is_rbtree_node_type() function is added to test if a pointer type is a bpf_rb_node. The callers have already skipped the modifiers of the pointer type. A note on the ref_set_non_owning(), although bpf_rbtree_remove() also returns a bpf_rb_node pointer, the bpf_rbtree_remove() has the KF_ACQUIRE flag. Thus, its reg0 will not become non-owning. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250506015857.817950-3-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-06 10:21:05 -07:00
Martin KaFai Lau	b183c0123d	bpf: Check KF_bpf_rbtree_add_impl for the "case KF_ARG_PTR_TO_RB_NODE" In a later patch, two new kfuncs will take the bpf_rb_node pointer arg. struct bpf_rb_node bpf_rbtree_left(struct bpf_rb_root root, struct bpf_rb_node node); struct bpf_rb_node bpf_rbtree_right(struct bpf_rb_root root, struct bpf_rb_node node); In the check_kfunc_call, there is a "case KF_ARG_PTR_TO_RB_NODE" to check if the reg->type should be an allocated pointer or should be a non_owning_ref. The later patch will need to ensure that the bpf_rb_node pointer passing to the new bpf_rbtree_{left,right} must be a non_owning_ref. This should be the same requirement as the existing bpf_rbtree_remove. This patch swaps the current "if else" statement. Instead of checking the bpf_rbtree_remove, it checks the bpf_rbtree_add. Then the new bpf_rbtree_{left,right} will fall into the "else" case to make the later patch simpler. bpf_rbtree_add should be the only one that needs an allocated pointer. This should be a no-op change considering there are only two kfunc(s) taking bpf_rb_node pointer arg, rbtree_add and rbtree_remove. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250506015857.817950-2-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-06 10:21:05 -07:00
Al Viro	2dbf6e0df4	kill vfs_submount() The last remaining user of vfs_submount() (tracefs) is easy to convert to fs_context_for_submount(); do that and bury that thing, along with SB_SUBMOUNT Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-05-06 12:49:07 -04:00
Waiman Long	cdb7d2d68c	locking/lockdep: Add number of dynamic keys to /proc/lockdep_stats There have been recent reports about running out of lockdep keys: MAX_LOCKDEP_KEYS too low! One possible reason is that too many dynamic keys have been registered. A possible culprit is the lockdep_register_key() call in qdisc_alloc() of net/sched/sch_generic.c. Currently, there is no way to find out how many dynamic keys have been registered. Add such a stat to the /proc/lockdep_stats to get better clarity. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20250506042049.50060-4-boqun.feng@gmail.com	2025-05-06 18:34:43 +02:00
Waiman Long	6a1a219f53	locking/lockdep: Prevent abuse of lockdep subclass To catch the code trying to use a subclass value >= MAX_LOCKDEP_SUBCLASSES (8), add a DEBUG_LOCKS_WARN_ON() statement to notify the users that such a large value is not allowed. [ boqun: Reword the commit log with a more objective tone ] Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20250506042049.50060-3-boqun.feng@gmail.com	2025-05-06 18:34:35 +02:00
Andy Shevchenko	96ca1830e1	locking/lockdep: Move hlock_equal() to the respective #ifdeffery When hlock_equal() is unused, it prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y, CONFIG_LOCKDEP=y and CONFIG_LOCKDEP_SMALL=n: lockdep.c:2005:20: error: unused function 'hlock_equal' [-Werror,-Wunused-function] Fix this by moving the function to the respective existing ifdeffery for its the only user. See also: `6863f5643d` ("kbuild: allow Clang to find unused static inline functions for W=1 build") Fixes: `68e3056785` ("lockdep: Adjust check_redundant() for recursive read change") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20250506042049.50060-2-boqun.feng@gmail.com	2025-05-06 18:34:31 +02:00
Steven Rostedt	54c53dfdb6	tracing: Add common_comm to histograms If one wants to trace the name of the task that wakes up a process and pass that to the synthetic events, there's nothing currently that lets the synthetic events do that. Add a "common_comm" to the histogram logic that allows histograms save the current->comm as a variable that can be passed through and added to a synthetic event: # cd /sys/kernel/tracing # echo 's:wake_lat char[] waker; char[] wakee; u64 delta;' >> dynamic_events # echo 'hist:keys=pid:comm=common_comm:ts=common_timestamp.usecs if !(common_flags & 0x18)' > events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:wake_comm=$comm:delta=common_timestamp.usecs-$ts:onmatch(sched.sched_waking).trace(wake_lat,$wake_comm,next_comm,$delta)' > events/sched/sched_switch/trigger The above will create a synthetic trace event that will save both the name of the waker and the wakee but only if the wakeup did not happen in a hard or soft interrupt context. The "common_comm" is used to save the task->comm at the time of the initial event and is passed via the "comm" variable to the second event, and that is saved as the "waker" field in the "wake_lat" synthetic event. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250407154912.3c6c6246@gandalf.local.home Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:37:03 -04:00
Steven Rostedt	7ab0fc61ce	tracing: Move histogram trigger variables from stack to per CPU structure The histogram trigger has three somewhat large arrays on the kernel stack: unsigned long entries[HIST_STACKTRACE_DEPTH]; u64 var_ref_vals[TRACING_MAP_VARS_MAX]; char compound_key[HIST_KEY_SIZE_MAX]; Checking the function event_hist_trigger() stack frame size, it currently uses 816 bytes for its stack frame due to these variables! Instead, allocate a per CPU structure that holds these arrays for each context level (normal, softirq, irq and NMI). That is, each CPU will have 4 of these structures. This will be allocated when the first histogram trigger is enabled and freed when the last is disabled. When the histogram callback triggers, it will request this structure. The request will disable preemption, get the per CPU structure at the index of the per CPU variable, and increment that variable. The callback will use the arrays in this structure to perform its work and then release the structure. That in turn will simply decrement the per CPU index and enable preemption. Moving the variables from the kernel stack to the per CPU structure brings the stack frame of event_hist_trigger() down to just 112 bytes. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Link: https://lore.kernel.org/20250407123851.74ea8d58@gandalf.local.home Fixes: `067fe038e7` ("tracing: Add variable reference handling to hist triggers") Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:36:11 -04:00
Steven Rostedt	872a0d90c1	tracing: Always use memcpy() in histogram add_to_key() The add_to_key() function tests if the key is a string or some data. If it's a string it does some further calculations of the string size (still truncating it to the max size it can be), and calls strncpy(). If the key isn't as string it calls memcpy(). The interesting point is that both use the exact same parameters: strncpy(compound_key + key_field->offset, (char *)key, size); } else memcpy(compound_key + key_field->offset, key, size); As strncpy() is being used simply as a memcpy() for a string, and since strncpy() is deprecated, just call memcpy() for both memory and string keys. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/20250403210637.1c477d4a@gandalf.local.home Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:35:34 -04:00
Steven Rostedt	3e4b37160b	tracing: Show preempt and irq events callsites from the offsets in field print When the "fields" option is set in a trace instance, it ignores the "print fmt" portion of the trace event and just prints the raw fields defined by the TP_STRUCT__entry() of the TRACE_EVENT() macro. The preempt_disable/enable and irq_disable/enable events record only the caller offset from _stext to save space in the ring buffer. Even though the "fields" option only prints the fields, it also tries to print what they represent too, which includes function names. Add a check in the output of the event field printing to see if the field name is "caller_offs" or "parent_offs" and then print the function at the offset from _stext of that field. Instead of just showing: irq_disable: caller_offs=0xba634d (12215117) parent_offs=0x39d10e2 (60625122) Show: irq_disable: caller_offs=trace_hardirqs_off.part.0+0xad/0x130 0xba634d (12215117) parent_offs=_raw_spin_lock_irqsave+0x62/0x70 0x39d10e2 (60625122) Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250506105131.4b6089a9@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:34:52 -04:00
Steven Rostedt	dc6a49d4cd	tracing: Adjust addresses for printing out fields Add adjustments to the values of the "fields" output if the buffer is a persistent ring buffer to adjust the addresses to both the kernel core and kernel modules if they match a module in the persistent memory and that module is also loaded. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250325185619.54b85587@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:32:29 -04:00
Steven Rostedt	00d872dd54	tracing: Only return an adjusted address if it matches the kernel address The trace_adjust_address() will take a given address and examine the persistent ring buffer to see if the address matches a module that is listed there. If it does not, it will just adjust the value to the core kernel delta. But if the address was for something that was not part of the core kernel text or data it should not be adjusted. Check the result of the adjustment and only return the adjustment if it lands in the current kernel text or data. If not, return the original address. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250506102300.0ba2f9e0@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:31:45 -04:00
Steven Rostedt	531ee10b43	tracing: Show function names when possible when listing fields When the "fields" option is enabled, the "print fmt" of the trace event is ignored and only the fields are printed. But some fields contain function pointers. Instead of just showing the hex value in this case, show the function name when possible: Instead of having: # echo 1 > options/fields # cat trace [..] kmem_cache_free: call_site=0xffffffffa9afcf31 (-1448095951) ptr=0xffff888124452910 (-131386736039664) name=kmemleak_object Have it output: kmem_cache_free: call_site=rcu_do_batch+0x3d1/0x14a0 (-1768960207) ptr=0xffff888132ea5ed0 (854220496) name=kmemleak_object Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250325213919.624181915@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:30:56 -04:00
Steven Rostedt	e3223e1e9a	tracing: Update function trace addresses with module addresses Now that module addresses are saved in the persistent ring buffer, their addresses can be used to adjust the address in the persistent ring buffer to the address of the module that is currently loaded. Instead of blindly using the text_delta that only works for core kernel code, call the trace_adjust_address() that will see if the address matches an address saved in the persistent ring buffer, and then uses that against the matching module if it is loaded. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250506111648.5df7f3ec@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-06 11:30:17 -04:00
Christian Brauner	db56723cea	pidfs: detect refcount bugs Now that we have pidfs_{get,register}_pid() that needs to be paired with pidfs_put_pid() it's possible that someone pairs them with put_pid(). Thus freeing struct pid while it's still used by pidfs. Notice when that happens. I'll also add a scheme to detect invalid uses of pidfs_get_pid() and pidfs_put_pid() later. Link: https://lore.kernel.org/20250506-uferbereich-guttun-7c8b1a0a431f@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-05-06 13:59:00 +02:00

... 5 6 7 8 9 ...

48360 Commits