linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-22 13:15:11 -04:00

Author	SHA1	Message	Date
Paul E. McKenney	05b724655b	rcutorture: Increase visibility of forward-progress hangs This commit adds a few pr_alert() calls to rcutorture's forward-progress testing in order to better diagnose shutdown-time hangs. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:24:38 -08:00
Paul E. McKenney	6f81bd6a4e	rcutorture: Print message before invoking ->cb_barrier() The various ->cb_barrier() functions, for example, rcu_barrier(), sometimes cause rcutorture hangs. But currently, the last console message is the unenlightening "Stopping rcu_torture_stats". This commit therefore prints a message of the form "rcu_torture_cleanup: Invoking rcu_barrier+0x0/0x1e0()" to help point people in the right direction. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:24:38 -08:00
Zqiang	c951587585	rcu: Add per-CPU rcuc task dumps to RCU CPU stall warnings When the rcutree.use_softirq kernel boot parameter is set to zero, all RCU_SOFTIRQ processing is carried out by the per-CPU rcuc kthreads. If these kthreads are being starved, quiescent states will not be reported, which in turn means that the grace period will not end, which can in turn trigger RCU CPU stall warnings. This commit therefore dumps stack traces of stalled CPUs' rcuc kthreads, which can help identify what is preventing those kthreads from running. Suggested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Reviewed-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Zqiang <qiang1.zhang@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:22:17 -08:00
Paul E. McKenney	10c5357874	rcu: Don't deboost before reporting expedited quiescent state Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx before reporting the expedited quiescent state. Under heavy real-time load, this can result in this function being preempted before the quiescent state is reported, which can in turn prevent the expedited grace period from completing. Tim Murray reports that the resulting expedited grace periods can take hundreds of milliseconds and even more than one second, when they should normally complete in less than a millisecond. This was fine given that there were no particular response-time constraints for synchronize_rcu_expedited(), as it was designed for throughput rather than latency. However, some users now need sub-100-millisecond response-time constratints. This patch therefore follows Neeraj's suggestion (seconded by Tim and by Uladzislau Rezki) of simply reversing the two operations. Reported-by: Tim Murray <timmurray@google.com> Reported-by: Joel Fernandes <joelaf@google.com> Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Tim Murray <timmurray@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:41 -08:00
Alison Chaiken	c8b16a6526	rcu: Elevate priority of offloaded callback threads When CONFIG_PREEMPT_RT=y, the rcutree.kthread_prio command-line parameter signals initialization code to boost the priority of rcuc callbacks to the designated value. With the additional CONFIG_RCU_NOCB_CPU=y configuration and an additional rcu_nocbs command-line parameter, the callbacks on the listed cores are offloaded to new rcuop kthreads that are not pinned to the cores whose post-grace-period work is performed. While the rcuop kthreads perform the same function as the rcuc kthreads they offload, the kthread_prio parameter only boosts the priority of the rcuc kthreads. Fix this inconsistency by elevating rcuop kthreads to the same priority as the rcuc kthreads. Signed-off-by: Alison Chaiken <achaiken@aurora.tech> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:25 -08:00
Alison Chaiken	54577e23fa	rcu: Make priority of grace-period thread consistent The priority of RCU grace period threads is set to kthread_prio when they are launched from rcu_spawn_gp_kthread(). The same is not true of rcu_spawn_one_nocb_kthread(). Accordingly, add priority elevation to rcu_spawn_one_nocb_kthread(). Signed-off-by: Alison Chaiken <achaiken@aurora.tech> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:17 -08:00
Alison Chaiken	c8db27dd0e	rcu: Move kthread_prio bounds-check to a separate function Move the bounds-check of the kthread_prio cmdline parameter to a new function in order to faciliate a different callsite. Signed-off-by: Alison Chaiken <achaiken@aurora.tech> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:02 -08:00
Zqiang	4b4399b245	rcu: Create per-cpu rcuc kthreads only when rcutree.use_softirq=0 The per-CPU "rcuc" kthreads are used only by kernels booted with rcutree.use_softirq=0, but they are nevertheless unconditionally created by kernels built with CONFIG_RCU_BOOST=y. This results in "rcuc" kthreads being created that are never actually used. This commit therefore refrains from creating these kthreads unless the kernel is actually booted with rcutree.use_softirq=0. Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Zqiang <qiang1.zhang@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:02 -08:00
Neeraj Upadhyay	eae9f147a4	rcu: Remove unused rcu_state.boost Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:02 -08:00
Neeraj Upadhyay	02e3024175	rcu/nocb: Handle concurrent nocb kthreads creation When multiple CPUs in the same nocb gp/cb group concurrently come online, they might try to concurrently create the same rcuog kthread. Fix this by using nocb gp CPU's spawn mutex to provide mutual exclusion for the rcuog kthread creation code. [ paulmck: Whitespace fixes per kernel test robot feedback. ] Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:02 -08:00
Paul E. McKenney	a47f9f131d	rcu: Mark accesses to boost_starttime The boost_starttime shared variable has conflicting unmarked C-language accesses, which are dangerous at best. This commit therefore adds appropriate marking. This was found by KCSAN. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:19:02 -08:00
Paul E. McKenney	63c564da11	rcu: Mark ->expmask access in synchronize_rcu_expedited_wait() This commit adds a READ_ONCE() to an access to the rcu_node structure's ->expmask field to prevent compiler mischief. Detected by KCSAN. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:05:10 -08:00
Neeraj Upadhyay	4d266c247d	rcu/exp: Fix check for idle context in rcu_exp_handler For PREEMPT_RCU, the rcu_exp_handler() function checks whether the current CPU is in idle, by calling rcu_dynticks_curr_cpu_in_eqs(). However, rcu_exp_handler() is called in IPI handler context. So, it should be checking the idle context using rcu_is_cpu_rrupt_from_idle(). Fix this by using rcu_is_cpu_rrupt_from_idle() instead of rcu_dynticks_curr_cpu_in_eqs(). Non-preempt configuration already uses the correct check. Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-01 17:05:10 -08:00
Paul E. McKenney	da123016ca	rcu-tasks: Fix computation of CPU-to-list shift counts The ->percpu_enqueue_shift field is used to map from the running CPU number to the index of the corresponding callback list. This mapping can change at runtime in response to varying callback load, resulting in varying levels of contention on the callback-list locks. Unfortunately, the initial value of this field is correct only if the system happens to have a power-of-two number of CPUs, otherwise the callbacks from the high-numbered CPUs can be placed into the callback list indexed by 1 (rather than 0), and those index-1 callbacks will be ignored. This can result in soft lockups and hangs. This commit therefore corrects this mapping, adding one to this shift count as needed for systems having odd numbers of CPUs. Fixes: `7a30871b6a` ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection") Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-01-26 13:04:05 -08:00
Linus Torvalds	f56caedaf9	Merge branch 'akpm' (patches from Andrew) Merge misc updates from Andrew Morton: "146 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak, dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap, memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb, userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp, ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits) mm/damon: hide kernel pointer from tracepoint event mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging mm/damon/dbgfs: remove an unnecessary variable mm/damon: move the implementation of damon_insert_region to damon.h mm/damon: add access checking for hugetlb pages Docs/admin-guide/mm/damon/usage: update for schemes statistics mm/damon/dbgfs: support all DAMOS stats Docs/admin-guide/mm/damon/reclaim: document statistics parameters mm/damon/reclaim: provide reclamation statistics mm/damon/schemes: account how many times quota limit has exceeded mm/damon/schemes: account scheme actions that successfully applied mm/damon: remove a mistakenly added comment for a future feature Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk\|rm)_contexts Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning Docs/admin-guide/mm/damon/usage: remove redundant information Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks mm/damon: convert macro functions to static inline functions mm/damon: modify damon_rand() macro to static inline function mm/damon: move damon_rand() definition into damon.h ...	2022-01-15 20:37:06 +02:00
Cai Huoqing	3b9cb4ba4b	rcutorture: make use of the helper function kthread_run_on_cpu() Replace kthread_create_on_node/kthread_bind/wake_up_process() with kthread_run_on_cpu() to simplify the code. Link: https://lkml.kernel.org/r/20211022025711.3673-5-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Cc: Bernard Metzler <bmt@zurich.ibm.com> Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Doug Ledford <dledford@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-01-15 16:30:24 +02:00
Paul E. McKenney	f80fe66c38	Merge branches 'doc.2021.11.30c', 'exp.2021.12.07a', 'fastnohz.2021.11.30c', 'fixes.2021.11.30c', 'nocb.2021.12.09a', 'nolibc.2021.11.30c', 'tasks.2021.12.09a', 'torture.2021.12.07a' and 'torturescript.2021.11.30c' into HEAD doc.2021.11.30c: Documentation updates. exp.2021.12.07a: Expedited-grace-period fixes. fastnohz.2021.11.30c: Remove CONFIG_RCU_FAST_NO_HZ. fixes.2021.11.30c: Miscellaneous fixes. nocb.2021.12.09a: No-CB CPU updates. nolibc.2021.11.30c: Tiny in-kernel library updates. tasks.2021.12.09a: RCU-tasks updates, including update-side scalability. torture.2021.12.07a: Torture-test in-kernel module updates. torturescript.2021.11.30c: Torture-test scripting updates.	2021-12-09 11:38:09 -08:00
Frederic Weisbecker	10d4703154	rcu/nocb: Merge rcu_spawn_cpu_nocb_kthread() and rcu_spawn_one_nocb_kthread() The rcu_spawn_one_nocb_kthread() function is called only from rcu_spawn_cpu_nocb_kthread(). Therefore, inline the former into the latter, saving a few lines of code. Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Tested-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 11:35:16 -08:00
Frederic Weisbecker	d2cf0854d7	rcu/nocb: Allow empty "rcu_nocbs" kernel parameter Allow the rcu_nocbs kernel parameter to be specified just by itself, without specifying any CPUs. This allows systems administrators to use "rcu_nocbs" to specify that none of the CPUs are to be offloaded at boot time, but than any of them may be offloaded at runtime via cpusets. In contrast, if the "rcu_nocbs" or "nohz_full" kernel parameters are not specified at all, then not only are none of the CPUs offloaded at boot, none of them can be offloaded at runtime, either. While in the area, modernize the description of the "rcuo" kthreads' naming scheme. Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Tested-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 11:35:11 -08:00
Frederic Weisbecker	2cf4528d6d	rcu/nocb: Create kthreads on all CPUs if "rcu_nocbs=" or "nohz_full=" are passed In order to be able to (de-)offload any CPU using cpusets in the future, create the NOCB data structures for all possible CPUs. For now this is done only as long as the "rcu_nocbs=" or "nohz_full=" kernel parameters are passed to avoid the unnecessary overhead for most users. Note that the rcuog and rcuoc kthreads are not created until at least one of the corresponding CPUs comes online. This approach avoids the creation of excess kthreads when firmware lies about the number of CPUs present on the system. Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Tested-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 11:35:06 -08:00
Frederic Weisbecker	a81aeaf7a1	rcu/nocb: Optimize kthreads and rdp initialization Currently cpumask_available() is used to prevent from unwanted NOCB initialization. However if neither "rcu_nocbs=" nor "nohz_full=" parameters are passed to a kernel built with CONFIG_CPUMASK_OFFSTACK=n, the initialization path is still taken, running through all sorts of needless operations and iterations on an empty cpumask. Fix this by relying on a real initialization state instead. This also optimizes kthread creation, preventing needless iteration over all online CPUs when the kernel is booted without any offloaded CPUs. Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Tested-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 11:34:37 -08:00
Frederic Weisbecker	8d97039646	rcu/nocb: Prepare nocb_cb_wait() to start with a non-offloaded rdp In order to be able to toggle the offloaded state from cpusets, a nocb kthread will need to be created for all possible CPUs whenever either of the "rcu_nocbs=" or "nohz_full=" parameters are specified. Therefore, the nocb_cb_wait() kthread must be prepared to start running on a de-offloaded rdp. To accomplish this, simply move the sleeping condition to the beginning of the nocb_cb_wait() function, which prevents this kthread from attempting to invoke callbacks before the corresponding CPU is offloaded. Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Tested-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 11:34:30 -08:00
Frederic Weisbecker	2ebc45c44c	rcu/nocb: Remove rcu_node structure from nocb list when de-offloaded The nocb_gp_wait() function iterates over all CPUs in its group, including even those CPUs that have been de-offloaded. This is of course suboptimal, especially if none of the CPUs within the group are currently offloaded. This will become even more of a problem once a nocb kthread is created for all possible CPUs. Therefore use a standard double linked list to link all the offloaded rcu_data structures and safely add or delete these structure as we offload or de-offload them, respectively. Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Tested-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 11:34:07 -08:00
Paul E. McKenney	fd796e4139	rcu-tasks: Use fewer callbacks queues if callback flood ends By default, when lock contention is encountered, the RCU Tasks flavors of RCU switch to using per-CPU queueing. However, if the callback flood ends, per-CPU queueing continues to be used, which introduces significant additional overhead, especially for callback invocation, which fans out a series of workqueue handlers. This commit therefore switches back to single-queue operation if at the beginning of a grace period there are very few callbacks. The definition of "very few" is set by the rcupdate.rcu_task_collapse_lim module parameter, which defaults to 10. This switch happens in two phases, with the first phase causing future callbacks to be enqueued on CPU 0's queue, but with all queues continuing to be checked for grace periods and callback invocation. The second phase checks to see if an RCU grace period has elapsed and if all remaining RCU-Tasks callbacks are queued on CPU 0. If so, only CPU 0 is checked for future grace periods and callback operation. Of course, the return of contention anywhere during this process will result in returning to per-CPU callback queueing. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	2cee0789b4	rcu-tasks: Use separate ->percpu_dequeue_lim for callback dequeueing Decreasing the number of callback queues is a bit tricky because it is necessary to handle callbacks that were queued before the number of queues decreased, but which were not ready to invoke until afterwards. This commit takes a first step in this direction by maintaining a separate ->percpu_dequeue_lim to control callback dequeueing, in addition to the existing ->percpu_enqueue_lim which now controls only enqueueing. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	ab97152f88	rcu-tasks: Use more callback queues if contention encountered The rcupdate.rcu_task_enqueue_lim module parameter allows system administrators to tune the number of callback queues used by the RCU Tasks flavors. However if callback storms are infrequent, it would be better to operate with a single queue on a given system unless and until that system actually needed more queues. Systems not needing more queues can then avoid the overhead of checking the extra queues and especially avoid the overhead of fanning workqueue handlers out to all CPUs to invoke callbacks. This commit therefore switches to using all the CPUs' callback queues if call_rcu_tasks_generic() encounters too much lock contention. The amount of lock contention to tolerate defaults to 100 contended lock acquisitions per jiffy, and can be adjusted using the new rcupdate.rcu_task_contend_lim module parameter. Such switching is undertaken only if the rcupdate.rcu_task_enqueue_lim module parameter is negative, which is its default value (-1). This allows savvy systems administrators to set the number of queues to some known good value and to not have to worry about the kernel doing any second guessing. [ paulmck: Apply feedback from Guillaume Tucker and kernelci. ] Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	3063b33a34	rcu-tasks: Avoid raw-spinlocked wakeups from call_rcu_tasks_generic() If the caller of of call_rcu_tasks(), call_rcu_tasks_rude(), or call_rcu_tasks_trace() holds a raw spinlock, and then if call_rcu_tasks_generic() determines that the grace-period kthread must be awakened, then the wakeup might acquire a normal spinlock while a raw spinlock is held. This results in lockdep splats when the kernel is built with CONFIG_PROVE_RAW_LOCK_NESTING=y. This commit therefore defers the wakeup using irq_work_queue(). It would be nice to directly invoke wakeup when a raw spinlock is not held, but there is currently no way to check for this in all kernels. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	7d13d30bb6	rcu-tasks: Count trylocks to estimate call_rcu_tasks() contention This commit converts the unconditional raw_spin_lock_rcu_node() lock acquisition in call_rcu_tasks_generic() to a trylock followed by an unconditional acquisition if the trylock fails. If the trylock fails, the failure is counted, but the count is reset to zero on each new jiffy. This statistic will be used to determine when to move from a single callback queue to per-CPU callback queues. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	8610b65680	rcu-tasks: Add rcupdate.rcu_task_enqueue_lim to set initial queueing This commit adds a rcupdate.rcu_task_enqueue_lim module parameter that sets the initial number of callback queues to use for the RCU Tasks family of RCU implementations. This parameter allows testing of various fanout values. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	ce9b1c667f	rcu-tasks: Make rcu_barrier_tasks() handle multiple callback queues Currently, rcu_barrier_tasks(), rcu_barrier_tasks_rude(), and rcu_barrier_tasks_trace() simply invoke the corresponding synchronize_rcu_tasks() function. This works because there is only one callback queue. However, there will soon be multiple callback queues. This commit therefore scans the queues currently in use, entraining a callback on each non-empty queue. Sequence numbers and reference counts are used to synchronize this process in a manner similar to the approach taken by rcu_barrier(). Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	d363f833c6	rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations If there is a flood of callbacks, it is necessary to put multiple CPUs to work invoking those callbacks. This commit therefore uses a workqueue-flooding approach to parallelize RCU Tasks callback execution. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:00 -08:00
Paul E. McKenney	57881863ad	rcu-tasks: Abstract invocations of callbacks This commit adds a rcu_tasks_invoke_cbs() function that invokes all ready callbacks on all of the per-CPU lists that are currently in use. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:51:44 -08:00
Paul E. McKenney	4d1114c054	rcu-tasks: Abstract checking of callback lists This commit adds a rcu_tasks_need_gpcb() function that returns an indication of whether another grace period is required, and if no grace period is required, whether there are callbacks that need to be invoked. The function scans all per-CPU lists currently in use. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:51:27 -08:00
Paul E. McKenney	8dd593fddd	rcu-tasks: Add a ->percpu_enqueue_lim to the rcu_tasks structure This commit adds a ->percpu_enqueue_lim field to the rcu_tasks structure. This field contains two to the power of the ->percpu_enqueue_shift field, easing construction of iterators over the per-CPU queues that might contain RCU Tasks callbacks. Such iterators will be introduced in later commits. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:13:55 -08:00
Neeraj Upadhyay	65b629e704	rcu-tasks: Inspect stalled task's trc state in locked state On RCU tasks trace stall, inspect the RCU-tasks-trace specific states of stalled task in locked down state, using try_invoke_ on_locked_down_task(), to get reliable trc state of a non-running stalled task. This was tested using the following command: tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --configs TRACE01 \ --bootargs "rcutorture.torture_type=tasks-tracing rcutorture.stall_cpu=10 \ rcutorture.stall_cpu_block=1 rcupdate.rcu_task_stall_timeout=100" --trust-make As expected, this produced the following console output for running and sleeping tasks. [ 21.520291] INFO: rcu_tasks_trace detected stalls on tasks: [ 21.521292] P85: ... nesting: 1N cpu: 2 [ 21.521966] task:rcu_torture_sta state:D stack:15080 pid: 85 ppid: 2 flags:0x00004000 [ 21.523384] Call Trace: [ 21.523808] __schedule+0x273/0x6e0 [ 21.524428] schedule+0x35/0xa0 [ 21.524971] schedule_timeout+0x1ed/0x270 [ 21.525690] ? del_timer_sync+0x30/0x30 [ 21.526371] ? rcu_torture_writer+0x720/0x720 [ 21.527106] rcu_torture_stall+0x24a/0x270 [ 21.527816] kthread+0x115/0x140 [ 21.528401] ? set_kthread_struct+0x40/0x40 [ 21.529136] ret_from_fork+0x22/0x30 [ 21.529766] 1 holdouts [ 21.632300] INFO: rcu_tasks_trace detected stalls on tasks: [ 21.632345] rcu_torture_stall end. [ 21.633293] P85: . [ 21.633294] task:rcu_torture_sta state:R running task stack:15080 pid: 85 ppid: 2 flags:0x00004000 [ 21.633299] Call Trace: [ 21.633301] ? vprintk_emit+0xab/0x180 [ 21.633306] ? vprintk_emit+0x11a/0x180 [ 21.633308] ? _printk+0x4d/0x69 [ 21.633311] ? __default_send_IPI_shortcut+0x1f/0x40 [ paulmck: Update to new v5.16 task_call_func() name. ] Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:13:55 -08:00
Paul E. McKenney	381a4f3b38	rcu-tasks: Use spin_lock_rcu_node() and friends This commit renames the rcu_tasks_percpu structure's ->cbs_pcpu_lock to ->lock and then uses spin_lock_rcu_node() and friends to acquire and release this lock, preparing for upcoming commits that will spread the grace-period process across multiple CPUs and kthreads. [ paulmck: Apply feedback from kernel test robot. ] Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:12:45 -08:00
Paul E. McKenney	53b541fbdb	rcutorture: Combine n_max_cbs from all kthreads in a callback flood With the addition of multiple callback-flood kthreads, the maximum number of callbacks from any one of those kthreads is reported in the rcutorture run summary. This commit changes this to report the sum of each kthread's maximum number of callbacks in a given callback-flooding episode. Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:36:18 -08:00
Paul E. McKenney	613b00fbe6	rcutorture: Add ability to limit callback-flood intensity The RCU tasks flavors of RCU now need concurrent callback flooding to test their ability to switch between single-queue mode and per-CPU queue mode, but their lack of heavy-duty forward-progress features rules out the use of rcutorture's current callback-flooding code. This commit therefore provides the ability to limit the intensity of the callback floods using a new ->cbflood_max field in the rcu_operations structure. When this field is zero, there is no limit, otherwise, each callback-flood kthread allocates at most ->cbflood_max callbacks. Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:36:18 -08:00
Paul E. McKenney	82e310033d	rcutorture: Enable multiple concurrent callback-flood kthreads This commit converts the rcutorture.fwd_progress module parameter from bool to int, so that it specifies the number of callback-flood kthreads. Values less than zero specify one kthread per CPU, however, the number of kthreads executing concurrently is limited to the number of online CPUs. This commit also reverse the order of the need-resched and callback-flood operations to cause the callback flooding to happen more nearly at the same time. Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:36:18 -08:00
Wander Lairson Costa	5ff7c9f9d7	rcutorture: Avoid soft lockup during cpu stall If we use the module stall_cpu option, we may get a soft lockup warning in case we also don't pass the stall_cpu_block option. Introduce the stall_no_softlockup option to avoid a soft lockup on cpu stall even if we don't use the stall_cpu_block option. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:36:17 -08:00
Li Zhijian	81faa4f6fb	locktorture,rcutorture,torture: Always log error message Unconditionally log messages corresponding to errors. Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Li Zhijian <zhijianx.li@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:36:17 -08:00
Li Zhijian	86e7ed1bd5	rcuscale: Always log error message Unconditionally log messages corresponding to errors. Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Li Zhijian <zhijianx.li@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:36:17 -08:00
Li Zhijian	f71f22b67d	refscale: Add missing '\n' to flush message Add '\n' to macros to flush message for each call. Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Li Zhijian <zhijianx.li@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:36:12 -08:00
Li Zhijian	4feeb9d5f8	refscale: Always log the error message An OOM is a serious error that should be logged even in non-verbose runs. This commit therefore adds an unconditional SCALEOUT_ERRSTRING() macro and uses it instead of VERBOSE_SCALEOUT_ERRSTRING() when reporting an OOM. [ paulmck: Drop do-while from SCALEOUT_ERRSTRING() due to only single statement. ] Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:31:44 -08:00
Paul E. McKenney	9b073de1c7	rcu_tasks: Convert bespoke callback list to rcu_segcblist structure This commit moves from a bespoke head and tail pointer in the rcu_tasks_percpu structure to an rcu_segcblist structure, thus allowing associating the grace-period sequence number with groups of callbacks. This in turn will allow callbacks to be invoked independently on different CPUs. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:57 -08:00
Paul E. McKenney	b14fb4fbbc	rcu-tasks: Convert grace-period counter to grace-period sequence number This commit moves the rcu_tasks structure's ->n_gps grace-period-counter field to a ->task_gp_seq grce-period sequence number in order to enable use of the rcu_segcblist structure for the callback lists. This in turn permits CPUs to lag behind the RCU Tasks grace-period sequence number without suffering long-term slowdowns in callback invocation. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:57 -08:00
Paul E. McKenney	7a30871b6a	rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection This commit introduces a ->percpu_enqueue_shift field to the rcu_tasks structure, and uses it to shift down the CPU number in order to select a rcu_tasks_percpu structure. This field is currently set to a sufficiently large shift count to always select the CPU-0 instance of the rcu_tasks_percpu structure, and later commits will adjust this. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:57 -08:00
Paul E. McKenney	cafafd6776	rcu-tasks: Create per-CPU callback lists Currently, RCU Tasks Trace (as well as the other two flavors of RCU Tasks) use a single global callback list. This works well and is simple, but expected changes in workload will cause this list to become a bottleneck. This commit therefore creates per-CPU callback lists for the various flavors of RCU Tasks, but continues queueing on a single list, namely that of CPU 0. Later commits will dynamically vary the number of lists in use to accommodate dynamic changes in workload. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Tested-by: kernel test robot <beibei.si@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:09 -08:00
Frederic Weisbecker	0598a4d442	rcu/nocb: Don't invoke local rcu core on callback overload from nocb kthread rcu_core() tries to ensure that its self-invocation in case of callbacks overload only happen in softirq/rcuc mode. Indeed it doesn't make sense to trigger local RCU core from nocb_cb kthread since it can execute on a CPU different from the target rdp. Also in case of overload, the nocb_cb kthread simply iterates a new loop of callbacks processing. However the "offloaded" check that aims at preventing misplaced rcu_core() invocations is wrong. First of all that state is volatile and second: softirq/rcuc can execute while the target rdp is offloaded. As a result rcu_core() can be invoked on the wrong CPU while in the process of (de-)offloading. Fix that with moving the rcu_core() self-invocation to rcu_core() itself, irrespective of the rdp offloaded state. Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Valentin Schneider <valentin.schneider@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:24:44 -08:00
Frederic Weisbecker	a554ba2888	rcu: Apply callbacks processing time limit only on softirq Time limit only makes sense when callbacks are serviced in softirq mode because: _ In case we need to get back to the scheduler, cond_resched_tasks_rcu_qs() is called after each callback. _ In case some other softirq vector needs the CPU, the call to local_bh_enable() before cond_resched_tasks_rcu_qs() takes care about them via a call to do_softirq(). Therefore, make sure the time limit only applies to softirq mode. Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Valentin Schneider <valentin.schneider@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:24:44 -08:00

... 15 16 17 18 19 ...

2775 Commits