sched: Do not call __put_task_struct() on rt if pi_blocked_on is set

With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
from rt_mutex_adjust_prio_chain() could happen in preemptible context and
with a mutex enqueued. That could lead to this sequence:

        rt_mutex_adjust_prio_chain()
          put_task_struct()
            __put_task_struct()
              sched_ext_free()
                spin_lock_irqsave()
                  rtlock_lock() --->  TRIGGERS
                                      lockdep_assert(!current->pi_blocked_on);

This is not a SCHED_EXT bug. The first cleanup function called by
__put_task_struct() is sched_ext_free() and it happens to take a
(RT) spin_lock, which in the scenario described above, would trigger
the lockdep assertion of "!current->pi_blocked_on".

Crystal Wood was able to identify the problem as __put_task_struct()
being called during rt_mutex_adjust_prio_chain(), in the context of
a process with a mutex enqueued.

Instead of adding more complex conditions to decide when to directly
call __put_task_struct() and when to defer the call, unconditionally
resort to the deferred call on PREEMPT_RT to simplify the code.

Fixes: 893cdaaa39 ("sched: avoid false lockdep splat in put_task_struct()")
Suggested-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/aGvTz5VaPFyj0pBV@uudg.org
This commit is contained in:
Luis Claudio R. Goncalves
2025-07-07 11:03:59 -03:00
committed by Peter Zijlstra
parent 7de9d4f946
commit 8671bad873

View File

@@ -131,24 +131,17 @@ static inline void put_task_struct(struct task_struct *t)
return;
/*
* In !RT, it is always safe to call __put_task_struct().
* Under RT, we can only call it in preemptible context.
*/
if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
lock_map_acquire_try(&put_task_map);
__put_task_struct(t);
lock_map_release(&put_task_map);
return;
}
/*
* under PREEMPT_RT, we can't call put_task_struct
* Under PREEMPT_RT, we can't call __put_task_struct
* in atomic context because it will indirectly
* acquire sleeping locks.
* acquire sleeping locks. The same is true if the
* current process has a mutex enqueued (blocked on
* a PI chain).
*
* call_rcu() will schedule delayed_put_task_struct_rcu()
* In !RT, it is always safe to call __put_task_struct().
* Though, in order to simplify the code, resort to the
* deferred call too.
*
* call_rcu() will schedule __put_task_struct_rcu_cb()
* to be called in process context.
*
* __put_task_struct() is called when
@@ -161,7 +154,7 @@ static inline void put_task_struct(struct task_struct *t)
*
* delayed_free_task() also uses ->rcu, but it is only called
* when it fails to fork a process. Therefore, there is no
* way it can conflict with put_task_struct().
* way it can conflict with __put_task_struct().
*/
call_rcu(&t->rcu, __put_task_struct_rcu_cb);
}