linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 04:21:09 -04:00

Files

Richard Cheng 510a270554 sched_ext: sync disable_irq_work in bpf_scx_unreg()

When unregistering my self-written scx scheduler, the following panic
occurs:

[  229.923133] Kernel text patching generated an invalid instruction at 0xffff80009bc2c1f8!
[  229.923146] Internal error: Oops - BRK: 00000000f2000100 [#1]  SMP
[  230.077871] CPU: 48 UID: 0 PID: 1760 Comm: kworker/u583:7 Not tainted 7.0.0+ #3 PREEMPT(full)
[  230.086677] Hardware name: NVIDIA GB200 NVL/P3809-BMC, BIOS 02.05.12 20251107
[  230.093972] Workqueue: events_unbound bpf_map_free_deferred
[  230.099675] Sched_ext: invariant_0.1.0_aarch64_unknown_linux_gnu_debug (disabling), task: runnable_at=-174ms
[  230.116843] pc : 0xffff80009bc2c1f8
[  230.120406] lr : dequeue_task_scx+0x270/0x2d0
[  230.217749] Call trace:
[  230.228515]  0xffff80009bc2c1f8 (P)
[  230.232077]  dequeue_task+0x84/0x188
[  230.235728]  sched_change_begin+0x1dc/0x250
[  230.240000]  __set_cpus_allowed_ptr_locked+0x17c/0x240
[  230.245250]  __set_cpus_allowed_ptr+0x74/0xf0
[  230.249701]  ___migrate_enable+0x4c/0xa0
[  230.253707]  bpf_map_free_deferred+0x1a4/0x1b0
[  230.258246]  process_one_work+0x184/0x540
[  230.262342]  worker_thread+0x19c/0x348
[  230.266170]  kthread+0x13c/0x150
[  230.269465]  ret_from_fork+0x10/0x20
[  230.281393] Code: d4202000 d4202000 d4202000 d4202000 (d4202000)
[  230.287621] ---[ end trace 0000000000000000 ]---
[  231.160046] Kernel panic - not syncing: Oops - BRK: Fatal exception in interrupt

The root cause is that the JIT page backing ops->quiescent() is freed
before all callers of that function have stopped.

The expected ordering during teardown is:
    bitmap_zero(sch->has_op) + synchronize_rcu()
        -> guarantees no CPU will ever call sch->ops.* again
    -> only THEN free the BPF struct_ops JIT page

bpf_scx_unreg() is supposed to enforce this order, but after
commit f4a6c506d1 ("sched_ext: Always bounce scx_disable() through
irq_work"), disable_work is no longer queued directly, which turns
kthread_flush_work() into a no-op. Thus, the caller drops the struct_ops
map too early, and the JIT page is poisoned with AARCH64_BREAK_FAULT
before disable_workfn ever executes.

So the subsequent dequeue_task() still sees SCX_HAS_OP(sch, quiescent)
as true and calls ops.quiescent, which lands on the poisoned page and
triggers the BRK panic.

Add a helper scx_flush_disable_work() so that future callers that need
to flush disable_work can use it.
Also amend the calls in scx_root_enable_workfn() and
scx_sub_enable_workfn(), which have a similar pattern in their error
paths.
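
The ordering problem described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel code: every name below (struct model, request_disable(),
unreg_buggy(), and so on) is hypothetical and merely mirrors the two
deferral stages named in this commit message, with the irq_work stage
queueing disable_work, and disable_work clearing has_op.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
	bool irq_work_pending;    /* stage 1: irq_work queued by the disable request */
	bool disable_work_queued; /* stage 2: disable_work, queued by the irq_work handler */
	bool has_op;              /* models SCX_HAS_OP(sch, quiescent) */
	bool jit_page_freed;      /* models the struct_ops JIT page being poisoned */
};

static void model_init(struct model *m)
{
	m->irq_work_pending = false;
	m->disable_work_queued = false;
	m->has_op = true;
	m->jit_page_freed = false;
}

/* A disable request only queues the irq_work; disable_work is not queued yet. */
static void request_disable(struct model *m)
{
	m->irq_work_pending = true;
}

/* Models waiting for the irq_work: its handler queues disable_work. */
static void sync_irq_work(struct model *m)
{
	if (m->irq_work_pending) {
		m->irq_work_pending = false;
		m->disable_work_queued = true;
	}
}

/* Models flushing disable_work: a no-op when the work was never queued. */
static void flush_disable_work(struct model *m)
{
	if (m->disable_work_queued) {
		m->disable_work_queued = false;
		m->has_op = false; /* the disable path clears has_op */
	}
}

static void free_jit_page(struct model *m)
{
	m->jit_page_freed = true;
}

/* True if a later dequeue would still call into the freed page. */
static bool dequeue_hits_freed_page(const struct model *m)
{
	return m->has_op && m->jit_page_freed;
}

/* Flush only: the flush finds nothing queued, so has_op stays set. */
bool unreg_buggy(void)
{
	struct model m;

	model_init(&m);
	request_disable(&m);
	flush_disable_work(&m);
	free_jit_page(&m);
	return dequeue_hits_freed_page(&m);
}

/* Sync the irq_work first, then flush, then free. */
bool unreg_fixed(void)
{
	struct model m;

	model_init(&m);
	request_disable(&m);
	sync_irq_work(&m);
	flush_disable_work(&m);
	free_jit_page(&m);
	return dequeue_hits_freed_page(&m);
}
```

In this model unreg_buggy() returns true (the callback is still reachable
after the page is freed) and unreg_fixed() returns false. The model is
single-threaded and ignores synchronize_rcu(); it only shows why flushing
the second stage alone is not enough.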

Fixes: f4a6c506d1 ("sched_ext: Always bounce scx_disable() through irq_work")
Signed-off-by: Richard Cheng <icheng@nvidia.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

2026-04-24 07:26:48 -10:00

bpf

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2026-04-15 12:59:16 -07:00

cgroup

Merge tag 'cgroup-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

2026-04-15 10:18:49 -07:00

configs

Remove WARN_ALL_UNSEEDED_RANDOM kernel config option

2026-02-23 11:18:48 -08:00

debug

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21 01:02:28 -08:00

dma

dma-debug: suppress cacheline overlap warning when arch has no DMA alignment requirement

2026-03-30 09:41:18 +02:00

entry

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

2026-04-14 16:48:56 -07:00

events

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2026-04-15 12:59:16 -07:00

futex

Merge tag 'locking-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2026-04-14 12:36:25 -07:00

gcov

Convert more 'alloc_obj' cases to default GFP_KERNEL arguments

2026-02-21 20:03:00 -08:00

irq

genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()

2026-04-02 23:03:29 +02:00

kcsan

kcsan: test: Adjust "expect" allocation type for kmalloc_obj

2026-02-26 09:54:08 -08:00

livepatch

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

liveupdate

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2026-04-15 12:59:16 -07:00

locking

Merge tag 'sched-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2026-04-14 13:33:36 -07:00

module

module: Simplify warning on positive returns from module_init()

2026-04-04 00:04:48 +00:00

power

Merge branches 'pm-cpuidle', 'pm-opp' and 'pm-sleep'

2026-04-10 12:37:27 +02:00

printk

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

rcu

Merge tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

2026-04-13 09:36:45 -07:00

sched

sched_ext: sync disable_irq_work in bpf_scx_unreg()

2026-04-24 07:26:48 -10:00

time

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2026-04-15 12:59:16 -07:00

trace

Merge tag 'trace-rv-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

2026-04-15 17:15:18 -07:00

unwind

Convert more 'alloc_obj' cases to default GFP_KERNEL arguments

2026-02-21 20:03:00 -08:00

.gitignore

kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute

2025-06-24 20:30:37 +09:00

acct.c

Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

2026-04-13 14:20:11 -07:00

async.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

audit_fsnotify.c

audit: widen ino fields to u64

2026-03-06 14:31:26 +01:00

audit_tree.c

Convert 'alloc_flex' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

audit_watch.c

audit: widen ino fields to u64

2026-03-06 14:31:26 +01:00

audit.c

audit: handle unknown status requests in audit_receive_msg()

2026-03-10 15:22:43 -04:00

audit.h

audit: widen ino fields to u64

2026-03-06 14:31:26 +01:00

auditfilter.c

audit: fix coding style issues

2026-03-05 22:16:08 -05:00

auditsc.c

audit: widen ino fields to u64

2026-03-06 14:31:26 +01:00

backtracetest.c

…

bounds.c

x86/asm: Remove ANNOTATE_DATA_SPECIAL usage

2025-12-03 16:53:19 +01:00

capability.c

…

cfi.c

cfi: Move BPF CFI types and helpers to generic code

2025-07-31 18:23:53 -07:00

compat.c

…

configs.c

…

context_tracking.c

context_tracking: Remove rcu_task_trace_heavyweight_{enter,exit}()

2026-01-01 16:39:46 +08:00

cpu_pm.c

syscore: Pass context data to callbacks

2025-11-14 10:01:52 +01:00

cpu.c

Merge tag 'spdx-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

2026-02-17 09:46:03 -08:00

crash_core_test.c

crash: add KUnit tests for crash_exclude_mem_range

2025-09-13 17:32:55 -07:00

crash_core.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

crash_dump_dm_crypt.c

crash_dump: don't log dm-crypt key bytes in read_key_from_user_keying

2026-03-10 16:01:48 -07:00

crash_reserve.c

crash: let architecture decide crash memory export to iomem_resource

2025-11-12 10:00:15 -08:00

cred.c

cred: remove unused set_security_override_from_ctx()

2026-01-06 20:52:57 -05:00

delayacct.c

delayacct: fix uapi timespec64 definition

2026-02-08 00:13:32 -08:00

dma.c

…

elfcorehdr.c

…

exec_domain.c

…

exit.c

pid_namespace: avoid optimization of accesses to ->child_reaper

2026-03-20 14:44:25 +01:00

exit.h

…

extable.c

…

fail_function.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

fork.c

Merge tag 'sched_ext-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

2026-04-15 10:54:24 -07:00

freezer.c

freezer: Clarify that only cgroup1 freezer uses PM freezer

2025-10-30 20:10:27 +01:00

gen_kheaders.sh

kheaders: make it possible to override TAR

2025-08-06 10:23:36 +09:00

groups.c

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21 01:02:28 -08:00

hung_task.c

hung_task: add hung_task_sys_info sysctl to dump sys info on task-hung

2025-11-20 14:03:43 -08:00

iomem.c

…

irq_work.c

…

jump_label.c

jump_label: use ATOMIC_INIT() for initialization of .enabled

2026-03-16 13:16:48 +01:00

kallsyms_internal.h

kallsyms: Get rid of kallsyms relative base

2026-01-22 15:58:22 -07:00

kallsyms_selftest.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

kallsyms_selftest.h

…

kallsyms.c

Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

2026-02-12 12:13:01 -08:00

kcmp.c

…

Kconfig.freezer

…

Kconfig.hz

…

Kconfig.kexec

liveupdate: kho: move to kernel/liveupdate

2025-11-27 14:24:33 -08:00

Kconfig.locks

…

Kconfig.preempt

sched: Further restrict the preemption modes

2026-01-08 12:43:57 +01:00

kcov.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

kexec_core.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

kexec_elf.c

…

kexec_file.c

kexec: derive purgatory entry from symbol

2026-01-31 16:16:07 -08:00

kexec_internal.h

kexec: enable CMA based contiguous allocation

2025-08-02 12:01:38 -07:00

kexec.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

kheaders.c

…

kprobes.c

kprobes: Remove unneeded warnings from __arm_kprobe_ftrace()

2026-03-13 23:15:26 +09:00

kstack_erase.c

sysctl: remove __user qualifier from stack_erasing_sysctl buffer argument

2025-11-27 15:44:53 +01:00

ksyms_common.c

…

ksysfs.c

kernel: ksysfs: initialize kernel_kobj earlier

2026-04-03 19:39:52 +02:00

kthread.c

kthread: consolidate kthread exit paths to prevent use-after-free

2026-02-26 10:45:49 +01:00

latencytop.c

…

Makefile

kcov: Enable context analysis

2026-01-05 16:43:34 +01:00

module_signature.c

module: Give 'enum pkey_id_type' a more specific name

2026-03-24 21:42:37 +00:00

notifier.c

…

nscommon.c

nsfs: tighten permission checks for ns iteration ioctls

2026-02-27 22:00:08 +01:00

nsproxy.c

Merge tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

2026-04-14 19:59:25 -07:00

nstree.c

nstree: tighten permission checks for listing

2026-02-27 22:00:11 +01:00

padata.c

padata: Put CPU offline callback in ONLINE section to allow failure

2026-03-22 11:17:59 +09:00

panic.c

panic: add panic_force_cpu= parameter to redirect panic to a specific CPU

2026-02-03 08:21:26 -08:00

params.c

module: Clean up parse_args() arguments

2026-03-18 21:43:18 +00:00

pid_namespace.c

pid_namespace: allow opening pid_for_children before init was created

2026-03-20 14:44:26 +01:00

pid_sysctl.h

…

pid.c

pid: check init is created first after idr alloc

2026-03-20 14:44:26 +01:00

profile.c

…

ptrace.c

clone: add CLONE_AUTOREAP

2026-03-11 23:14:02 +01:00

range.c

…

reboot.c

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21 01:02:28 -08:00

regset.c

…

relay.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

resource_kunit.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

resource.c

PCI: Align head space better

2026-03-27 10:19:08 -05:00

rseq.c

rseq: slice ext: Ensure rseq feature size differs from original rseq size

2026-02-23 11:19:19 +01:00

scftorture.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

scs.c

scs: fix a wrong parameter in __scs_magic

2025-11-12 10:00:13 -08:00

seccomp.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

signal.c

Merge tag 'kernel-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

2026-04-14 20:28:40 -07:00

smp.c

smp: Use system_percpu_wq instead of system_wq

2026-03-26 17:31:35 +01:00

smpboot.c

sched/smp: Use the SMP version of idle_thread_set_boot_cpu()

2025-06-13 08:47:20 +02:00

smpboot.h

…

softirq.c

softirq: Prepare for deferred hrtimer rearming

2026-02-27 16:40:13 +01:00

stacktrace.c

…

static_call_inline.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

static_call.c

…

stop_machine.c

sched/core: Fix migrate_swap() vs. hotplug

2025-07-01 15:02:03 +02:00

sys_ni.c

rseq: Implement sys_rseq_slice_yield()

2026-01-22 11:11:17 +01:00

sys.c

prctl: cfi: change the branch landing pad prctl()s to be more descriptive

2026-04-04 18:40:58 -06:00

sysctl-test.c

…

sysctl.c

sysctl: fix uninitialized variable in proc_do_large_bitmap

2026-03-26 09:32:19 +01:00

task_work.c

task_work: Fix NMI race condition

2025-10-29 10:29:54 +01:00

taskstats.c

…

torture.c

torture: Avoid modulo-zero error in torture_hrtimeout_ns()

2026-03-30 15:48:14 -04:00

tracepoint.c

Convert 'alloc_flex' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

tsacct.c

tsacct: skip all kernel threads

2026-01-26 19:07:13 -08:00

ucount.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

uid16.c

…

uid16.h

…

umh.c

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21 01:02:28 -08:00

up.c

…

user_namespace.c

Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses

2026-02-22 08:26:33 -08:00

user-return-notifier.c

…

user.c

ns: drop custom reference count initialization for initial namespaces

2025-11-11 10:01:32 +01:00

utsname_sysctl.c

…

utsname.c

Merge tag 'namespace-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

2025-09-29 11:20:29 -07:00

vhost_task.c

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

vmcore_info.c

mm: rename the 'compound_head' field in the 'struct page' to 'compound_info'

2026-04-05 13:53:08 -07:00

watch_queue.c

Convert 'alloc_flex' family to use the new default GFP_KERNEL argument

2026-02-21 17:09:51 -08:00

watchdog_buddy.c

watchdog: fix opencoded cpumask_next_wrap() in watchdog_next_cpu()

2025-07-31 11:28:03 -04:00

watchdog_perf.c

watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency

2026-02-08 00:13:35 -08:00

watchdog.c

watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()

2026-02-08 00:13:34 -08:00

workqueue_internal.h

workqueue: Show in-flight work item duration in stall diagnostics

2026-03-05 07:27:48 -10:00

workqueue.c

Merge tag 'wq-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

2026-04-15 10:32:08 -07:00