Commit Graph

1444030 Commits

Author SHA1 Message Date
Zhan Xusheng
b6eee96843 sched/fair: Fix overflow in vruntime_eligible()
Zhan Xusheng reported running into sporadic a s64 mult overflow in
vruntime_eligible().

When constructing a worst case scenario:

If you have cgroups, then you can have an entity of weight 2 (per
calc_group_shares()), and its vlag should then be bounded by: (slice+TICK_NSEC)
* NICE_0_LOAD, which is around 44 bits as per the comment on entity_key().

The other extreme is 100*NICE_0_LOAD, thus you get:

{key, weight}[] := {
  puny: { (slice + TICK_NSEC) * NICE_0_LOAD, 2               },
  max:  { 0,                                 100*NICE_0_LOAD },
}

The avg_vruntime() would end up being very close to 0 (which is
zero_vruntime), so no real help making that more accurate.

vruntime_eligible(puny) ends up with:

 avg  = 2 * puny.key (+ 0)
 load = 2 + 100 * NICE_0_LOAD

 avg >= puny.key * load

And that is: (slice + TICK_NSEC) * NICE_0_LOAD * NICE_0_LOAD * 100, which will
overflow s64.

Zhan suggested using __builtin_mul_overflow(), however after staring at
compiler output for various architectures using godbolt, it seems that using an
__int128 multiplication often results in better code.

Specifically, a number of architectures already compute the __int128 product to
determine the overflow. Eg. arm64 already has the 'smulh' instruction used. By
explicitly doing an __int128 multiply, it will emit the 'mul; smulh' pattern,
which modern cores can fuse (armv8-a clang-22.1.0). x86_64 has less branches
(no OF handling).

Since Linux has ARCH_SUPPORTS_INT128 to gate __int128 usage, also provide the
__builtin_mul_overflow() variant as a fallback.

[peterz: Changelog and __int128 bits]
Fixes: 556146ce5e ("sched/fair: Avoid overflow in enqueue_entity()")
Reported-by: Zhan Xusheng <zhanxusheng1024@gmail.com>
Closes: https://patch.msgid.link/20260415145742.10359-1-zhanxusheng%40xiaomi.com
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260505103155.GN3102924%40noisy.programming.kicks-ass.net
2026-05-06 17:41:17 +02:00
Thomas Gleixner
e744060076 selftests/rseq: Expand for optimized RSEQ ABI v2
Update the selftests so they are executed for legacy (32 bytes RSEQ region)
and optimized RSEQ ABI v2 mode.

Fixes: d6200245c7 ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224428.009121296%40kernel.org
Cc: stable@vger.kernel.org
2026-05-06 17:41:08 +02:00
Thomas Gleixner
99428157dc rseq: Reenable performance optimizations conditionally
Due to the incompatibility with TCMalloc the RSEQ optimizations and
extended features (time slice extensions) have been disabled and made
run-time conditional.

The original RSEQ implementation, which TCMalloc depends on, registers a 32
byte region (ORIG_RSEG_SIZE). This region has a 32 byte alignment
requirement.

The extension safe newer variant exposes the kernel RSEQ feature size via
getauxval(AT_RSEQ_FEATURE_SIZE) and the alignment requirement via
getauxval(AT_RSEQ_ALIGN). The alignment requirement is that the registered
RSEQ region is aligned to the next power of two of the feature size. The
kernel currently has a feature size of 33 bytes, which means the alignment
requirement is 64 bytes.

The TCMalloc RSEQ region is embedded into a cache line aligned data
structure starting at offset 32 bytes so that bytes 28-31 and the
cpu_id_start field at bytes 32-35 form a 64-bit little endian pointer with
the top-most bit (63 set) to check whether the kernel has overwritten
cpu_id_start with an actual CPU id value, which is guaranteed to not have
the top most bit set.

As this is part of their performance tuned magic, it's a pretty safe
assumption, that TCMalloc won't use a larger RSEQ size.

This allows the kernel to declare that registrations with a size greater
than the original size of 32 bytes, which is the cases since time slice
extensions got introduced, as RSEQ ABI v2 with the following differences to
the original behaviour:

  1) Unconditional updates of the user read only fields (CPU, node, MMCID)
     are removed. Those fields are only updated on registration, task
     migration and MMCID changes.

  2) Unconditional evaluation of the criticial section pointer is
     removed. It's only evaluated when user space was interrupted and was
     scheduled out or before delivering a signal in the interrupted
     context.

  3) The read/only requirement of the ID fields is enforced. When the
     kernel detects that userspace manipulated the fields, the process is
     terminated. This ensures that multiple entities (libraries) can
     utilize RSEQ without interfering.

  4) Todays extended RSEQ feature (time slice extensions) and future
     extensions are only enabled in the v2 enabled mode.

Registrations with the original size of 32 bytes operate in backwards
compatible legacy mode without performance improvements and extended
features.

Unfortunately that also affects users of older GLIBC versions which
register the original size of 32 bytes and do not evaluate the kernel
required size in the auxiliary vector AT_RSEQ_FEATURE_SIZE.

That's the result of the lack of enforcement in the original implementation
and the unwillingness of a single entity to cooperate with the larger
ecosystem for many years.

Implement the required registration changes by restructuring the spaghetti
code and adding the size/version check. Also add documentation about the
differences of legacy and optimized RSEQ V2 mode.

Thanks to Mathieu for pointing out the ORIG_RSEQ_SIZE constraints!

Fixes: d6200245c7 ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.927160119%40kernel.org
Cc: stable@vger.kernel.org
2026-05-06 17:40:27 +02:00
Thomas Gleixner
82f572449c rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode
The optimized RSEQ V2 mode requires that user space adheres to the ABI
specification and does not modify the read-only fields cpu_id_start,
cpu_id, node_id and mm_cid behind the kernel's back.

While the kernel does not rely on these fields, the adherence to this is a
fundamental prerequisite to allow multiple entities, e.g. libraries, in an
application to utilize the full potential of RSEQ without stepping on each
other toes.

Validate this adherence on every update of these fields. If the kernel
detects that user space modified the fields, the application is force
terminated.

Fixes: d6200245c7 ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.845230956%40kernel.org
Cc: stable@vger.kernel.org
2026-05-06 17:40:15 +02:00
Thomas Gleixner
fdf4eb6326 selftests/rseq: Validate legacy behavior
The RSEQ legacy mode behavior requires that the ID fields in the rseq
region are unconditionally updated on every context switch and before
signal delivery even if not required by the ABI specification.

To ensure that this behavior is preserved for legacy users in the future,
add a test which validates that with a sleep() and a signal sent to self.

Provide a run script which prevents GLIBC from registering a RSEQ region,
so that the test can register it's own legacy sized region.

Fixes: 566d8015f7 ("rseq: Avoid CPU/MM CID updates when no event pending")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.764705536%40kernel.org
Cc: stable@vger.kernel.org
2026-05-06 17:39:01 +02:00
Thomas Gleixner
d97cb2ef0b selftests/rseq: Make registration flexible for legacy and optimized mode
rseq_register_current_thread() either uses the glibc registered RSEQ region
or registers it's own region with the legacy size of 32 bytes.

That worked so far, but becomes a problem when the kernel implements a
distinction between legacy and performance optimized behavior based on the
registration size as that does not allow to test both modes with the self
test suite.

Add two arguments to the function. One to enforce that the registration is
not using libc provided mode and one to tell the registration to use the
legacy size and not the kernel advertised size.

Rename it and make the original one a inline wrapper which preserves the
existing behavior.

Fixes: 566d8015f7 ("rseq: Avoid CPU/MM CID updates when no event pending")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.677889423%40kernel.org
Cc: stable@vger.kernel.org
2026-05-05 16:03:11 +02:00
Thomas Gleixner
02b44d943b selftests/rseq: Skip tests if time slice extensions are not available
Don't fail, skip the test if the extensions are not enabled at compile or
runtime.

Fixes: 830969e782 ("selftests/rseq: Implement time slice extension test")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.597838491%40kernel.org
Cc: stable@vger.kernel.org
2026-05-05 16:03:11 +02:00
Thomas Gleixner
b9eac6a9d9 rseq: Revert to historical performance killing behaviour
The recent RSEQ optimization work broke the TCMalloc abuse of the RSEQ ABI
as it not longer unconditionally updates the CPU, node, mm_cid fields,
which are documented as read only for user space. Due to the observed
behavior of the kernel it was possible for TCMalloc to overwrite the
cpu_id_start field for their own purposes and rely on the kernel to update
it unconditionally after each context switch and before signal delivery.

The RSEQ ABI only guarantees that these fields are updated when the data
changes, i.e. the task is migrated or the MMCID of the task changes due to
switching from or to per CPU ownership mode.

The optimization work eliminated the unconditional updates and reduced them
to the documented ABI guarantees, which results in a massive performance
win for syscall, scheduling heavy work loads, which in turn breaks the
TCMalloc expectations.

There have been several options discussed to restore the TCMalloc
functionality while preserving the optimization benefits. They all end up
in a series of hard to maintain workarounds, which in the worst case
introduce overhead for everyone, e.g. in the scheduler.

The requirements of TCMalloc and the optimization work are diametral and
the required work arounds are a maintainence burden. They end up as fragile
constructs, which are blocking further optimization work and are pretty
much guaranteed to cause more subtle issues down the road.

The optimization work heavily depends on the generic entry code, which is
not used by all architectures yet. So the rework preserved the original
mechanism moslty unmodified to keep the support for architectures, which
handle rseq in their own exit to user space loop. That code is currently
optimized out by the compiler on architectures which use the generic entry
code.

This allows to revert back to the original behaviour by replacing the
compile time constant conditions with a runtime condition where required,
which disables the optimization and the dependend time slice extension
feature until the run-time condition can be enabled in the RSEQ
registration code on a per task basis again.

The following changes are required to restore the original behavior, which
makes TCMalloc work again:

  1) Replace the compile time constant conditionals with runtime
     conditionals where appropriate to prevent the compiler from optimizing
     the legacy mode out

  2) Enforce unconditional update of IDs on context switch for the
     non-optimized v1 mode

  3) Enforce update of IDs in the pre signal delivery path for the
     non-optimized v1 mode

  4) Enforce update of IDs in the membarrier(RSEQ) IPI for the
     non-optimized v1 mode

  5) Make time slice and future extensions depend on optimized v2 mode

This brings back the full performance problems, but preserves the v2
optimization code and for generic entry code using architectures also the
TIF_RSEQ optimization which avoids a full evaluation of the exit to user
mode loop in many cases.

Fixes: 566d8015f7 ("rseq: Avoid CPU/MM CID updates when no event pending")
Reported-by: Mathias Stearn <mathias@mongodb.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Closes: https://lore.kernel.org/CAHnCjA25b+nO2n5CeifknSKHssJpPrjnf+dtr7UgzRw4Zgu=oA@mail.gmail.com
Link: https://patch.msgid.link/20260428224427.517051752%40kernel.org
Cc: stable@vger.kernel.org
2026-05-05 16:02:57 +02:00
Thomas Gleixner
010b7723c0 rseq: Don't advertise time slice extensions if disabled
If time slice extensions have been disabled on the kernel command line,
then advertising them in RSEQ flags is wrong.

Adjust the conditionals to reflect reality, fixup the misleading comments
about the gap of these flags and the rseq::flags field.

Fixes: d6200245c7 ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.437059375%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Thomas Gleixner
e9766e6f7d rseq: Protect rseq_reset() against interrupts
rseq_reset() uses memset() to clear the tasks rseq data. That's racy
against membarrier() and preemption.

Guard it with irqsave to cure this.

Fixes: faba9d250e ("rseq: Introduce struct rseq_data")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.353887714%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Thomas Gleixner
2cb68e4512 rseq: Set rseq::cpu_id_start to 0 on unregistration
The RSEQ rework changed that to RSEQ_CPU_UNINITILIZED, which is obviously
incompatible. Revert back to the original behavior.

Fixes: 0f085b4188 ("rseq: Provide and use rseq_set_ids()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.271566313%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Mark Brown
cb48828f06 selftests/rseq: Don't run tests with runner scripts outside of the scripts
The rseq selftests include two runner scripts run_param_test.sh and
run_syscall_errors_test.sh which set up the environment for test binaries
and run them with various parameters. Currently we list these test binaries
in TEST_GEN_PROGS but this results in the kselftest framework running them
directly as well as via the runners, resulting in duplication and spurious
failures when the environment is not correctly set up (eg, if glibc tries
to use rseq).

Move the binaries the runners invoke to TEST_GEN_PROGS_EXTENDED, binaries
listed there are built but not run by the framework.  The param_test
benchmarks are not moved since they are not run by run_param_test.sh.

Fixes: 830969e782 ("selftests/rseq: Implement time slice extension test")

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260423-selftests-rseq-use-runner-v1-1-e13a133754c1@kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Zicheng Qu
3da56dc063 sched/fair: Clear rel_deadline when initializing forked entities
A yield-triggered crash can happen when a newly forked sched_entity
enters the fair class with se->rel_deadline unexpectedly set.

The failing sequence is:

  1. A task is forked while se->rel_deadline is still set.
  2. __sched_fork() initializes vruntime, vlag and other sched_entity
     state, but does not clear rel_deadline.
  3. On the first enqueue, enqueue_entity() calls place_entity().
  4. Because se->rel_deadline is set, place_entity() treats se->deadline
     as a relative deadline and converts it to an absolute deadline by
     adding the current vruntime.
  5. However, the forked entity's deadline is not a valid inherited
     relative deadline for this new scheduling instance, so the conversion
     produces an abnormally large deadline.
  6. If the task later calls sched_yield(), yield_task_fair() advances
     se->vruntime to se->deadline.
  7. The inflated vruntime is then used by the following enqueue path,
     where the vruntime-derived key can overflow when multiplied by the
     entity weight.
  8. This corrupts cfs_rq->sum_w_vruntime, breaks EEVDF eligibility
     calculation, and can eventually make all entities appear ineligible.
     pick_next_entity() may then return NULL unexpectedly, leading to a
     later NULL dereference.

A captured trace shows the effect clearly. Before yield, the entity's
vruntime was around:

  9834017729983308

After yield_task_fair() executed:

  se->vruntime = se->deadline

the vruntime jumped to:

  19668035460670230

and the deadline was later advanced further to:

  19668035463470230

This shows that the deadline had already become abnormally large before
yield_task_fair() copied it into vruntime.

rel_deadline is only meaningful when se->deadline really carries a
relative deadline that still needs to be placed against vruntime. A
freshly forked sched_entity should not inherit or retain this state.
Clear se->rel_deadline in __sched_fork(), together with the other
sched_entity runtime state, so that the first enqueue does not interpret
the new entity's deadline as a stale relative deadline.

Fixes: 82e9d0456e ("sched/fair: Avoid re-setting virtual deadline on 'migrations'")
Analyzed-by: Hui Tang <tanghui20@huawei.com>
Analyzed-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260424071113.1199600-1-quzicheng@huawei.com
2026-04-28 09:19:54 +02:00
Vincent Guittot
ac8e69e693 sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue
Similar to how pick_next_entity() must dequeue delayed entities, so too must
wakeup_preempt_fair(). Any delayed task being found means it is eligible and
hence past the 0-lag point, ready for removal.

Worse, by not removing delayed entities from consideration, it can skew the
preemption decision, with the end result that a short slice wakeup will not
result in a preemption.

                     tip/sched/core  tip/sched/core    +this patch
cyclictest slice  (ms) (default)2.8             8               8
hackbench slice   (ms) (default)2.8            20              20
Total Samples          |    22559           22595           22683
Average           (us) |      157              64( 59%)        59(  8%)
Median (P50)      (us) |       57              57(  0%)        58(- 2%)
90th Percentile   (us) |       64              60(  6%)        60(  0%)
99th Percentile   (us) |     2407              67( 97%)        67(  0%)
99.9th Percentile (us) |     3400            2288( 33%)       727( 68%)
Maximum           (us) |     5037            9252(-84%)      7461( 19%)

Fixes: f12e148892 ("sched/fair: Prepare pick_next_task() for delayed dequeue")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260422093400.319251-1-vincent.guittot@linaro.org
2026-04-28 09:19:54 +02:00
Peter Zijlstra
c5cd6fd75b sched/fair: Fix the negative lag increase fix
Vincent reported that my rework of his original patch lost a little
something.

Specifically it got the return value wrong; it should not compare
against the old se->vlag, but rather against the current value. Since
the thing that matters is if the effective vruntime of an entity is
affected and the thing needs repositioning or not.

Fixes: 059258b0d4 ("sched/fair: Prevent negative lag increase during delayed dequeue")
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20260423094107.GT3102624%40noisy.programming.kicks-ass.net
2026-04-28 09:19:54 +02:00
Linus Torvalds
254f49634e Linux 7.1-rc1 v7.1-rc1 2026-04-26 14:19:00 -07:00
Linus Torvalds
14479877c1 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
 "One more fix for the merge window to avoid a boot hang on
  Raspberry Pi 3B by marking the VEC clk critical so that it
  doesn't get turned off and hang the bus"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: bcm: rpi: Mark VEC clock as CLK_IGNORE_UNUSED
2026-04-26 14:03:20 -07:00
Linus Torvalds
20b64cf870 Merge tag 'tsm-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull PCIe TSP update from Dan Williams:
 "A small update for the TSM core. It is arguably a fix and coming in
  late as I have been offline the past few weeks:

   - Drop class_create() for the 'tsm' class"

* tag 'tsm-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
  virt: coco: change tsm_class to a const struct
2026-04-26 09:51:29 -07:00
Linus Torvalds
b935117fe6 Merge tag 'kbuild-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nicolas Schier:

 - builddeb - avoid recompiles for non-cross-compiles

   Avoid triggering complete rebuilds for non-cross-compile Debian
   package builds by only triggering the rebuild of host tools for
   actual cross-compile builds

 - Never respect CONFIG_WERROR / W=e to fixdep

   Avoid spurious rebuilds of fixdep w/ and w/o -Werror during a single
   kbuild invocation by never respecting CONFIG_WERROR for fixdep

* tag 'kbuild-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
  kbuild: Never respect CONFIG_WERROR / W=e to fixdep
  kbuild: builddeb - avoid recompiles for non-cross-compiles
2026-04-25 17:04:15 -07:00
Linus Torvalds
2ff1bc41ef Merge tag 'power-utilities-2026.04.25' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull power utility updates from Len Brown:
 "x86_energy_perf_policy:
   - Initial SoC Slider support

 turbostat:
   - Display HT siblings in cpu# order
   - Add Module-ID column
   - Print Core-ID and APIC-ID in hex
   - Fix misc bugs"

* tag 'power-utilities-2026.04.25' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power x86_energy_perf_policy: Version 2026.04.25
  tools/power x86_energy_perf_policy.8: Document SoC Slider Options
  tools/power x86_energy_perf_policy: Enhances SoC Slider related checks
  tools/power turbostat: v2026.04.21
  tools/power turbostat: Process HT siblings in CPU order
  tools/power turbostat: Show module_id column
  tools/power turbostat: Print core_id and apic_id in hex
  tools/power turbostat: Cleanup print helper functions
  tools/power turbostat: Fix --cpu-set 1 regression on HT systems
  tools/power turbostat: Fix --cpu-set 0 regression on HT systems
  tools/power turbostat: Fix unrecognized option '-P'
  tools/power turbostat: Fix AMD RAPL regression on big systems
  tools/power/x86: Add SOC slider and platform profile support
2026-04-25 16:58:34 -07:00
Linus Torvalds
211d593314 Merge tag 'rtc-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
 "Subsystem:
   - add data_race() in rtc_dev_poll()

  Drivers:
   - remove i2c_match_id usage
   - abx80x: Disable alarm feature if no interrupt attached
   - ti-k3: support resuming from IO DDR low power mode"

* tag 'rtc-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: abx80x: Disable alarm feature if no interrupt attached
  rtc: ntxec: fix OF node reference imbalance
  rtc: pic32: allow driver to be compiled with COMPILE_TEST
  rtc: ti-k3: Add support to resume from IO DDR low power mode
  rtc: cmos: Use platform_get_irq_optional() in cmos_platform_probe()
  dt-bindings: rtc: add olpc,xo1-rtc to trivial-rtc
  dt-bindings: rtc: sc2731: Add compatible for SC2730
  rtc: add data_race() in rtc_dev_poll()
  rtc: armada38x: zalloc + calloc to single allocation
  dt-bindings: rtc: isl12026: convert to YAML schema
  dt-bindings: rtc: microcrystal,rv3028: Allow to specify vdd-supply
  rtc: max77686: convert to i2c_new_ancillary_device
  dt-bindings: rtc: mpfs-rtc: permit resets
  rtc: rx8025: Remove use of i2c_match_id()
  rtc: rv8803: Remove use of i2c_match_id()
  rtc: rs5c372: Remove use of i2c_match_id()
  rtc: pcf2127: Remove use of i2c_match_id()
  rtc: m41t80: Remove use of i2c_match_id()
  rtc: abx80x: Remove use of i2c_match_id()
2026-04-25 16:39:03 -07:00
Linus Torvalds
1d9f1b5e43 Merge tag 'for-next-tpm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
 "Here are the accumulated fixes for 7.1-rc1 and a single structural
  change worth mentioning separately: Rafael's commit converting tpm_crb
  from ACPI driver to a platform driver"

* tag 'for-next-tpm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: tpm_tis: stop transmit if retries are exhausted
  tpm: tpm_tis: add error logging for data transfer
  tpm: avoid -Wunused-but-set-variable
  tpm: Use kfree_sensitive() to free auth session in tpm_dev_release()
  tpm2-sessions: Fix missing tpm_buf_destroy() in tpm2_read_public()
  tpm: Fix auth session leak in tpm2_get_random() error path
  tpm: i2c: atmel: fix block comment formatting
  tpm_crb: Convert ACPI driver to a platform one
  tpm: Make tcpci_pm_ops variable static const
2026-04-25 16:20:52 -07:00
Len Brown
6112da1ea4 Merge branches 'turbostat' and 'x86_energy_perf_policy' into power-utilities 2026-04-25 14:26:32 -04:00
Len Brown
f1c35c7319 tools/power x86_energy_perf_policy: Version 2026.04.25
Since v2025.11.22:
	Initial SoC Slider support
	SoC Slider is an SoC-wide power/performance policy setting.
	On SoC Slider systems, EPP plays a diminished role.

Whitespace cleanup via: indent -npro -kr -i8 -ts8 -sob -l160 -ss -ncs -cp1

No functional changes

Signed-off-by: Len Brown <len.brown@intel.com>
2026-04-25 14:18:45 -04:00
Len Brown
18c5b9ea4e tools/power x86_energy_perf_policy.8: Document SoC Slider Options
x86_energy_perf_policy accesses the SoC Slider via standard
user/kernel APIs to the processor_thermal_soc_slider driver.

Machines that support SoC Slider largely use it instead of EPP,
which may continue to exist in a diminished role, or vanish entirely.

Signed-off-by: Len Brown <len.brown@intel.com>
2026-04-25 14:16:53 -04:00
Len Brown
25ff5848c0 tools/power x86_energy_perf_policy: Enhances SoC Slider related checks
When processor_thermal_soc_slider is loaded, its slider
and offset modparams are visible.  Check that the driver
actually registered the profile named "SoC Slider" before
reading or writing these modparams.

n.b. This utility allows writing the Slider and Offset modparams
even if the driver policy is not "balanced".  Currently the
processor_thermal_soc_slider consults those modparams
only in "balanced" mode.

Signed-off-by: Len Brown <len.brown@intel.com>
2026-04-25 14:10:13 -04:00
Maíra Canal
522567362b clk: bcm: rpi: Mark VEC clock as CLK_IGNORE_UNUSED
On Raspberry Pi 3B, the VEC clock is used by the VideoCore firmware
display driver, which remains active until the vc4 driver loads and
sends NOTIFY_DISPLAY_DONE. If this clock is disabled during boot, a bus
lockup happens and the firmware becomes unresponsive, causing a complete
system lockup.

Mark the VEC clock with CLK_IGNORE_UNUSED so it survives the unused
clock disablement and remains available until the vc4 driver takes over
display management.

Fixes: 672299736a ("clk: bcm: rpi: Manage clock rate in prepare/unprepare callbacks")
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/r/5f0bec08-f458-4fba-8bf3-06817a100c4c@sirena.org.uk
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patch.msgid.link/20260401111416.562279-2-mcanal@igalia.com
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Brian Masney <bmasney@redhat.com> # Active contributor to clk
Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2026-04-25 10:51:23 -07:00
Linus Torvalds
897d54018c Merge tag 'fbdev-for-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes from Helge Deller:

 - request memory region before use (cobalt_lcdfb, clps711x-fb, hgafb)

 - reference cleanups in failure path (offb, savage)

 - a spelling fix (atyfb)

* tag 'fbdev-for-7.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbdev: hgafb: Request memory region before ioremap
  fbdev: clps711x-fb: Request memory region for MMIO
  fbdev: cobalt_lcdfb: Request memory region
  fbdev: atyfb: Fix spelling mistake "enfore" -> "enforce"
  fbdev: savage: fix probe-path EDID cleanup leaks
  fbdev: offb: fix PCI device reference leak on probe failure
2026-04-25 07:48:33 -07:00
Linus Torvalds
129d6eb266 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
Pull ARM updates from Russell King:

 - fix a race condition handling PG_dcache_clean

 - further cleanups for the fault handling, allowing RT to be enabled

 - fixing nzones validation in adfs filesystem driver

 - fix for module unwinding

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
  ARM: 9463/1: Allow to enable RT
  ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache()
  ARM: 9471/1: module: fix unwind section relocation out of range error
  fs/adfs: validate nzones in adfs_validate_bblk()
  ARM: provide individual is_translation_fault() and is_permission_fault()
  ARM: move FSR fault status definitions before fsr_fs()
  ARM: use BIT() and GENMASK() for fault status register fields
  ARM: move is_permission_fault() and is_translation_fault() to fault.h
  ARM: move vmalloc() lazy-page table population
  ARM: ensure interrupts are enabled in __do_user_fault()
2026-04-25 07:44:26 -07:00
Linus Torvalds
27d128c1cf Merge tag 'trace-ring-buffer-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer fix from Steven Rostedt:

 - Fix accounting of persistent ring buffer rewind

   On boot up, the head page is moved back to the earliest point of the
   saved ring buffer. This is because the ring buffer being read by user
   space on a crash may not save the part it read. Rewinding the head
   page back to the earliest saved position helps keep those events from
   being lost.

   The number of events is also read during boot up and displayed in the
   stats file in the tracefs directory. It's also used for other
   accounting as well. On boot up, the "reader page" is accounted for
   but a rewind may put it back into the buffer and then the reader page
   may be accounted for again.

   Save off the original reader page and skip accounting it when
   scanning the pages in the ring buffer.

* tag 'trace-ring-buffer-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Do not double count the reader_page
2026-04-24 15:17:23 -07:00
Linus Torvalds
f3e3dbcea1 Merge tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:

 - Series for zloop, fixing a variety of issues

 - t10-pi code cleanup

 - Fix for a merge window regression with the bio memory allocation mask

 - Fix for a merge window regression in ublk, caused by an issue with
   the maple tree iteration code at teardown

 - ublk self tests additions

 - Zoned device pgmap fixes

 - Various little cleanups and fixes

* tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (21 commits)
  Revert "floppy: fix reference leak on platform_device_register() failure"
  ublk: avoid unpinning pages under maple tree spinlock
  ublk: refactor common helper ublk_shmem_remove_ranges()
  ublk: fix maple tree lockdep warning in ublk_buf_cleanup
  selftests: ublk: add ublk auto integrity test
  selftests: ublk: enable test_integrity_02.sh on fio 3.42
  selftests: ublk: remove unused argument to _cleanup
  block: only restrict bio allocation gfp mask asked to block
  block/blk-throttle: Add WQ_PERCPU to alloc_workqueue users
  block: Add WQ_PERCPU to alloc_workqueue users
  block: relax pgmap check in bio_add_page for compatible zone device pages
  block: add pgmap check to biovec_phys_mergeable
  floppy: fix reference leak on platform_device_register() failure
  ublk: use unchecked copy helpers for bio page data
  t10-pi: reduce ref tag code duplication
  zloop: remove irq-safe locking
  zloop: factor out zloop_mark_{full,empty} helpers
  zloop: set RQF_QUIET when completing requests on deleted devices
  zloop: improve the unaligned write pointer warning
  zloop: use vfs_truncate
  ...
2026-04-24 15:06:55 -07:00
Linus Torvalds
fa58e6e900 Merge tag 'io_uring-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:

 - Fix for a NOMMU bug with io_uring, where NOMMU doesn't grab page refs
   at mmap time. NOMMU also has entirely broken FOLL_PIN support, yet
   here we are

 - A few fixes covering minor issues introduced in this merge window

 - data race annotation to shut up KCSAN for when io-wq limits are
   applied

 - A nospec addition for direct descriptor file updating. Rest of the
   direct descriptor path already had this, but for some reason the
   update did not. Now they are all the same

 - Various minor defensive changes that claude identified and suggested
   terrible fixes for, turned into actually useful cleanups:

       - Use kvfree() for the imu cache. These can come from kmalloc or
         vmalloc depending on size, but the in-cache ones are capped
         where it's always kmalloc based. Change to kvfree() in the
         cleanup path, making future changes unlikely to mess that up

       - Negative kbuf consumption lengths. Can't happen right now, but
         cqe->res is used directly, which if other codes changes could
         then be an error value

 - Fix for an issue with the futex code, where partial wakes on a
   vectored fuxes would potentially wake the same futex twice, rather
   than move on to the next one. This could confuse an application as it
   would've expected the next futex to have been woken

 - Fix for a bug with ring resizing, where SQEs or CQEs might not have
   been copied correctly if large SQEs or CQEs are used in the ring.
   Application side issue, where SQEs or CQEs might have been lost
   during resize

 - Fix for a bug where EPOLL_URING_WAKE might have been lost, causing a
   multishot poll to not be terminated when it's nested, like it should
   have been

 - Fix for an issue with signed comparison of poll references for the
   slow path

 - Fix for a user struct UAF in the zcrx code

 - Two minor zcrx cleanups

* tag 'io_uring-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring: take page references for NOMMU pbuf_ring mmaps
  io_uring/poll: ensure EPOLL_ONESHOT is propagated for EPOLL_URING_WAKE
  io_uring/zcrx: warn on freelist violations
  io_uring/zcrx: clear RQ headers on init
  io_uring/zcrx: fix user_struct uaf
  io_uring/register: fix ring resizing with mixed/large SQEs/CQEs
  io_uring/futex: ensure partial wakes are appropriately dequeued
  io_uring/rw: add defensive hardening for negative kbuf lengths
  io_uring/rsrc: use kvfree() for the imu cache
  io_uring/rsrc: unify nospec indexing for direct descriptors
  io_uring: fix spurious fput in registered ring path
  io_uring: fix iowq_limits data race in tctx node addition
  io_uring/tctx: mark io_wq as exiting before error path teardown
  io_uring/tctx: check for setup tctx->io_wq before teardown
  io_uring/poll: fix signed comparison in io_poll_get_ownership()
2026-04-24 15:00:54 -07:00
Linus Torvalds
b85900e91c Merge tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Bugfixes:

   - Fix handling of ENOSPC so that if we have to resend writes, they
     are written synchronously

   - SUNRPC RDMA transport fixes from Chuck

   - Several fixes for delegated timestamps in NFSv4.2

   - Failure to obtain a directory delegation should not cause stat() to
     fail with NFSv4

   - Rename was failing to update timestamps when a directory delegation
     is held on NFSv4

   - Ensure we check rsize/wsize after crossing a NFSv4 filesystem
     boundary

   - NFSv4/pnfs:

      - If the server is down, retry the layout returns on reboot

      - Fallback to MDS could result in a short write being incorrectly
        logged

  Cleanups:

   - Use memcpy_and_pad in decode_fh"

* tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (21 commits)
  NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address
  NFS: remove redundant __private attribute from nfs_page_class
  NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes
  NFS: fix writeback in presence of errors
  nfs: use memcpy_and_pad in decode_fh
  NFSv4.1: Apply session size limits on clone path
  NFSv4: retry GETATTR if GET_DIR_DELEGATION failed
  NFS: fix RENAME attr in presence of directory delegations
  pnfs/flexfiles: validate ds_versions_cnt is non-zero
  NFS/blocklayout: print each device used for SCSI layouts
  xprtrdma: Post receive buffers after RPC completion
  xprtrdma: Scale receive batch size with credit window
  xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor
  xprtrdma: Decouple frwr_wp_create from frwr_map
  xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot
  xprtrdma: Avoid 250 ms delay on backlog wakeup
  xprtrdma: Close sendctx get/put race that can block a transport
  nfs: update inode ctime after removexattr operation
  nfs: fix utimensat() for atime with delegated timestamps
  NFS: improve "Server wrote zero bytes" error
  ...
2026-04-24 14:20:03 -07:00
Linus Torvalds
ac2dc6d574 Merge tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
 "We have a series from Alex which extends CephFS client metrics with
  support for per-subvolume data I/O performance and latency tracking
  (metadata operations aren't included) and a good variety of fixes and
  cleanups across RBD and CephFS"

* tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-client:
  ceph: add subvolume metrics collection and reporting
  ceph: parse subvolume_id from InodeStat v9 and store in inode
  ceph: handle InodeStat v8 versioned field in reply parsing
  libceph: Fix slab-out-of-bounds access in auth message processing
  rbd: fix null-ptr-deref when device_add_disk() fails
  crush: cleanup in crush_do_rule() method
  ceph: clear s_cap_reconnect when ceph_pagelist_encode_32() fails
  ceph: only d_add() negative dentries when they are unhashed
  libceph: update outdated comment in ceph_sock_write_space()
  libceph: Remove obsolete session key alignment logic
  ceph: fix num_ops off-by-one when crypto allocation fails
  libceph: Prevent potential null-ptr-deref in ceph_handle_auth_reply()
2026-04-24 13:47:19 -07:00
Linus Torvalds
ff9726d7a0 Merge tag 'ntfs-for-7.1-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs
Pull ntfs updates from Namjae Jeon:

 - Fix potential data leakage by zeroing the portion of the straddle
   block beyond initialized_size when reading non-resident attributes

 - Remove unnecessary zeroing in ntfs_punch_hole() for ranges beyond
   initialized_size, as they are already returned as zeros on read

 - Fix writable check in ntfs_file_mmap_prepare() to correctly handle
   shared mappings using VMA_SHARED_BIT | VMA_MAYWRITE_BIT

 - Use page allocation instead of kmemdup() for IOMAP_INLINE data to
   ensure page-aligned address and avoid BUG trap in
   iomap_inline_data_valid() caused by the page boundary check

 - Add a size check before memory allocation in ntfs_attr_readall() and
   reject overly large attributes

 - Remove unneeded noop_direct_IO from ntfs_aops as it is no longer
   required following the FMODE_CAN_ODIRECT flag

 - Fix seven static analysis warnings reported by Smatch

* tag 'ntfs-for-7.1-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs:
  ntfs: use page allocation for resident attribute inline data
  ntfs: fix mmap_prepare writable check for shared mappings
  ntfs: fix potential 32-bit truncation in ntfs_write_cb()
  ntfs: fix uninitialized variable in ntfs_map_runlist_nolock
  ntfs: delete dead code
  ntfs: add missing error code in ntfs_mft_record_alloc()
  ntfs: fix uninitialized variables in ntfs_ea_set_wsl_inode()
  ntfs: fix uninitialized pointer in ntfs_write_mft_block
  ntfs: fix uninitialized variable in ntfs_write_simple_iomap_begin_non_resident
  ntfs: remove noop_direct_IO from address_space_operations
  ntfs: limit memory allocation in ntfs_attr_readall
  ntfs: not zero out range beyond init in punch_hole
  ntfs: zero out stale data in straddle block beyond initialized_size
2026-04-24 13:40:25 -07:00
Linus Torvalds
bdcb864c71 Merge tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux
Pull 9p updates from Dominique Martinet:

 - 9p access flag fix (cannot change access flag since new mount API implem)

 - some minor cleanup

* tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux:
  9p/trans_xen: replace simple_strto* with kstrtouint
  9p/trans_xen: make cleanup idempotent after dataring alloc errors
  9p: document missing enum values in kernel-doc comments
  9p: fix access mode flags being ORed instead of replaced
  9p: fix memory leak in v9fs_init_fs_context error path
2026-04-24 13:37:26 -07:00
Linus Torvalds
f9569c6ce4 Merge tag 'spdx-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX update from Greg KH:
 "Here is a single SPDX-like change for 7.1-rc1. It explicitly allows
  the use of SPDX-FileCopyrightText which has been used already in many
  files.

  At the same time, update checkpatch to catch any "non allowed" spdx
  identifiers as we don't want to go overboard here.

  This has been in linux-next for a long time with no reported problems"

* tag 'spdx-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  LICENSES: Explicitly allow SPDX-FileCopyrightText
2026-04-24 13:30:54 -07:00
Linus Torvalds
cb4eb6771c Merge tag 'char-misc-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / IIO / and others driver updates from Greg KH:
 "Here is the char/misc/iio and other smaller driver subsystem updates
  for 7.1-rc1. Lots of stuff in here, all tiny, but relevant for the
  different drivers they touch. Major points in here is:

   - the usual large set of new IIO drivers and updates for that
     subsystem (the large majority of this diffstat)

   - lots of comedi driver updates and bugfixes

   - coresight driver updates

   - interconnect driver updates and additions

   - mei driver updates

   - binder (both rust and C versions) updates and fixes

   - lots of other smaller driver subsystem updates and additions

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (405 commits)
  coresight: tpdm: fix invalid MMIO access issue
  mei: me: add nova lake point H DID
  mei: lb: add late binding version 2
  mei: bus: add mei_cldev_uuid
  w1: ds2490: drop redundant device reference
  bus: mhi: host: pci_generic: Add Telit FE912C04 modem support
  mei: csc: wake device while reading firmware status
  mei: csc: support controller with separate PCI device
  mei: convert PCI error to common errno
  mei: trace: print return value of pci_cfg_read
  mei: me: move trace into firmware status read
  mei: fix idle print specifiers
  mei: me: use PCI_DEVICE_DATA macro
  sonypi: Convert ACPI driver to a platform one
  misc: apds990x: fix all kernel-doc warnings
  most: usb: Use kzalloc_objs for endpoint address array
  hpet: Convert ACPI driver to a platform one
  misc: vmw_vmci: Fix spelling mistakes in comments
  parport: Remove completed item from to-do list
  char: remove unnecessary module_init/exit functions
  ...
2026-04-24 13:23:50 -07:00
Linus Torvalds
b2680ba4a2 Merge tag 'spi-fix-v7.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "This is quite a big set of fixes, almost all from Johan Hovold who is
  on an ongoing quest to clean up issues with probe and removal handling
  in drivers.

  There isn't anything too concerning here especially with the
  deregistration stuff which will very rarely get run in production
  systems since this is all platform devices in the SoC on embedded
  hardware, but it's all real issues which should be fixed. There's more
  in flight here.

  We also have a few other minor fixes, one from Felix Gu along the same
  lines as Johan's work and a couple of documentation things"

* tag 'spi-fix-v7.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (23 commits)
  spi: fix controller cleanup() documentation
  spi: fix resource leaks on device setup failure
  spi: axiado: clean up probe return value
  spi: axiado: rename probe error labels
  spi: axiado: fix runtime pm imbalance on probe failure
  spi: orion: clean up probe return value
  spi: orion: fix clock imbalance on registration failure
  spi: orion: fix runtime pm leak on unbind
  spi: imx: fix runtime pm leak on probe deferral
  spi: mpc52xx: fix use-after-free on registration failure
  spi: Fix the error description in the `ptp_sts_word_post` comment
  spi: topcliff-pch: fix use-after-free on unbind
  spi: topcliff-pch: fix controller deregistration
  spi: orion: fix controller deregistration
  spi: mxic: fix controller deregistration
  spi: mpc52xx: fix use-after-free on unbind
  spi: mpc52xx: fix controller deregistration
  spi: cadence-quadspi: fix controller deregistration
  spi: cadence: fix controller deregistration
  spi: mtk-snfi: fix memory leak in probe
  ...
2026-04-24 13:16:36 -07:00
Linus Torvalds
f643998361 Merge tag 'regulator-fix-v7.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "Just one trivial cleanup of the user visible prompts in Kconfig here,
  standardising how we describe Qualcomm"

* tag 'regulator-fix-v7.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: qcom: Unify user-visible "Qualcomm" name
2026-04-24 13:06:25 -07:00
Masami Hiramatsu (Google)
92d5a60672 ring-buffer: Do not double count the reader_page
Since the cpu_buffer->reader_page is updated if there are unwound
pages. After that update, we should skip the page if it is the
original reader_page, because the original reader_page is already
checked.

Cc: stable@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://patch.msgid.link/177701353063.2223789.1471163147644103306.stgit@mhiramat.tok.corp.google.com
Fixes: ca296d32ec ("tracing: ring_buffer: Rewind persistent ring buffer on reboot")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-04-24 15:34:39 -04:00
Linus Torvalds
6e2d43100c Merge tag 'regmap-fix-v7.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
 "There's couple of patches here that came in since my pull request:

   - What is effectively a quirk for shoehorning support for a wider
     range of I2C regmaps on weirdly restricted SMBus controllers

   - One minor fix for a memory leak on in error handling in the dummy
     driver used by the KUnit tests"

* tag 'regmap-fix-v7.1-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: ram: fix memory leaks in __regmap_init_ram() on error
  regmap-i2c: add SMBus byte/word reg16 bus for adapters lacking I2C_FUNC_I2C
2026-04-24 12:11:26 -07:00
Linus Torvalds
d0fc5bf9fe Merge tag 'gpio-fixes-for-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix a regression in gpio-rockchip introduced on older chips during
   the merge window when converting to dynamic GPIO base

 - fix AST2700 debounce selector bit definitions in gpio-aspeed

* tag 'gpio-fixes-for-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: aspeed: fix AST2700 debounce selector bit definitions
  gpio: rockchip: Fix GPIO regression after conversion to dynamic base allocation
2026-04-24 11:59:46 -07:00
Linus Torvalds
1fe93b2a2a Merge tag 'sound-fix-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Here are the rest of small updates for 7.1-rc1. All small fixes mostly
  for device-specific issues or regressions.

  Core:
   - Fix a potential data race in fasync handling

  USB-audio:
   - New device support: Line6 POD HD PRO, NexiGo N930W webcam
   - Fixes for Audio Advantage Micro II SPDIF switch and E-MU sample
     rates
   - Limit UAC2 rate parsing to prevent potential overflows

  HD-Audio:
   - Device-specific quirks for HP, Acer, and Honor laptops
   - Fix for TAS2781 SPI device abnormal sound
   - Move Intel firmware loading into probe work to avoid stalling

  ASoC:
   - New support for TI TAS5832
   - Fixes for SoundWire SDCA/DisCo boolean parsing
   - Driver-specific fixes for Intel SOF, ES8311, RT1320, and PXA2xx

  Misc:
   - Fixes for resource leaks and data races in 6fire, caiaq, als4000,
     and pcmtest drivers"

* tag 'sound-fix-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (41 commits)
  Revert "ALSA: pcmtest: fix reference leak on failed device registration"
  ASoC: tas2781: Add tas5832 support
  ASoC: dt-bindings: ti,tas2781: Add TAS5832 support
  ALSA: usb-audio: Fix Audio Advantage Micro II SPDIF switch
  ALSA: usb-audio: Avoid false E-MU sample-rate notifications
  ASoC: sdw_utils: cs42l43: allow spk component names to be combined
  ASoC: qcom: x1e80100: limit speaker volumes
  ALSA: hda/realtek - Add mute LED support for HP Victus 15-fa2xxx
  ALSA: pcmtest: Fix resource leaks in module init error paths
  ALSA: usb-audio/line6: Add support for POD HD PRO
  ALSA: hda/realtek: Add LED fixup for HP EliteBook 6 G2a Laptops
  ASoC: SDCA: Fix reading of mipi-sdca-control-deferrable
  regmap: sdw-mbq: Allow defers on undeferrable controls
  Revert "ALSA: usb-audio: Add quirk for SmartlinkTechnology M01"
  ALSA: als4000: Fix capture trigger chip->mode race
  ALSA: core: Fix potential data race at fasync handling
  ALSA: hda/tas2781: Fix sound abnormal issue on some SPI device
  ALSA: hda/realtek: add quirk for Acer Nitro 16 AN16-41
  ALSA: caiaq: Fix control_put() result and cache rollback
  ALSA: pcmtest: fix reference leak on failed device registration
  ...
2026-04-24 11:49:20 -07:00
Linus Torvalds
cf950766e9 Merge tag 'drm-fixes-2026-04-24' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm fixes from Dave Airlie:
 "These are the regular fixes that have built up over last couple of
  weeks, all pretty minor and spread all over.

  atomic:
   - raise the vblank timeout to avoid it on virtual drivers
   - fix colorop duplication

  bridge:
   - stm_lvds: state check fix
   - dw-mipi-dsi: bridge reference leak fix

  panel:
   - visionx-rm69299: init fix

  dma-fence:
   - fix sparse warning

  dma-buf:
   - UAF fix

  panthor:
   - mapping fix

  arcgpu:
   - device_node reference leak fix

  nouveau:
   - memory leak in error path fix
   - overflow in reloc path for old hw fix

  hv:
   - Kconfig fix

  v3d:
   - infinite loop fix"

* tag 'drm-fixes-2026-04-24' of https://gitlab.freedesktop.org/drm/kernel:
  drm/nouveau: fix u32 overflow in pushbuf reloc bounds check
  MAINTAINERS: split hisilicon maintenance and add Yongbang Shi for hibmc-drm matainers
  drm/v3d: Reject empty multisync extension to prevent infinite loop
  drm/panel: visionox-rm69299: Make use of prepare_prev_first
  drm/drm_atomic: duplicate colorop states if plane color pipeline in use
  drm/nouveau: fix nvkm_device leak on aperture removal failure
  hv: Select CONFIG_SYSFB only for CONFIG_HYPERV_VMBUS
  dma-fence: Silence sparse warning in dma_fence_describe
  drm/bridge: dw-mipi-dsi: Fix bridge leak when host attach fails
  drm/arcpgu: fix device node leak
  drm/panthor: Fix outdated function documentation
  drm/panthor: Extend VM locked region for remap case to be a superset
  dma-buf: fix UAF in dma_buf_put() tracepoint
  drm/bridge: stm_lvds: Do not fail atomic_check on disabled connector
  drm/atomic: Increase timeout in drm_atomic_helper_wait_for_vblanks()
2026-04-24 11:44:52 -07:00
Linus Torvalds
92c4c9fdc8 Merge tag 'drm-next-2026-04-24' of https://gitlab.freedesktop.org/drm/kernel
Pull drm next fixes from Dave Airlie:
 "This is the first of two fixes for the merge PRs, the other is based
  on 7.0 branch. This mostly AMD fixes, a couple of weeks of backlog
  built up and this weeks. The main complaint I've seen is some boot
  warnings around the FP code handling which this should fix. Otherwise
  a single rcar-du and a single i915 fix.

  amdgpu:
   - SMU 14 fixes
   - Partition fixes
   - SMUIO 15.x fix
   - SR-IOV fixes
   - JPEG fix
   - PSP 15.x fix
   - NBIF fix
   - Devcoredump fixes
   - DPC fix
   - RAS fixes
   - Aldebaran smu fix
   - IP discovery fix
   - SDMA 7.1 fix
   - Runtime pm fix
   - MES 12.1 fix
   - DML2 fixes
   - DCN 4.2 fixes
   - YCbCr fixes
   - Freesync fixes
   - ISM fixes
   - Overlay cursor fix
   - DC FP fixes
   - UserQ locking fixes
   - DC idle state manager fix
   - ASPM fix
   - GPUVM SVM fix
   - DCE 6 fix

  amdkfd:
   - Fix memory clear handling
   - num_of_nodes bounds check fix

  i915:
   - Fix uninitialized variable in the alignment loop [psr]

  rcar-du:
   - fix NULL-ptr crash"

* tag 'drm-next-2026-04-24' of https://gitlab.freedesktop.org/drm/kernel: (75 commits)
  drm/amdkfd: Add upper bound check for num_of_nodes
  drm: rcar-du: Fix crash when no CMM is available
  drm/amd/display: Disable 10-bit truncation and dithering on DCE 6.x
  drm/amdgpu: OR init_pte_flags into invalid leaf PTE updates
  drm/amd: Adjust ASPM support quirk to cover more Intel hosts
  drm/amd/display: Undo accidental fix revert in amdgpu_dm_ism.c
  drm/i915/psr: Init variable to avoid early exit from et alignment loop
  drm/amdgpu: drop userq fence driver refs out of fence process()
  drm/amdgpu/userq: unpin and unref doorbell and wptr outside mutex
  drm/amdgpu/userq: use pm_runtime_resume_and_get and fix err handling
  drm/amdgpu/userq: unmap_helper dont return the queue state
  drm/amdgpu/userq: unmap is to be called before freeing doorbell/wptr bo
  drm/amdgpu/userq: hold root bo lock in caller of input_va_validate
  drm/amdgpu/userq: caller to take reserv lock for vas_list_cleanup
  drm/amdgpu/userq: create_mqd does not need userq_mutex
  drm/amdgpu/userq: dont lock root bo with userq_mutex held
  drm/amdgpu/userq: fix kerneldoc for amdgpu_userq_ensure_ev_fence
  drm/amdgpu/userq: clean the VA mapping list for failed queue creation
  drm/amdgpu/userq: avoid uneccessary locking in amdgpu_userq_create
  drm/amd/display: Fix ISM teardown crash from NULL dc dereference
  ...
2026-04-24 11:33:23 -07:00
Linus Torvalds
892c894b4b Merge tag 'locking-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:

 - Fix ww_mutex regression, which caused hangs/pauses in some DRM drivers

 - Fix rtmutex proxy-rollback bug

* tag 'locking-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/mutex: Fix ww_mutex wait_list operations
  rtmutex: Use waiter::task instead of current in remove_waiter()
2026-04-24 10:14:29 -07:00
Linus Torvalds
8f4e8687c8 Merge tag 'x86-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:

 - Prevent deadlock during shstk sigreturn (Rick Edgecombe)

 - Disable FRED when PTI is forced on (Dave Hansen)

 - Revert a CPA INVLPGB optimization that did not properly handle
   discontiguous virtual addresses (Dave Hansen)

* tag 'x86-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Revert INVLPGB optimization for set_memory code
  x86/cpu: Disable FRED when PTI is forced on
  x86/shstk: Prevent deadlock during shstk sigreturn
2026-04-24 10:05:42 -07:00
Linus Torvalds
feff82eb5f Merge tag 'riscv-for-linus-7.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
 "There is one significant change outside arch/riscv in this pull
  request: the addition of a set of KUnit tests for strlen(), strnlen(),
  and strrchr().

  Otherwise, the most notable changes are to add some RISC-V-specific
  string function implementations, to remove XIP kernel support, to add
  hardware error exception handling, and to optimize our runtime
  unaligned access speed testing.

  A few comments on the motivation for removing XIP support. It's been
  broken in the RISC-V kernel for months. The code is not easy to
  maintain. Furthermore, for XIP support to truly be useful for RISC-V,
  we think that compile-time feature switches would need to be added for
  many of the RISC-V ISA features and microarchitectural properties that
  are currently implemented with runtime patching. No one has stepped
  forward to take responsibility for that work, so many of us think it's
  best to remove it until clear use cases and champions emerge.

  Summary:

   - Add Kunit correctness testing and microbenchmarks for strlen(),
     strnlen(), and strrchr()

   - Add RISC-V-specific strnlen(), strchr(), strrchr() implementations

   - Add hardware error exception handling

   - Clean up and optimize our unaligned access probe code

   - Enable HAVE_IOREMAP_PROT to be able to use generic_access_phys()

   - Remove XIP kernel support

   - Warn when addresses outside the vmemmap range are passed to
     vmemmap_populate()

   - Update the ACPI FADT revision check to warn if it's not at least
     ACPI v6.6, which is when key RISC-V-specific tables were added to
     the specification

   - Increase COMMAND_LINE_SIZE to 2048 to match ARM64, x86, PowerPC,
     etc.

   - Make kaslr_offset() a static inline function, since there's no need
     for it to show up in the symbol table

   - Add KASLR offset and SATP to the VMCOREINFO ELF notes to improve
     kdump support

   - Add Makefile cleanup rule for vdso_cfi copied source files, and add
     a .gitignore for the build artifacts in that directory

   - Remove some redundant ifdefs that check Kconfig macros

   - Add missing SPDX license tag to the CFI selftest

   - Simplify UTS_MACHINE assignment in the RISC-V Makefile

   - Clarify some unclear comments and remove some superfluous comments

   - Fix various English typos across the RISC-V codebase"

* tag 'riscv-for-linus-7.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits)
  riscv: Remove support for XIP kernel
  riscv: Reuse compare_unaligned_access() in check_vector_unaligned_access()
  riscv: Split out compare_unaligned_access()
  riscv: Reuse measure_cycles() in check_vector_unaligned_access()
  riscv: Split out measure_cycles() for reuse
  riscv: Clean up & optimize unaligned scalar access probe
  riscv: lib: add strrchr() implementation
  riscv: lib: add strchr() implementation
  riscv: lib: add strnlen() implementation
  lib/string_kunit: extend benchmarks to strnlen() and chr searches
  lib/string_kunit: add performance benchmark for strlen()
  lib/string_kunit: add correctness test for strrchr()
  lib/string_kunit: add correctness test for strnlen()
  lib/string_kunit: add correctness test for strlen()
  riscv: vdso_cfi: Add .gitignore for build artifacts
  riscv: vdso_cfi: Add clean rule for copied sources
  riscv: enable HAVE_IOREMAP_PROT
  riscv: mm: WARN_ON() for bad addresses in vmemmap_populate()
  riscv: acpi: update FADT revision check to 6.6
  riscv: add hardware error trap handler support
  ...
2026-04-24 10:00:37 -07:00
Linus Torvalds
ff57d59200 Merge tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:

 - Adjust build infrastructure for 32BIT/64BIT

 - Add HIGHMEM (PKMAP and FIX_KMAP) support

 - Show and handle CPU vulnerabilites correctly

 - Batch the icache maintenance for jump_label

 - Add more atomic instructions support for BPF JIT

 - Add more features (e.g. fsession) support for BPF trampoline

 - Some bug fixes and other small changes

* tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (21 commits)
  selftests/bpf: Enable CAN_USE_LOAD_ACQ_STORE_REL for LoongArch
  LoongArch: BPF: Add fsession support for trampolines
  LoongArch: BPF: Introduce emit_store_stack_imm64() helper
  LoongArch: BPF: Support up to 12 function arguments for trampoline
  LoongArch: BPF: Support small struct arguments for trampoline
  LoongArch: BPF: Open code and remove invoke_bpf_mod_ret()
  LoongArch: BPF: Support load-acquire and store-release instructions
  LoongArch: BPF: Support 8 and 16 bit read-modify-write instructions
  LoongArch: BPF: Add the default case in emit_atomic() and rename it
  LoongArch: Define instruction formats for AM{SWAP/ADD}.{B/H} and DBAR
  LoongArch: Batch the icache maintenance for jump_label
  LoongArch: Add flush_icache_all()/local_flush_icache_all()
  LoongArch: Add spectre boundry for syscall dispatch table
  LoongArch: Show CPU vulnerabilites correctly
  LoongArch: Make arch_irq_work_has_interrupt() true only if IPI HW exist
  LoongArch: Use get_random_canary() for stack canary init
  LoongArch: Improve the logging of disabling KASLR
  LoongArch: Align FPU register state to 32 bytes
  LoongArch: Handle CONFIG_32BIT in syscall_get_arch()
  LoongArch: Add HIGHMEM (PKMAP and FIX_KMAP) support
  ...
2026-04-24 09:54:45 -07:00