Commit Graph

1440385 Commits

Author SHA1 Message Date
Cunlong Li
22572dbcd3 cgroup: rstat: relax NMI guard after switch to try_cmpxchg
Commit 36df6e3dbd ("cgroup: make css_rstat_updated nmi safe") used
this_cpu_cmpxchg() for the lockless insertion, and therefore required
both ARCH_HAVE_NMI_SAFE_CMPXCHG and ARCH_HAS_NMI_SAFE_THIS_CPU_OPS in
the NMI guard: on archs without the latter, this_cpu_cmpxchg() falls
back to "local_irq_save() + plain cmpxchg", and local_irq_save()
cannot mask NMIs.

Commit 3309b63a22 ("cgroup: rstat: use LOCK CMPXCHG in
css_rstat_updated") later replaced this_cpu_cmpxchg() with plain
try_cmpxchg() to fix cross-CPU lockless-list corruption, but left the
NMI guard untouched.  After that switch, css_rstat_updated() no longer
performs any this_cpu_*() RMW operations and only relies on the arch
having NMI-safe cmpxchg, so ARCH_HAS_NMI_SAFE_THIS_CPU_OPS is no
longer required in the guard.

Relax the guard accordingly so that archs which have HAVE_NMI and
ARCH_HAVE_NMI_SAFE_CMPXCHG but not ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
(e.g. sparc, powerpc on PPC64/BOOK3S) can benefit from the existing
CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC path.  Without this, the css
is never queued in NMI on those archs, and the atomics staged by
account_{slab,kmem}_nmi_safe() are not drained by flush_nmi_stats().

Fixes: 3309b63a22 ("cgroup: rstat: use LOCK CMPXCHG in css_rstat_updated")
Signed-off-by: Cunlong Li <shenxiaogll@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-20 09:44:35 -10:00
Qing Ming
8817005efb cgroup/rstat: validate cpu before css_rstat_cpu() access
css_rstat_updated() is exposed as a BPF kfunc and accepts a
caller-provided cpu argument. The function uses cpu for per-cpu rstat
lookups without checking whether it refers to a valid possible CPU.

A BPF iter/cgroup program with CAP_BPF and CAP_PERFMON can pass an
invalid cpu value. On an unfixed UBSCAN_BOUNDS test kernel, cpu ==
0x7fffffff triggers:

  UBSAN: array-index-out-of-bounds in kernel/cgroup/rstat.c:31:9
  index 2147483647 is out of range for type 'long unsigned int [64]'
  Call Trace:
    css_rstat_updated
    bpf_iter_run_prog
    cgroup_iter_seq_show
    bpf_seq_read

Add cpu validation to the BPF-facing css_rstat_updated() kfunc and
move the common implementation to __css_rstat_updated() for in-kernel
callers.

Fixes: a319185be9 ("cgroup: bpf: enable bpf programs to integrate with rstat")
Signed-off-by: Qing Ming <a0yami@mailbox.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-18 09:31:52 -10:00
sunshaojie
345f401666 cgroup/cpuset: Return only actually allocated CPUs during partition invalidation
In update_parent_effective_cpumask() with partcmd_invalidate, the CPUs
to return to the parent are computed as:

    adding = cpumask_and(tmp->addmask, xcpus, parent->effective_xcpus);

where xcpus = user_xcpus(cs) which returns cs->exclusive_cpus (if set)
or cs->cpus_allowed. When exclusive_cpus is not set, user_xcpus(cs) can
contain CPUs that were never actually granted to the partition due to
sibling exclusion in compute_excpus(). Consequently, the invalidation
may return CPUs to the parent that remain in use by sibling partitions,
causing overlapping effective_cpus and triggering the
WARN_ON_ONCE(1) in generate_sched_domains().

Use cs->effective_xcpus instead, which reflects the CPUs actually
granted to this partition.

Reproducer (on a 4-CPU machine):

    cd /sys/fs/cgroup
    mkdir a1 b1

    # a1 becomes partition root with CPUs 0-1
    echo "0-1" > a1/cpuset.cpus
    echo "root" > a1/cpuset.cpus.partition

    # b1 becomes partition root with CPUs 1-2, but sibling exclusion
    # reduces its effective_xcpus to CPU 2 only
    echo "1-2" > b1/cpuset.cpus
    echo "root" > b1/cpuset.cpus.partition

    # b1 changes cpus_allowed to 0-1 -> partition invalidation
    echo "0-1" > b1/cpuset.cpus

    # Expected: CPUs 2-3  (only CPU 2 returned from b1)
    # Actual:   CPUs 1-3  (CPU 0-1 returned, overlapping with a1)
    cat cpuset.cpus.effective

dmesg will also show a WARNING from generate_sched_domains() reporting
overlapping partition root effective_cpus.

Fixes: 2a3602030d ("cgroup/cpuset: Don't invalidate sibling partitions on cpuset.cpus conflict")
Cc: stable@vger.kernel.org # v7.0+
Signed-off-by: sunshaojie <sunshaojie@kylinos.cn>
Tested-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-13 08:54:53 -10:00
Yu Miao
7d8f3158a5 selftests/cgroup: Fix error path leaks in test_percpu_basic
When cg_name_indexed() returns NULL partway through the child creation
loop, the code returned -1 without running cleanup_children and cleanup.
That left the `parent` pathname allocation unreleased and did not remove
child cgroup directories already created under the parent. Fix by jumping
to cleanup_children instead of returning.

When cg_create() fails, `child` (the pathname from cg_name_indexed())
was not freed before cleanup_children. Fix by freeing `child` before
branching to cleanup_children.

Fixes: 90631e1dea ("kselftests: cgroup: add perpcu memory accounting test")
Signed-off-by: Yu Miao <yumiao@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-13 08:40:52 -10:00
Guopeng Zhang
5dd74441cb cgroup/cpuset: Reserve DL bandwidth only for root-domain moves
cpuset_can_attach() currently adds the bandwidth of all migrating
SCHED_DEADLINE tasks to sum_migrate_dl_bw. If the source and destination
cpuset effective CPU masks do not overlap, the whole sum is then
reserved in the destination root domain.

set_cpus_allowed_dl(), however, subtracts bandwidth from the source
root domain only when the affinity change really moves the task between
root domains. A DL task can move between cpusets that are still in the
same root domain, so including that task in sum_migrate_dl_bw can reserve
destination bandwidth without a matching source-side subtraction.

Share the root-domain move test with set_cpus_allowed_dl(). Keep
nr_migrate_dl_tasks counting all migrating deadline tasks for cpuset DL
task accounting, but add to sum_migrate_dl_bw only for tasks that need a
root-domain bandwidth move. Keep using the destination cpuset effective
CPU mask and leave the broader can_attach()/attach() transaction model
unchanged.

Fixes: 2ef269ef1a ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Cc: stable@vger.kernel.org # v6.10+
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Reviewed-by: Waiman Long <longman@redhat.com>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-11 10:27:14 -10:00
Guopeng Zhang
4a39eda5fd cgroup/cpuset: Reset DL migration state on can_attach() failure
cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
state in the destination cpuset while walking the taskset.

If a later task_can_attach() or security_task_setscheduler() check
fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
and does not call cpuset_cancel_attach() for it. The partially
accumulated state is then left behind and can be consumed by a later
attach, corrupting cpuset DL task accounting and pending DL bandwidth
accounting.

Reset the pending DL migration state from the common error exit when
ret is non-zero. Successful can_attach() keeps the state for
cpuset_attach() or cpuset_cancel_attach().

Fixes: 2ef269ef1a ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Cc: stable@vger.kernel.org # v6.10+
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
2026-05-10 22:14:49 -10:00
Hongfu Li
2a3d7256fa selftests/cgroup: Fix string comparison in write_test
Use string comparison (!=) instead of numeric comparison (-ne) for
cpuset values like "0-1".
For example:
$ [[ "0-1" != "2-3" ]] && echo "true" || echo "false"
true
$ [[ "0-1" -ne "2-3" ]] && echo "true" || echo "false"
false

Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-10 15:54:12 -10:00
Hongfu Li
e32e6f0216 selftests/cgroup: Fix cg_read_strcmp() empty string comparison
cg_read_strcmp() allocated a buffer sized to strlen(expected) + 1,
then passed it to read_text() which calls read(fd, buf, size-1).

When comparing against an empty string (""), strlen("") = 0 gives a
1-byte buffer, and read() is asked to read 0 bytes.  The file content
is never actually read, so strcmp("", buf) always returns 0 regardless
of the real content.  This caused cg_test_proc_killed() to always
report the cgroup as empty immediately, making OOM tests pass without
verifying that processes were killed.

Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-10 15:53:44 -10:00
Guopeng Zhang
796ad62204 cgroup/dmem: Return -ENOMEM on failed pool preallocation
get_cg_pool_unlocked() handles allocation failures under dmemcg_lock by
dropping the lock, preallocating a pool with GFP_KERNEL, and retrying the
locked lookup and creation path.

If the fallback allocation fails too, pool remains NULL. Since the loop
condition is while (!pool), the function can keep retrying instead of
propagating the allocation failure to the caller.

Set pool to ERR_PTR(-ENOMEM) when the fallback allocation fails so the
loop exits through the existing common return path. The callers already
handle ERR_PTR() from get_cg_pool_unlocked(), so this restores the
expected error path.

Fixes: b168ed458d ("kernel/cgroup: Add "dmem" memory accounting cgroup")
Cc: stable@vger.kernel.org # v6.14+
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-10 15:43:46 -10:00
Chen Wandun
dde2f938d0 cgroup/cpuset: move PF_EXITING check before __GFP_HARDWALL in cpuset_current_node_allowed()
Since prepare_alloc_pages() unconditionally adds __GFP_HARDWALL for the
fast path when cpusets are enabled, the __GFP_HARDWALL check in
cpuset_current_node_allowed() causes the PF_EXITING escape path to be
skipped on the first allocation attempt.  This makes it unreachable in
the common case, so dying tasks can get stuck in direct reclaim or even
trigger OOM while trying to exit, despite being allowed to allocate from
any node.

Move the PF_EXITING check before __GFP_HARDWALL so that dying tasks
can allocate memory from any node to exit quickly, even when cpusets
are enabled.

Also update the function comment to reflect the actual behavior of
prepare_alloc_pages() and the corrected check ordering.

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-07 11:57:31 -10:00
T.J. Mercier
d8769544bd docs: cgroup-v1: Update charge-commit section
Commit 1d8f136a42 ("memcg/hugetlb: remove memcg hugetlb
try-commit-cancel protocol") removed mem_cgroup_commit_charge() and
mem_cgroup_cancel_charge(), but the docs still refer to those functions.
There is no longer any charge cancellation.

Update the docs to match the code.

Signed-off-by: T.J. Mercier <tjmercier@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-04 11:02:12 -10:00
Tejun Heo
93618edf75 cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated
A chain of commits going back to v7.0 reworked rmdir to satisfy the
controller invariant that a subsystem's ->css_offline() must not run while
tasks are still doing kernel-side work in the cgroup.

[1] d245698d72 ("cgroup: Defer task cgroup unlink until after the task is done switching out")
[2] a72f73c4dd ("cgroup: Don't expose dead tasks in cgroup")
[3] 1b164b876c ("cgroup: Wait for dying tasks to leave on rmdir")
[4] 4c56a8ac68 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
[5] 13e786b64b ("cgroup: Increment nr_dying_subsys_* from rmdir context")

[1] moved task cset unlink from do_exit() to finish_task_switch() so a
task's cset link drops only after the task has fully stopped scheduling.
That made tasks past exit_signals() linger on cset->tasks until their final
context switch, which led to a series of problems as what userspace expected
to see after rmdir diverged from what the kernel needs to wait for. [2]-[5]
tried to bridge that divergence: [2] filtered the exiting tasks from
cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4]
fixed the wait's condition; [5] made nr_dying_subsys_* visible
synchronously.

The cgroup_drain_dying() wait in [3] turned out to be a dead end. When the
rmdir caller is also the reaper of a zombie that pins a pidns teardown (e.g.
host PID 1 systemd reaping orphan pids that were re-parented to it during
the same teardown), rmdir blocks in TASK_UNINTERRUPTIBLE waiting for those
pids to free, the pids can't free because PID 1 is the reaper and it's stuck
in rmdir, and the system A-A deadlocks. No internal lock ordering breaks
this; the wait itself is the bug.

The css killing side that drove the original reorder, however, can be made
cleanly asynchronous: ->css_offline() is already async, run from
css_killed_work_fn() driven by percpu_ref_kill_and_confirm(). The fix is to
make that chain start only after all tasks have left the cgroup. rmdir's
user-visible side then returns as soon as cgroup.procs and friends are
empty, while ->css_offline() still runs only after the cgroup is fully
drained.

Verified by the original reproducer (pidns teardown + zombie reaper, runs
under vng) which hangs vanilla and succeeds here, and by per-commit
deterministic repros for [2], [3], [4], [5] with a boot parameter that
widens the post-exit_signals() window so each state is reliably reachable.
Some stress tests on top of that.

cgroup_apply_control_disable() has the same shape of pre-existing race:
when a controller is disabled via subtree_control, kill_css() ran
synchronously while tasks past exit_signals() could still be linked to
the cgroup's csets, and ->css_offline() could fire before they drained.
This patch preserves the existing synchronous behavior at that call site
(kill_css_sync() + kill_css_finish() back-to-back) and a follow-up patch
will defer kill_css_finish() there using a per-css trigger.

This seems like the right approach and I don't see problems with it. The
changes are somewhat invasive but not excessively so, so backporting to
-stable should be okay. If something does turn out to be wrong, the fallback
is to revert the entire chain ([1]-[5]) and rework in the development branch
instead.

v2: Pin cgrp across the deferred destroy work with explicit
    cgroup_get()/cgroup_put() around queue_work() and the work_fn. v1
    wasn't actually broken (ordered cgroup_offline_wq + queue_work order
    in cgroup_task_dead() saved it) but the explicit ref removes the
    dependency on those non-obvious invariants. Also note the
    pre-existing cgroup_apply_control_disable() race in the description;
    a follow-up will defer kill_css_finish() there.

Fixes: 1b164b876c ("cgroup: Wait for dying tasks to leave on rmdir")
Cc: stable@vger.kernel.org # v7.0+
Reported-and-tested-by: Martin Pitt <martin@piware.de>
Link: https://lore.kernel.org/all/afHNg2VX2jy9bW7y@piware.de/
Link: https://lore.kernel.org/all/35e0670adb4abeab13da2c321582af9f@kernel.org/
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2026-05-04 08:52:26 -10:00
Petr Vaněk
981cd33861 docs: cgroup: fix typo 'protetion' -> 'protection'
Fix a small typo in the description of the memory_hugetlb_accounting
mount option.

Signed-off-by: Petr Vaněk <arkamar@atlas.cz>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-27 07:55:40 -10:00
Petr Malat
13e786b64b cgroup: Increment nr_dying_subsys_* from rmdir context
Incrementing nr_dying_subsys_* in offline_css(), which is executed by
cgroup_offline_wq worker, leads to a race where user can see the value
to be 0 if he reads cgroup.stat after calling rmdir and before the worker
executes. This makes the user wrongly expect resources released by the
removed cgroup to be available for a new assignment.

Increment nr_dying_subsys_* from kill_css(), which is called from the
cgroup_rmdir() context.

Fixes: ab03125268 ("cgroup: Show # of subsystem CSSes in cgroup.stat")
Signed-off-by: Petr Malat <oss@malat.biz>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-23 07:37:40 -10:00
Guopeng Zhang
41d701ddc3 cgroup/cpuset: record DL BW alloc CPU for attach rollback
cpuset_can_attach() allocates DL bandwidth only when migrating
deadline tasks to a disjoint CPU mask, but cpuset_cancel_attach()
rolls back based only on nr_migrate_dl_tasks. This makes the DL
bandwidth alloc/free paths asymmetric: rollback can call dl_bw_free()
even when no dl_bw_alloc() was done.

Rollback also needs to undo the reservation against the same CPU/root
domain that was charged. Record the CPU used by dl_bw_alloc() and use
that state in cpuset_cancel_attach(). If no allocation happened,
dl_bw_cpu stays at -1 and rollback skips dl_bw_free(). If allocation
did happen, bandwidth is returned to the same CPU/root domain.

Successful attach paths are unchanged. This only fixes failed attach
rollback accounting.

Fixes: 2ef269ef1a ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-17 08:57:37 -10:00
cuitao
c802f460dd cgroup/rdma: fix integer overflow in rdmacg_try_charge()
The expression `rpool->resources[index].usage + 1` is computed in int
arithmetic before being assigned to s64 variable `new`. When usage equals
INT_MAX (the default "max" value), the addition overflows to INT_MIN.
This negative value then passes the `new > max` check incorrectly,
allowing a charge that should be rejected and corrupting usage to
negative.

Fix by casting usage to s64 before the addition so the arithmetic is
done in 64-bit.

Fixes: 39d3e7584a ("rdmacg: Added rdma cgroup controller")
Signed-off-by: cuitao <cuitao@kylinos.cn>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-17 07:25:27 -10:00
Edward Adam Davis
a5b98009f1 sched/psi: fix race between file release and pressure write
A potential race condition exists between pressure write and cgroup file
release regarding the priv member of struct kernfs_open_file, which
triggers the uaf reported in [1].

Consider the following scenario involving execution on two separate CPUs:

   CPU0					CPU1
   ====					====
					vfs_rmdir()
					kernfs_iop_rmdir()
					cgroup_rmdir()
					cgroup_kn_lock_live()
					cgroup_destroy_locked()
					cgroup_addrm_files()
					cgroup_rm_file()
					kernfs_remove_by_name()
					kernfs_remove_by_name_ns()
 vfs_write()				__kernfs_remove()
 new_sync_write()			kernfs_drain()
 kernfs_fop_write_iter()		kernfs_drain_open_files()
 cgroup_file_write()			kernfs_release_file()
 pressure_write()			cgroup_file_release()
 ctx = of->priv;
					kfree(ctx);
 					of->priv = NULL;
					cgroup_kn_unlock()
 cgroup_kn_lock_live()
 cgroup_get(cgrp)
 cgroup_kn_unlock()
 if (ctx->psi.trigger)  // here, trigger uaf for ctx, that is of->priv

The cgroup_rmdir() is protected by the cgroup_mutex, it also safeguards
the memory deallocation of of->priv performed within cgroup_file_release().
However, the operations involving of->priv executed within pressure_write()
are not entirely covered by the protection of cgroup_mutex. Consequently,
if the code in pressure_write(), specifically the section handling the
ctx variable executes after cgroup_file_release() has completed, a uaf
vulnerability involving of->priv is triggered.

Therefore, the issue can be resolved by extending the scope of the
cgroup_mutex lock within pressure_write() to encompass all code paths
involving of->priv, thereby properly synchronizing the race condition
occurring between cgroup_file_release() and pressure_write().

And, if an live kn lock can be successfully acquired while executing
the pressure write operation, it indicates that the cgroup deletion
process has not yet reached its final stage; consequently, the priv
pointer within open_file cannot be NULL. Therefore, the operation to
retrieve the ctx value must be moved to a point *after* the live kn
lock has been successfully acquired.

In another situation, specifically after entering cgroup_kn_lock_live()
but before acquiring cgroup_mutex, there exists a different class of
race condition:

CPU0: write memory.pressure               CPU1: write cgroup.pressure=0
===========================		  =============================

kernfs_fop_write_iter()
 kernfs_get_active_of(of)
 pressure_write()
   cgroup_kn_lock_live(memory.pressure)
     cgroup_tryget(cgrp)
     kernfs_break_active_protection(kn)
     ... blocks on cgroup_mutex

                                     	  cgroup_pressure_write()
                                     	  cgroup_kn_lock_live(cgroup.pressure)
                                     	  cgroup_file_show(memory.pressure, false)
                                     	    kernfs_show(false)
                                     	      kernfs_drain_open_files()
                                     	        cgroup_file_release(of)
                                     	          kfree(ctx)
                                     	            of->priv = NULL
                                     	  cgroup_kn_unlock()

   ... acquires cgroup_mutex
   ctx = of->priv;        // may now be NULL
   if (ctx->psi.trigger)  // NULL dereference

Consequently, there is a possibility that of->priv is NULL, the pressure
write needs to check for this.

Now that the scope of the cgroup_mutex has been expanded, the original
explicit cgroup_get/put operations are no longer necessary, this is
because acquiring/releasing the live kn lock inherently executes a
cgroup get/put operation.

[1]
BUG: KASAN: slab-use-after-free in pressure_write+0xa4/0x210 kernel/cgroup/cgroup.c:4011
Call Trace:
 pressure_write+0xa4/0x210 kernel/cgroup/cgroup.c:4011
 cgroup_file_write+0x36f/0x790 kernel/cgroup/cgroup.c:4311
 kernfs_fop_write_iter+0x3b0/0x540 fs/kernfs/file.c:352

Allocated by task 9352:
 cgroup_file_open+0x90/0x3a0 kernel/cgroup/cgroup.c:4256
 kernfs_fop_open+0x9eb/0xcb0 fs/kernfs/file.c:724
 do_dentry_open+0x83d/0x13e0 fs/open.c:949

Freed by task 9353:
 cgroup_file_release+0xd6/0x100 kernel/cgroup/cgroup.c:4283
 kernfs_release_file fs/kernfs/file.c:764 [inline]
 kernfs_drain_open_files+0x392/0x720 fs/kernfs/file.c:834
 kernfs_drain+0x470/0x600 fs/kernfs/dir.c:525

Fixes: 0e94682b73 ("psi: introduce psi monitor")
Reported-by: syzbot+33e571025d88efd1312c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=33e571025d88efd1312c
Tested-by: syzbot+33e571025d88efd1312c@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-17 07:25:09 -10:00
Linus Torvalds
d730905bc3 Merge tag 'mips_7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:

 - Support for Mobileye EyeQ6Lplus

 - Cleanups and fixes

* tag 'mips_7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (30 commits)
  MIPS/mtd: Handle READY GPIO in generic NAND platform data
  MIPS/input: Move RB532 button to GPIO descriptors
  MIPS: validate DT bootargs before appending them
  MIPS: Alchemy: Remove unused forward declaration
  MAINTAINERS: Mobileye: Add EyeQ6Lplus files
  MIPS: config: add eyeq6lplus_defconfig
  MIPS: Add Mobileye EyeQ6Lplus evaluation board dts
  MIPS: Add Mobileye EyeQ6Lplus SoC dtsi
  clk: eyeq: Add Mobileye EyeQ6Lplus OLB
  clk: eyeq: Adjust PLL accuracy computation
  clk: eyeq: Skip post-divisor when computing PLL frequency
  pinctrl: eyeq5: Add Mobileye EyeQ6Lplus OLB
  pinctrl: eyeq5: Use match data
  reset: eyeq: Add Mobileye EyeQ6Lplus OLB
  MIPS: Add Mobileye EyeQ6Lplus support
  dt-bindings: soc: mobileye: Add EyeQ6Lplus OLB
  dt-bindings: mips: Add Mobileye EyeQ6Lplus SoC
  MIPS: dts: loongson64g-package: Switch to Loongson UART driver
  mips: pci-mt7620: rework initialization procedure
  mips: pci-mt7620: add more register init values
  ...
2026-04-17 08:53:23 -07:00
Linus Torvalds
a10e80be63 Merge tag 'alpha-for-v7.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/lindholm/alpha
Pull alpha updates from Magnus Lindholm:
 "One fix to silence pgprot_modify() compiler warnings, and one patch
  adding SECCOMP/SECCOMP_FILTER support together with the syscall and
  ptrace fixes needed for it"

* tag 'alpha-for-v7.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/lindholm/alpha:
  alpha: Define pgprot_modify to silence tautological comparison warnings
  alpha: add support for SECCOMP and SECCOMP_FILTER
2026-04-17 08:34:43 -07:00
Linus Torvalds
01f492e181 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
 "Arm:

   - Add support for tracing in the standalone EL2 hypervisor code,
     which should help both debugging and performance analysis. This
     uses the new infrastructure for 'remote' trace buffers that can be
     exposed by non-kernel entities such as firmware, and which came
     through the tracing tree

   - Add support for GICv5 Per Processor Interrupts (PPIs), as the
     starting point for supporting the new GIC architecture in KVM

   - Finally add support for pKVM protected guests, where pages are
     unmapped from the host as they are faulted into the guest and can
     be shared back from the guest using pKVM hypercalls. Protected
     guests are created using a new machine type identifier. As the
     elusive guestmem has not yet delivered on its promises, anonymous
     memory is also supported

     This is only a first step towards full isolation from the host; for
     example, the CPU register state and DMA accesses are not yet
     isolated. Because this does not really yet bring fully what it
     promises, it is hidden behind CONFIG_ARM_PKVM_GUEST +
     'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is
     created. Caveat emptor

   - Rework the dreaded user_mem_abort() function to make it more
     maintainable, reducing the amount of state being exposed to the
     various helpers and rendering a substantial amount of state
     immutable

   - Expand the Stage-2 page table dumper to support NV shadow page
     tables on a per-VM basis

   - Tidy up the pKVM PSCI proxy code to be slightly less hard to
     follow

   - Fix both SPE and TRBE in non-VHE configurations so that they do not
     generate spurious, out of context table walks that ultimately lead
     to very bad HW lockups

   - A small set of patches fixing the Stage-2 MMU freeing in error
     cases

   - Tighten-up accepted SMC immediate value to be only #0 for host
     SMCCC calls

   - The usual cleanups and other selftest churn

  LoongArch:

   - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel()

   - Add DMSINTC irqchip in kernel support

  RISC-V:

   - Fix steal time shared memory alignment checks

   - Fix vector context allocation leak

   - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()

   - Fix double-free of sdata in kvm_pmu_clear_snapshot_area()

   - Fix integer overflow in kvm_pmu_validate_counter_mask()

   - Fix shift-out-of-bounds in make_xfence_request()

   - Fix lost write protection on huge pages during dirty logging

   - Split huge pages during fault handling for dirty logging

   - Skip CSR restore if VCPU is reloaded on the same core

   - Implement kvm_arch_has_default_irqchip() for KVM selftests

   - Factored-out ISA checks into separate sources

   - Added hideleg to struct kvm_vcpu_config

   - Factored-out VCPU config into separate sources

   - Support configuration of per-VM HGATP mode from KVM user space

  s390:

   - Support for ESA (31-bit) guests inside nested hypervisors

   - Remove restriction on memslot alignment, which is not needed
     anymore with the new gmap code

   - Fix LPSW/E to update the bear (which of course is the breaking
     event address register)

  x86:

   - Shut up various UBSAN warnings on reading module parameter before
     they were initialized

   - Don't zero-allocate page tables that are used for splitting
     hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in
     the page table and thus write all bytes

   - As an optimization, bail early when trying to unsync 4KiB mappings
     if the target gfn can just be mapped with a 2MiB hugepage

  x86 generic:

   - Copy single-chunk MMIO write values into struct kvm_vcpu (more
     precisely struct kvm_mmio_fragment) to fix use-after-free stack
     bugs where KVM would dereference stack pointer after an exit to
     userspace

   - Clean up and comment the emulated MMIO code to try to make it
     easier to maintain (not necessarily "easy", but "easier")

   - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of
     VMX and SVM enabling) as it is needed for trusted I/O

   - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions

   - Immediately fail the build if a required #define is missing in one
     of KVM's headers that is included multiple times

   - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
     exception, mostly to prevent syzkaller from abusing the uAPI to
     trigger WARNs, but also because it can help prevent userspace from
     unintentionally crashing the VM

   - Exempt SMM from CPUID faulting on Intel, as per the spec

   - Misc hardening and cleanup changes

  x86 (AMD):

   - Fix and optimize IRQ window inhibit handling for AVIC; make it
     per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple
     vCPUs have to-be-injected IRQs

   - Clean up and optimize the OSVW handling, avoiding a bug in which
     KVM would overwrite state when enabling virtualization on multiple
     CPUs in parallel. This should not be a problem because OSVW should
     usually be the same for all CPUs

   - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains
     about a "too large" size based purely on user input

   - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION

   - Disallow synchronizing a VMSA of an already-launched/encrypted
     vCPU, as doing so for an SNP guest will crash the host due to an
     RMP violation page fault

   - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped
     queries are required to hold kvm->lock, and enforce it by lockdep.
     Fix various bugs where sev_guest() was not ensured to be stable for
     the whole duration of a function or ioctl

   - Convert a pile of kvm->lock SEV code to guard()

   - Play nicer with userspace that does not enable
     KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6
     as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the
     payload would end up in EXITINFO2 rather than CR2, for example).
     Only set CR2 and DR6 when consumption of the payload is imminent,
     but on the other hand force delivery of the payload in all paths
     where userspace retrieves CR2 or DR6

   - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT
     instead of vmcb02->save.cr2. The value is out of sync after a
     save/restore or after a #PF is injected into L2

   - Fix a class of nSVM bugs where some fields written by the CPU are
     not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so
     are not up-to-date when saved by KVM_GET_NESTED_STATE

   - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE
     and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly
     initialized after save+restore

   - Add a variety of missing nSVM consistency checks

   - Fix several bugs where KVM failed to correctly update VMCB fields
     on nested #VMEXIT

   - Fix several bugs where KVM failed to correctly synthesize #UD or
     #GP for SVM-related instructions

   - Add support for save+restore of virtualized LBRs (on SVM)

   - Refactor various helpers and macros to improve clarity and
     (hopefully) make the code easier to maintain

   - Aggressively sanitize fields when copying from vmcb12, to guard
     against unintentionally allowing L1 to utilize yet-to-be-defined
     features

   - Fix several bugs where KVM botched rAX legality checks when
     emulating SVM instructions. There are remaining issues in that KVM
     doesn't handle size prefix overrides for 64-bit guests

   - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
     instead of somewhat arbitrarily synthesizing #GP (i.e. don't double
     down on AMD's architectural but sketchy behavior of generating #GP
     for "unsupported" addresses)

   - Cache all used vmcb12 fields to further harden against TOCTOU bugs

  x86 (Intel):

   - Drop obsolete branch hint prefixes from the VMX instruction macros

   - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
     register input when appropriate

   - Code cleanups

  guest_memfd:

   - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't
     support reclaim, the memory is unevictable, and there is no storage
     to write back to

  LoongArch selftests:

   - Add KVM PMU test cases

  s390 selftests:

   - Enable more memory selftests

  x86 selftests:

   - Add support for Hygon CPUs in KVM selftests

   - Fix a bug in the MSR test where it would get false failures on
     AMD/Hygon CPUs with exactly one of RDPID or RDTSCP

   - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test
     for a bug where the kernel would attempt to collapse guest_memfd
     folios against KVM's will"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits)
  KVM: x86: use inlines instead of macros for is_sev_*guest
  x86/virt: Treat SVM as unsupported when running as an SEV+ guest
  KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails
  KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper
  KVM: SEV: use mutex guard in snp_handle_guest_req()
  KVM: SEV: use mutex guard in sev_mem_enc_unregister_region()
  KVM: SEV: use mutex guard in sev_mem_enc_ioctl()
  KVM: SEV: use mutex guard in snp_launch_update()
  KVM: SEV: Assert that kvm->lock is held when querying SEV+ support
  KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe"
  KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y
  KVM: SEV: WARN on unhandled VM type when initializing VM
  KVM: LoongArch: selftests: Add PMU overflow interrupt test
  KVM: LoongArch: selftests: Add basic PMU event counting test
  KVM: LoongArch: selftests: Add cpucfg read/write helpers
  LoongArch: KVM: Add DMSINTC inject msi to vCPU
  LoongArch: KVM: Add DMSINTC device support
  LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function
  LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch
  LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch
  ...
2026-04-17 07:18:03 -07:00
Borislav Petkov (AMD)
e55d98e775 x86/CPU: Fix FPDSS on Zen1
Zen1's hardware divider can leave, under certain circumstances, partial
results from previous operations.  Those results can be leaked by
another, attacker thread.

Fix that with a chicken bit.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-04-17 06:04:42 -07:00
Linus Torvalds
43cfbdda5a Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
 "Several fixes:

   - Add missing static const

   - Correct type 1 emulation for VFIO_CHECK_EXTENSION when no-iommu is
     turned on

   - Fix selftest memory leak and syzkaller splat

   - Fix missed -EFAULT in fault reporting write() fops

   - Fix a race where map/unmap with the internal IOVA allocator can
     unmap things it should not"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommufd: Fix a race with concurrent allocation and unmap
  iommufd/selftest: Remove MOCK_IOMMUPT_AMDV1 format
  iommufd: Fix return value of iommufd_fault_fops_write()
  iommufd: update outdated comment for renamed iommufd_hw_pagetable_alloc()
  iommufd/selftest: Fix page leaks in mock_viommu_{init,destroy}
  iommufd: vfio compatibility extension check for noiommu mode
  iommufd: Constify struct dma_buf_attach_ops
2026-04-16 21:21:55 -07:00
Linus Torvalds
87fe97a184 Merge tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl
Pull fwctl updates from Jason Gunthorpe:

 - New fwctl driver for Broadcom RDMA NICs

 - Bug fix for non-modular builds

* tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl:
  fwctl: Fix class init ordering to avoid NULL pointer dereference on device removal
  fwctl/bnxt_fwctl: Add documentation entries
  fwctl/bnxt_fwctl: Add bnxt fwctl device
  fwctl/bnxt_en: Create an aux device for fwctl
  fwctl/bnxt_en: Refactor aux bus functions to be more generic
  fwctl/bnxt_en: Move common definitions to include/linux/bnxt/
2026-04-16 21:15:56 -07:00
Linus Torvalds
8242c709d4 Merge tag 'soc-arm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC ARM code updates from Arnd Bergmann:
 "These are again very minimal updates:

   - A workaround for firmware on Google Nexus 10

   - A fix for early debugging on OMAP1

   - A rework for Microchip SoC configuration

   - Cleanups on OMAP2 an R-Car-Gen2"

* tag 'soc-arm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: omap2: dead code cleanup in kconfig for ARCH_OMAP4
  ARM: OMAP1: Fix DEBUG_LL and earlyprintk on OMAP16XX
  arm64: Kconfig: provide a top-level switch for Microchip platforms
  ARM: shmobile: rcar-gen2: Use of_phandle_args_equal() helper
  ARM: omap: fix all kernel-doc warnings
  ARM: omap2: Replace scnprintf with strscpy in omap3_cpuinfo
  ARM: samsung: exynos5250: Allow CPU1 to boot
2026-04-16 20:45:14 -07:00
Linus Torvalds
231d703058 Merge tag 'soc-defconfig-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC defconfig updates from Arnd Bergmann:
 "As usual, we enable a number of additional device drivers as loadable
  modules, to support the added platforms. The largest change this time
  is for OMAP2/3, which were not that well supported in the generic
  arm32 defconfig.

  The Tegra SoC platforms are now enabled by default in Kconfig when
  ARCH_TEGRA is enabled, which means the defconfig change is done at the
  same time as the Kconfig change here"

* tag 'soc-defconfig-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
  arch/arm: Drop CONFIG_FIRMWARE_EDID from defconfig files
  arm64: defconfig: Enable DP83TG720 PHY driver
  arm64: tegra: defconfig: Drop redundant ARCH_TEGRA_foo_SOC
  ARM: tegra: defconfig: Drop redundant ARCH_TEGRA_foo_SOC
  arm64: defconfig: enable pci-pwrctrl-generic as module
  arm64: defconfig: Enable Lontium LT8713sx driver
  arm64: defconfig: Enable Qualcomm Eliza SoC display clock controller
  arm64: defconfig: enable IPQ5210 RDP504 base configs
  arm64: defconfig: Enable Milos LPASS LPI pinctrl driver
  arm64: defconfig: Enable Kaanapali clock controllers
  arm64: defconfig: Enable configs for Arduino VENTUNO Q
  arm64: defconfig: Enable Qualcomm Eliza basic resource providers
  arm64: defconfig: Enable S5KJN1 camera sensor
  arm64: defconfig: Enable configurations for Toradex Aquila AM69
  arm64: defconfig: remove SENSORS_SA67MCU
  arm64: defconfig: Enable Qualcomm WCD937x headphone codec as module
  arm64: defconfig: Enable QCOMTEE module for QTEE-enabled Qualcomm SoCs
  ARM: shmobile: defconfig: Refresh for v7.0-rc1
  arm: multi_v7_defconfig: Enable more OMAP 3/4 related configs
  ARM: multi_v7_defconfig: omap2plus_defconfig: Enable ITE IT66121 driver
  ...
2026-04-16 20:40:20 -07:00
Linus Torvalds
31b43c079f Merge tag 'soc-drivers-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC driver updates from Arnd Bergmann:
 "The driver updates again are all over the place with many minor fixes
  going into platform specific code. The most notable changes are:

   - Support for Microchip pic64gx system controllers
   - Work on cleaning up devicetree bindings for SoC drivers, and
     converting them into the new format
   - Lots of smaller changes for Qualcomm SoC drivers, including support
     for a number of newly supported chips
   - reset controller API cleanups and a new driver for Cix Sky1
   - Reworks of the Tegra PMC and CBB drivers, along with a change to
     how individual Tegra SoCs get selected in Kconfig and BPMP firmware
     driver updates including a refresh of the ABI header to match the
     version used by firmware
   - STM32 updates to the firewall bus driver and support for the debug
     bus through OP-TEE
   - SCMI firmware driver improvements for reliability, in particular
     for dealing with broken firmware interrupts
   - Memory driver updates for Tegra, and a patch to remove the unused
     Baikal T1 driver"

* tag 'soc-drivers-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (193 commits)
  firmware: arm_ffa: Use the correct buffer size during RXTX_MAP
  firmware: qcom: scm: Allow QSEECOM on Lenovo IdeaCentre Mini X
  clk: spear: fix resource leak in clk_register_vco_pll()
  reset: rzv2h-usb2phy: Add support for VBUS mux controller registration
  reset: rzv2h-usb2phy: Convert to regmap API
  dt-bindings: reset: renesas,rzv2h-usb2phy: Document RZ/G3E USB2PHY reset
  dt-bindings: reset: renesas,rzv2h-usb2phy: Add '#mux-state-cells' property
  soc: microchip: add mpfs gpio interrupt mux driver
  dt-bindings: soc: microchip: document PolarFire SoC's gpio interrupt mux
  gpio: mpfs: Add interrupt support
  soc: qcom: ubwc: add helpers to get programmable values
  soc: qcom: ubwc: add helper to get min_acc length
  firmware: qcom: scm: Register gunyah watchdog device
  soc: qcom: socinfo: Add SoC ID for SA8650P
  dt-bindings: arm: qcom,ids: Add SoC ID for SA8650P
  firmware: qcom: scm: Allow QSEECOM on Mahua CRD
  soc: qcom: wcnss: simplify allocation of req
  soc: qcom: pd-mapper: Add support for Eliza
  soc: qcom: aoss: compare against normalized cooling state
  soc: qcom: llcc: fix v1 SB syndrome register offset
  ...
2026-04-16 20:34:34 -07:00
Linus Torvalds
e65f4718a5 Merge tag 'soc-dt-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC devicetree updates from Arnd Bergmann:
 "A number of SoC platforms are adding modernized variants of their
  already supported chips time, with a total of 12 new SoCs, and two
  older SoC getting removed:

   - Qualcomm Glymur is a compute SoC using 18 Oryon-2 CPU cores
   - Qualcomm Mahua is a variant of Glymur with only 12 CPU cores, but
     largely identical.
   - Qualcomm Eliza is an embeded platform for mobile phone (SM7750) and
     IOT (QC7790S/M) workloads
   - Qualcomm IPQ5210 is a wireless networking SoC using Cortex-A53
     cores
   - Qualcomm apq8084 and ipq806x had only rudimentary support but no
     actual products using them, so they are now gone.
   - Axis ARTPEC-9 is a follow-up to the ARTPEC-8 embedded SoC, using
     the Samsung SoC platform but now with Cortex-A55 cores
   - ARM Zena is a virtual platform in FVP using Cortex-A720AE cores,
     with additional versions planned to be merged in the future.
   - ARM corstone-1000-a320 is a reference platform for IOT, using
     low-end Cortex-A320 cores
   - Microchip LAN9691 is an updated 64-bit variant of the arm32 lan966x
     series of networking SoCs
   - Microchip PIC64GX is an embedded RISC-V chip using SIFIVE U54 CPU
     cores
   - Rockchip RV1103B is the low-end 32-bit single-core vision processor
   - Renesas RZ/G3L (r9a08g046) is an industrial embedded chip using
     Cortex-A55 cores, similar to the G3E and G3S variants we already
     supported.
   - NXP S32N79 is an automotive SoC using Cortex-A78AE cores, a
     significant upgrade from the older S32V and S32G series

  These all come with at least one reference board or an initial product
  using these, in total there are 67 newly added boards. The ones for
  already supported SoCs are:

   - Two more Aspeed BMC based boards
   - Three older tablets based on 32-bit OMAP4 and Exynos5 SoCs
   - One Set-top-box based on Allwinner H6
   - 22 additional industrial/embedded boards using 64-bit NXP i.MX8M or
     i.MX9 SoCs
   - 20 Qualcomm SoC based machines across all possible markets:
     workstation, gaming, laptop, phone, networking, reference, ...
   - Three more Rockchips rk35xx based boards
   - Four variants of the Toradex Verdin using TI AM62

  Other notable bits are:

   - A cleanup for the 32-bit Tegra paz00 board moved the last board
     specific code on Tegra into equivalent dts syntax.
   - There continues to be a significant number of fixes for static
     checking of dtc syntax, but it feels like this is slowing down,
     hopefully getting into a state where most known issues are
     addressed
   - Additional hardware support for many existing boards across SoC
     families, notably Qualcomm, Broadcom, i.MX2, i.MX6, Rockchips,
     STM32, Mediatek, Tegra, TI and Microchip"

* tag 'soc-dt-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (841 commits)
  arm64: dts: ti: k3: Use memory-region-names for r5f
  ARM: dts: imx: Add DT overlays for DH i.MX6 DHCOM SoM and boards
  ARM: dts: imx6sx: remove fallback compatible string fsl,imx28-lcdif
  ARM: dts: imx25: rename node name tcq to touchscreen
  ARM: dts: imx: b850v3: Disable unused usdhc4
  ARM: dts: imx: b850v3: Define GPIO line names
  ARM: dts: imx: b850v3: Use alphabetical sorting
  ARM: dts: imx: bx50v3: Configure phy-mode to eliminate a warning
  ARM: dts: imx: bx50v3: Configure switch PHY max-speed to 100Mbps
  ARM: dts: imx7ulp: Add CPU clock and OPP table support
  ARM: dts: imx7-mba7: Deassert BOOT_EN after boot
  ARM: dts: tqma7: add boot phase properties
  ARM: dts: imx7s: add boot phase properties
  ARM: dts: tqma6ul[l]: correct spelling of TQ-Systems
  ARM: dts: mba6ulx: add boot phase properties
  ARM: dts: imx6ul[l]-tqma6ul[l]: add boot phase properties
  ARM: dts: imx6ul/imx6ull: add boot phase properties
  ARM: dts: imx6qdl-mba6: add boot phase properties
  ARM: dts: imx6qdl-tqma6: add boot phase properties
  ARM: dts: imx6qdl: add boot phase properties
  ...
2026-04-16 20:28:48 -07:00
Linus Torvalds
440d6635b2 Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:

 - "pid: make sub-init creation retryable" (Oleg Nesterov)

   Make creation of init in a new namespace more robust by clearing away
   some historical cruft which is no longer needed. Also some
   documentation fixups

 - "selftests/fchmodat2: Error handling and general" (Mark Brown)

   Fix and a cleanup for the fchmodat2() syscall selftest

 - "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)

 - "hung_task: Provide runtime reset interface for hung task detector"
   (Aaron Tomlin)

   Give administrators the ability to zero out
   /proc/sys/kernel/hung_task_detect_count

 - "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" (Thomas Weißschuh)

   Teach getdelays to use the in-kernel UAPI headers rather than the
   system-provided ones

 - "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)

   Several cleanups and fixups to the hardlockup detector code and its
   documentation

 - "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)

   A couple of small/theoretical fixes in the bch code

 - "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)

 - "cleanup the RAID5 XOR library" (Christoph Hellwig)

   A quite far-reaching cleanup to this code. I can't do better than to
   quote Christoph:

     "The XOR library used for the RAID5 parity is a bit of a mess right
      now. The main file sits in crypto/ despite not being cryptography
      and not using the crypto API, with the generic implementations
      sitting in include/asm-generic and the arch implementations
      sitting in an asm/ header in theory. The latter doesn't work for
      many cases, so architectures often build the code directly into
      the core kernel, or create another module for the architecture
      code.

      Change this to a single module in lib/ that also contains the
      architecture optimizations, similar to the library work Eric
      Biggers has done for the CRC and crypto libraries later. After
      that it changes to better calling conventions that allow for
      smarter architecture implementations (although none is contained
      here yet), and uses static_call to avoid indirection function call
      overhead"

 - "lib/list_sort: Clean up list_sort() scheduling workarounds"
   (Kuan-Wei Chiu)

   Clean up this library code by removing a hacky thing which was added
   for UBIFS, which UBIFS doesn't actually need

 - "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)

   Fix a few bugs in the scatterlist code, add in-kernel tests for the
   now-fixed bugs and fix a leak in the test itself

 - "kdump: Enable LUKS-encrypted dump target support in ARM64 and
   PowerPC" (Coiby Xu)

   Enable support of the LUKS-encrypted device dump target on arm64 and
   powerpc

 - "ocfs2: consolidate extent list validation into block read callbacks"
   (Joseph Qi)

   Cleanup, simplify, and make more robust ocfs2's validation of extent
   list fields (Kernel test robot loves mounting corrupted fs images!)

* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
  ocfs2: validate group add input before caching
  ocfs2: validate bg_bits during freefrag scan
  ocfs2: fix listxattr handling when the buffer is full
  doc: watchdog: fix typos etc
  update Sean's email address
  ocfs2: use get_random_u32() where appropriate
  ocfs2: split transactions in dio completion to avoid credit exhaustion
  ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
  ocfs2: validate extent block list fields during block read
  ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
  ocfs2: validate dx_root extent list fields during block read
  ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
  ocfs2: handle invalid dinode in ocfs2_group_extend
  .get_maintainer.ignore: add Askar
  ocfs2: validate bg_list extent bounds in discontig groups
  checkpatch: exclude forward declarations of const structs
  tools/accounting: handle truncated taskstats netlink messages
  taskstats: set version in TGID exit notifications
  ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
  arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
  ...
2026-04-16 20:11:56 -07:00
Linus Torvalds
0b2f2b1fc0 Merge tag 'v7.1-rc1-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:

 - Fix integer underflow in encrypted read

 - Four debug patches, adding a few tracepoints

 - Minor update to MAINTAINERS file (preferred server URL for cifs)

 - Remove the BUG_ON() calls in d_mark_tmpfile_name

* tag 'v7.1-rc1-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  MAINTAINERS: change git.samba.org to https
  smb: client: fix integer underflow in receive_encrypted_read()
  smb: client: add tracepoints for deferred handle caching
  smb: client: add oplock level to smb3_open_done tracepoint
  smb: client: add tracepoint for local lock conflicts
  smb: client: add tracepoints for lock operations
  vfs: get rid of BUG_ON() in d_mark_tmpfile_name()
2026-04-16 19:14:55 -07:00
Linus Torvalds
3cd8b194bf Merge tag 'v7.1-rc-part1-smbdirect-fixes' of git://git.samba.org/ksmbd
Pull smbdirect updates from Steve French:
 "Move smbdirect server and client code to common directory:

   - temporary use of smbdirect_all_c_files.c to allow micro steps

   - factor out common functions into a smbdirect.ko.

   - convert cifs.ko to use smbdirect.ko

   - convert ksmbd.ko to use smbdirect.ko

   - let smbdirect.ko use global workqueues

   - move ib_client logic from ksmbd.ko into smbdirect.ko

   - remove smbdirect_all_c_files.c hack again

   - some locking and teardown related fixes on top"

* tag 'v7.1-rc-part1-smbdirect-fixes' of git://git.samba.org/ksmbd: (145 commits)
  smb: smbdirect: let smbdirect_connection_deregister_mr_io unlock while waiting
  smb: smbdirect: fix the logic in smbdirect_socket_destroy_sync() without an error
  smb: smbdirect: fix copyright header of smbdirect.h
  smb: smbdirect: change smbdirect_socket_parameters.{initiator_depth,responder_resources} to __u16
  smb: smbdirect: remove unused SMBDIRECT_USE_INLINE_C_FILES logic
  smb: server: no longer use smbdirect_socket_set_custom_workqueue()
  smb: client: no longer use smbdirect_socket_set_custom_workqueue()
  smb: smbdirect: introduce global workqueues
  smb: smbdirect: prepare use of dedicated workqueues for different steps
  smb: smbdirect: remove unused smbdirect_connection_mr_io_recovery_work()
  smb: smbdirect: wrap rdma_disconnect() in rdma_[un]lock_handler()
  smb: server: make use of smbdirect_netdev_rdma_capable_mode_type()
  smb: smbdirect: introduce smbdirect_netdev_rdma_capable_mode_type()
  smb: server: make use of smbdirect.ko
  smb: server: remove unused ksmbd_transport_ops.prepare()
  smb: server: make use of smbdirect_socket_{listen,accept}()
  smb: server: only use public smbdirect functions
  smb: server: make use of smbdirect_socket_create_accepting()/smbdirect_socket_release()
  smb: server: make use of smbdirect_{socket_init_accepting,connection_wait_for_connected}()
  smb: server: make use of smbdirect_connection_send_iter() and related functions
  ...
2026-04-16 08:25:04 -07:00
Linus Torvalds
d3d9443f8b Merge tag 'livepatching-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching updates from Petr Mladek:

 - Add two new selftests

* tag 'livepatching-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  selftests/livepatch: add test for module function patching
  selftests: livepatch: test-ftrace: livepatch a traced function
2026-04-16 08:13:27 -07:00
Linus Torvalds
090748e62f Merge tag 'm68k-for-v7.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:

 - Add support for QEMU virt-ctrl, and use it for system reset
   and power off on the virt platform

 - defconfig updates

 - Miscellaneous fixes and improvements

* tag 'm68k-for-v7.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: virt: Switch to qemu-virt-ctrl driver
  power: reset: Add QEMU virt-ctrl driver
  m68k: defconfig: Update defconfigs for v7.0-rc1
  m68k: emu: Replace unbounded sprintf() in nfhd_init_one()
  m68k: uapi: Add ucontext.h
  m68k: defconfig: hp300: Enable monochrome and 16-color linux logos
  m68k: q40: Remove commented out code
2026-04-16 08:11:01 -07:00
Linus Torvalds
948ef73f7e Merge tag 'efi-next-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
 "Again not a busy cycle for EFI, just some minor tweaks and bug fixes:

   - Enable boot graphics resource table (BGRT) on Xen/x86

   - Correct a misguided assumption in the memory attributes table
     sanity check

   - Start tagging efi_mem_reserve()'d regions as MEMBLOCK_RSRV_KERN

   - Some other minor fixes and cleanups"

* tag 'efi-next-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi/capsule-loader: fix incorrect sizeof in phys array reallocation
  efi: Tag memblock reservations of boot services regions as RSRV_KERN
  memblock: Permit existing reserved regions to be marked RSRV_KERN
  efi/memattr: Fix thinko in table size sanity check
  efi: libstub: fix type of fdt 32 and 64bit variables
  efi: Drop unused efi_range_is_wc() function
  efi: Enable BGRT loading under Xen
  efi: make efi_mem_type() and efi_mem_attributes() work on Xen PV
2026-04-16 08:06:25 -07:00
Linus Torvalds
f0bf3eac92 Merge tag 'vfio-v7.1-rc1' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:

 - Update QAT vfio-pci variant driver for Gen 5, 420xx devices (Vijay
   Sundar Selvamani, Suman Kumar Chakraborty, Giovanni Cabiddu)

 - Fix vfio selftest MMIO DMA mapping selftest (Alex Mastro)

 - Conversions to const struct class in support of class_create()
   deprecation (Jori Koolstra)

 - Improve selftest compiler compatibility by avoiding initializer on
   variable-length array (Manish Honap)

 - Define new uAPI for drivers supporting migration to advise user-
   space of new initial data for reducing target startup latency.
   Implemented for mlx5 vfio-pci variant driver (Yishai Hadas)

 - Enable vfio selftests on aarch64, not just cross-compiles reporting
   arm64 (Ted Logan)

 - Update vfio selftest driver support to include additional DSA devices
   (Yi Lai)

 - Unconditionally include debugfs root pointer in vfio device struct,
   avoiding a build failure seen in hisi_acc variant driver without
   debugfs otherwise (Arnd Bergmann)

 - Add support for the s390 ISM (Internal Shared Memory) device via a
   new variant driver. The device is unique in the size of its BAR space
   (256TiB) and lack of mmap support (Julian Ruess)

 - Enforce that vfio-pci drivers implement a name in their ops structure
   for use in sequestering SR-IOV VFs (Alex Williamson)

 - Prune leftover group notifier code (Paolo Bonzini)

 - Fix Xe vfio-pci variant driver to avoid migration support as a
   dependency in the reset path and missing release call (Michał
   Winiarski)

* tag 'vfio-v7.1-rc1' of https://github.com/awilliam/linux-vfio: (23 commits)
  vfio/xe: Add a missing vfio_pci_core_release_dev()
  vfio/xe: Reorganize the init to decouple migration from reset
  vfio: remove dead notifier code
  vfio/pci: Require vfio_device_ops.name
  MAINTAINERS: add VFIO ISM PCI DRIVER section
  vfio/ism: Implement vfio_pci driver for ISM devices
  vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it
  vfio: unhide vdev->debug_root
  vfio/qat: add support for Intel QAT 420xx VFs
  vfio: selftests: Support DMR and GNR-D DSA devices
  vfio: selftests: Build tests on aarch64
  vfio/mlx5: Add REINIT support to VFIO_MIG_GET_PRECOPY_INFO
  vfio/mlx5: consider inflight SAVE during PRE_COPY
  net/mlx5: Add IFC bits for migration state
  vfio: Adapt drivers to use the core helper vfio_check_precopy_ioctl
  vfio: Add support for VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2
  vfio: Define uAPI for re-init initial bytes during the PRE_COPY phase
  vfio: selftests: Fix VLA initialisation in vfio_pci_irq_set()
  vfio: uapi: fix comment typo
  vfio: mdev: replace mtty_dev->vd_class with a const struct class
  ...
2026-04-16 08:01:16 -07:00
Petr Mladek
448c0f8cb7 Merge branch 'for-7.1/module-function-test' into for-linus 2026-04-16 10:33:43 +02:00
Stefan Metzmacher
d09a040c18 smb: smbdirect: let smbdirect_connection_deregister_mr_io unlock while waiting
We should not hold a mutex locked during wait_for_completion()
holding a reference is enough.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Henrique Carvalho <henrique.carvalho@suse.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
25c2e34931 smb: smbdirect: fix the logic in smbdirect_socket_destroy_sync() without an error
If smbdirect_socket_destroy_sync() and sc->first_error was not set
we should set -ESHUTDOWN, that's a better condition
doing it only implicitly with the
sc->status < SMBDIRECT_SOCKET_DISCONNECTING check.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Henrique Carvalho <henrique.carvalho@suse.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
3892007f2b smb: smbdirect: fix copyright header of smbdirect.h
Everything in smbdirect.h was taken from my out of
tree prototype.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Henrique Carvalho <henrique.carvalho@suse.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
735610d0ce smb: smbdirect: change smbdirect_socket_parameters.{initiator_depth,responder_resources} to __u16
We still limit this to U8_MAX as the rdma api only uses __u8
and that's also the limit for Infiniband and RoCE*,
while iWarp would be able to support larger values at
the protocol level.

As struct smbdirect_socket_parameters will be part
of the uapi for IPPROTO_SMBDIRECT in future, change it
now even if userspace sockets won't be supported yet.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
aa43bb2c0f smb: smbdirect: remove unused SMBDIRECT_USE_INLINE_C_FILES logic
We always build as standalone module (or as part of the core kernel).

This also removes unused elements from struct smbdirect_socket
and unused exports.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
649c47559a smb: server: no longer use smbdirect_socket_set_custom_workqueue()
smbdirect.ko has global workqueues now, so we should use these
default once.

Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
73dc52d294 smb: client: no longer use smbdirect_socket_set_custom_workqueue()
smbdirect.ko has global workqueues now, so we should use these
default once.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
1adde16a9e smb: smbdirect: introduce global workqueues
These will be used in future and callers should no
longer use smbdirect_socket_set_custom_workqueue().

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
e4ce1fca04 smb: smbdirect: prepare use of dedicated workqueues for different steps
This is a preparation in order to have global workqueues in
the smbdirect module instead of having the caller to
provide one.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
00ac2a4fe0 smb: smbdirect: remove unused smbdirect_connection_mr_io_recovery_work()
This would actually never be used as we only move to
SMBDIRECT_MR_ERROR when we directly call
smbdirect_socket_schedule_cleanup().

Doing an ib_dereg_mr/ib_alloc_mr dance on
working connection is not needed and
it's also pointless on a broken connection
as we don't reuse any ib_pd.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
a40e6f0166 smb: smbdirect: wrap rdma_disconnect() in rdma_[un]lock_handler()
This might not be needed, but it controls the order
of ib_drain_qp() and rdma_disconnect().

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
33b2894e8d smb: server: make use of smbdirect_netdev_rdma_capable_mode_type()
This removes is basically the same logic.

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
81a7a3a0fa smb: smbdirect: introduce smbdirect_netdev_rdma_capable_mode_type()
This is basically a copy of ksmbd_rdma_capable_netdev() in the
server, but this also prints a message when a device is renamed.

The differences are:
- It uses rdma_for_each_port() instead of implementing the
  same logic again.
- It returns RDMA_NODE_{UNSPECIFIED,IB_CA,RNIC} values instead of bool

Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
50bdab9ae4 smb: server: make use of smbdirect.ko
This means we no longer inline the common smbdirect
.c files and use the exported functions from the
module instead.

Note the connection specific logging is still
redirect to ksmbd.ko functions via
smbdirect_socket_set_logging().

We still don't use real socket layer,
but we're very close...

Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00
Stefan Metzmacher
98bdc5fda9 smb: server: remove unused ksmbd_transport_ops.prepare()
This is no longer needed for smbdirect.

Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-04-15 21:58:24 -05:00