Commit Graph

1430063 Commits

Author SHA1 Message Date
Chuck Lever
704f3f640f xprtrdma: Post receive buffers after RPC completion
rpcrdma_post_recvs() runs in CQ poll context and its cost
falls on the latency-critical path between polling a Receive
completion and waking the RPC consumer. Every cycle spent
refilling the Receive Queue delays delivery of the reply to
the NFS layer.

Move the rpcrdma_post_recvs() call in rpcrdma_reply_handler()
to after the RPC has been decoded and completed. The larger
batch size from the preceding patch provides sufficient
Receive Queue headroom to absorb the brief delay before
buffers are replenished.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 12:05:01 -07:00
Chuck Lever
93b4791adb xprtrdma: Scale receive batch size with credit window
The fixed RPCRDMA_MAX_RECV_BATCH of 7 results in frequent
small ib_post_recv batches during high-rate workloads. With
a 128-slot credit window, receives are reposted every 7th
completion, each batch incurring atomic serialization and a
doorbell write.

Replace the fixed batch constant with a per-endpoint value
scaled to 25% of the negotiated credit window. For a typical
128-credit connection this raises the batch from 7 to 32,
reducing doorbell frequency by roughly 4x and amortizing the
per-batch atomic and MMIO costs over a larger group of
receive WRs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 12:05:00 -07:00
Chuck Lever
7a079ab57c xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor
The FRWR registration path converts data through three
representations: xdr_buf -> rpcrdma_mr_seg[] -> scatterlist[]
-> ib_map_mr_sg(). The rpcrdma_mr_seg intermediate is a relic
of when multiple registration strategies existed (FMR, physical,
FRWR). Only FRWR remains, so this indirection and the 6240-byte
rl_segments[260] array embedded in each rpcrdma_req serve no
purpose.

Introduce struct rpcrdma_xdr_cursor to track position within
an xdr_buf during iterative MR registration. Rewrite frwr_map to
populate scatterlist entries directly from the xdr_buf regions
(head kvec, page list, tail kvec). The boundary logic for
non-SG_GAPS devices is simpler because the xdr_buf structure
guarantees that page-region entries after the first start at
offset 0, and that head/tail kvecs are separate regions that
naturally break at MR boundaries.

Fix a pre-existing bug in rpcrdma_encode_write_list where the
write-pad statistics accumulator added mr->mr_length from the last
data MR rather than the write-pad MR. The refactored code uses
ep->re_write_pad_mr->mr_length.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 12:04:58 -07:00
Chuck Lever
6f2e565fb3 xprtrdma: Decouple frwr_wp_create from frwr_map
frwr_wp_create is the only caller of frwr_map outside the encode
path. It registers a single 4-byte write-pad region from a stack-
local rpcrdma_mr_seg. Inlining the registration logic directly
(sg_init_table + sg_set_page + ib_dma_map_sg + ib_map_mr_sg +
IOVA mangle + reg_wr setup) eliminates the coupling that would
otherwise complicate the removal of rpcrdma_mr_seg from frwr_map's
interface.

The inlined version adds a proper error-unwind ladder: on failure,
the DMA mapping (if established) is released, ep->re_write_pad_mr is
cleared, and the MR is returned to the transport free list. The old
frwr_map-based code relied on rpcrdma_mrs_destroy at teardown to
reclaim partially-initialized MRs.

This is a one-time setup path; duplicating ~20 lines is a reasonable
tradeoff for decoupling the write-pad registration from the data-
path MR registration.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 11:56:27 -07:00
Chuck Lever
765bde47fe xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot
xprt_rdma_alloc_slot() and xprt_rdma_free_slot() lack serialization
between the buffer pool and the backlog queue.  A buffer freed
after rpcrdma_buffer_get() finds the pool empty but before
rpc_sleep_on() places the task on the backlog is returned to the
pool with no waiter to wake, leaving the task stuck on the backlog
indefinitely.

After joining the backlog, re-check the pool and route any
recovered buffer through xprt_wake_up_backlog(), whose queue lock
serializes with concurrent wakeups and avoids double-assignment
of slots.

Because xprt_rdma_free_slot() does not hold reserve_lock, the
XPRT_CONGESTED double-check in xprt_throttle_congested() is
ineffective: a task can join the backlog through that path after
free_slot has already found it empty and cleared the bit.  Avoid
this by using xprt_add_backlog_noncongested(), which queues the
task without setting XPRT_CONGESTED, so every allocation reaches
xprt_rdma_alloc_slot() and its post-sleep re-check.

Fixes: edb41e61a5 ("xprtrdma: Make rpc_rqst part of rpcrdma_req")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 11:56:20 -07:00
Chuck Lever
100142093e xprtrdma: Avoid 250 ms delay on backlog wakeup
Commit a721035477 ("SUNRPC/xprt: async tasks mustn't block waiting
for memory") changed xprt_rdma_alloc_slot() to set tk_status to
-ENOMEM so that call_reserveresult() would sleep HZ/4 before
retrying.  That rationale applies to xprt_dynamic_alloc_slot(),
where an immediate retry under memory pressure wastes CPU, but not
to the RDMA backlog path: a task woken from the backlog has a slot
waiting for it, so the 250 ms rpc_delay adds latency without
benefit.

This also aligns the code with the existing kernel-doc for
xprt_rdma_alloc_slot(), which already documented %-EAGAIN.

Fixes: a721035477 ("SUNRPC/xprt: async tasks mustn't block waiting for memory")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 11:52:54 -07:00
Chuck Lever
24297c7cd3 xprtrdma: Close sendctx get/put race that can block a transport
rpcrdma_sendctx_get_locked() and rpcrdma_sendctx_put_locked() can
race in a way that leaves XPRT_WRITE_SPACE set permanently, blocking
all further sends on the transport:

  get_locked              put_locked (Send completion)
  ----------              --------------------------
  read rb_sc_tail
    -> ring full
                          advance rb_sc_tail
                          xprt_write_space():
                            test_bit(WRITE_SPACE)
                            -> not set, return
  set_bit(WRITE_SPACE)
  return NULL (-EAGAIN)

After the sender releases XPRT_LOCKED, the release path refuses to
wake the next task because XPRT_WRITE_SPACE is set. The sender
retries, finds XPRT_WRITE_SPACE still set, and sleeps on
xprt_sending. No further Send completions arrive to clear the flag
because no new Sends can be posted.

With nconnect, the stalled transport's share of congestion credits
are never returned, starving the remaining transports as well.

Fixes: 05eb06d866 ("xprtrdma: Fix occasional transport deadlock")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 11:52:49 -07:00
Jeff Layton
9c332d7f63 nfs: update inode ctime after removexattr operation
xfstest generic/728 fails with delegated timestamps. The client does a
removexattr and then a stat to test the ctime, which doesn't change. The
stat() doesn't trigger a GETATTR because of the delegated timestamps, so
it relies on the cached ctime, which is wrong.

The setxattr compound has a trailing GETATTR, which ensures that its
ctime gets updated. Follow the same strategy with removexattr.

Fixes: 3e1f02123f ("NFSv4.2: add client side XDR handling for extended attributes")
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 11:46:08 -07:00
Jeff Layton
16d99dce93 nfs: fix utimensat() for atime with delegated timestamps
xfstest generic/221 is failing with delegated timestamps enabled.  When
the client holds a WRITE_ATTRS_DELEG delegation, and a userland process
does a utimensat() for only the atime, the ctime is not properly
updated. The problem is that the client tries to cache the atime update,
but there is no mtime update, so the delegated attribute update never
updates the ctime.

Delegated timestamps don't have a mechanism to update the ctime in
accordance with atime-only changes due to utimensat() and the like.
Change the client to issue an RPC in this case, so that the ctime gets
properly updated alongside the atime.

Fixes: 40f45ab381 ("NFS: Further fixes to attribute delegation a/mtime changes")
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 11:46:01 -07:00
Olga Kornievskaia
3a06bac55b NFS: improve "Server wrote zero bytes" error
When a pnfs error occurs, the IO is retried against the MDS. However,
the initial IO leads to the kernel logging "Serer wrote zero bytes"
when in fact the MDS IO will not fail and thus the error misleads
administrators that the system is experiencing issues.

When pnfs IO fails which triggers pnfs_write_done_resent_to_mds() which
would end up clearing nfs_pgio_header's pages structure (copying the
content into a new one to do new RPC calls to the MDS). Thus,
in nfs_writeback_result() when we have no pages to work with no need
to try and also therefore skip logging the message about 0bytes.

Fixes: 6c75dc0d49 ("NFS: merge _full and _partial write rpc_ops")
Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 11:17:31 -07:00
Trond Myklebust
1805e6b2f4 NFSv4/pnfs: If the server is down, retry the layout returns on reboot
If a layout return is embedded in a CLOSE or DELEGRETURN rpc call, and
the metadata server reboots, the expectation now is that the client
should resend the layout return once the server comes back up.
This patch changes the current behaviour of dropping the layouts on the
floor, and instead queues them up for retrying.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2026-04-13 09:26:19 -07:00
Linus Torvalds
028ef9c96e Linux 7.0 v7.0 2026-04-12 13:48:06 -07:00
Linus Torvalds
10d97b74e2 Merge tag 'edac_urgent_for_7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:

 - Fix the error path ordering when the driver-private descriptor
   allocation fails

* tag 'edac_urgent_for_7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/mc: Fix error path ordering in edac_mc_alloc()
2026-04-12 11:56:07 -07:00
Linus Torvalds
35bdc192d8 Merge tag 'wq-for-7.0-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
 "This is a fix for a stall which triggers on ordered workqueues when
  there are multiple inactive work items during workqueue property
  changes through sysfs, which doesn't happen that frequently.

  While really late, the fix is very low risk as it just repeats an
  operation which is already being performed:

   - Fix incomplete activation of multiple inactive works when
     unplugging a pool_workqueue, where the pending_pwqs list
     wasn't being updated for subsequent works"

* tag 'wq-for-7.0-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Add pool_workqueue to pending_pwqs list when unplugging multiple inactive works
2026-04-12 10:42:40 -07:00
Linus Torvalds
ab3dee2640 Merge tag 'timers-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "Two fixes for the time/timers subsystem:

   - Invert the inverted fastpath decision in check_tick_dependency(),
     which prevents NOHZ full to stop the tick. That's a regression
     introduced in the 7.0 merge window.

   - Prevent a unpriviledged DoS in the clockevents code, where user
     space can starve the timer interrupt by arming a timerfd or posix
     interval timer in a tight loop with an absolute expiry time in the
     past. The fix turned out to be incomplete and was was amended
     yesterday to make it work on some 20 years old AMD machines as
     well. All issues with it have been confirmed to be resolved by
     various reporters"

* tag 'timers-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents: Prevent timer interrupt starvation
  tick/nohz: Fix inverted return value in check_tick_dependency() fast path
2026-04-12 10:01:55 -07:00
Linus Torvalds
02640d8886 Merge tag 'sched-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "Fix DL server related slowdown to deferred fair tasks"

* tag 'sched-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Use revised wakeup rule for dl_server
2026-04-12 08:30:20 -07:00
Linus Torvalds
d71358127c Merge tag 'ras-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MCE fix from Ingo Molnar:
 "Fix incorrect hardware errors reported on Zen3 CPUs, such as bogus
  L3 cache deferred errors (Yazen Ghannam)"

* tag 'ras-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce/amd: Filter bogus hardware errors on Zen3 clients
2026-04-12 08:27:09 -07:00
Linus Torvalds
c919577eee Merge tag 'perf-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Four Intel uncore PMU driver fixes by Zide Chen"

* tag 'perf-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Remove extra double quote mark
  perf/x86/intel/uncore: Fix die ID init and look up bugs
  perf/x86/intel/uncore: Skip discovery table for offline dies
  perf/x86/intel/uncore: Fix iounmap() leak on global_init failure
2026-04-12 08:17:52 -07:00
Linus Torvalds
8648ac819d Merge tag 'v7.0-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:

 - Enforce rx socket buffer limit in af_alg

 - Fix array overflow in af_alg_pull_tsgl

 - Fix out-of-bounds access when parsing extensions in X.509

 - Fix minimum rx size check in algif_aead

* tag 'v7.0-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algif_aead - Fix minimum RX size check for decryption
  X.509: Fix out-of-bounds access when parsing extensions
  crypto: af_alg - Fix page reassignment overflow in af_alg_pull_tsgl
  crypto: af_alg - limit RX SG extraction by receive buffer budget
2026-04-12 08:11:02 -07:00
Herbert Xu
3d14bd48e3 crypto: algif_aead - Fix minimum RX size check for decryption
The check for the minimum receive buffer size did not take the
tag size into account during decryption.  Fix this by adding the
required extra length.

Reported-by: syzbot+aa11561819dc42ebbc7c@syzkaller.appspotmail.com
Reported-by: Daniel Pouzzner <douzzer@mega.nu>
Fixes: d887c52d6a ("crypto: algif_aead - overhaul memory management")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-04-12 13:38:19 +08:00
Lukas Wunner
d702c34082 X.509: Fix out-of-bounds access when parsing extensions
Leo reports an out-of-bounds access when parsing a certificate with
empty Basic Constraints or Key Usage extension because the first byte of
the extension is read before checking its length.  Fix it.

The bug can be triggered by an unprivileged user by submitting a
specially crafted certificate to the kernel through the keyrings(7) API.
Leo has demonstrated this with a proof-of-concept program responsibly
disclosed off-list.

Fixes: 30eae2b037 ("KEYS: X.509: Parse Basic Constraints for CA")
Fixes: 567671281a ("KEYS: X.509: Parse Key Usage")
Reported-by: Leo Lin <leo@depthfirst.com> # off-list
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ignat Korchagin <ignat@linux.win>
Cc: stable@vger.kernel.org # v6.4+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-04-12 13:38:19 +08:00
Herbert Xu
31d00156e5 crypto: af_alg - Fix page reassignment overflow in af_alg_pull_tsgl
When page reassignment was added to af_alg_pull_tsgl the original
loop wasn't updated so it may try to reassign one more page than
necessary.

Add the check to the reassignment so that this does not happen.

Also update the comment which still refers to the obsolete offset
argument.

Reported-by: syzbot+d23888375c2737c17ba5@syzkaller.appspotmail.com
Fixes: e870456d8e ("crypto: algif_skcipher - overhaul memory management")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-04-12 13:38:19 +08:00
Linus Torvalds
f5459048c3 Merge tag 'i2c-for-7.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:

 - imx: set dma_slave_config to 0 and avoid uninitialized fields

* tag 'i2c-for-7.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: imx: zero-initialize dma_slave_config for eDMA
2026-04-11 17:06:05 -07:00
Linus Torvalds
e753c16cb3 Merge tag 'spi-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A couple of changes here, one update to MAINTAINERS for the AMD
  controller and a chnage from Pei Xiao which in spite of the changelog
  is actually a fix - previously the zynq-qspi driver leaked a clock
  enable for every flash operation it did which isn't good, these extra
  enables were removed when doing the enable cleanup which are probably
  a good idea anyway"

* tag 'spi-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  MAINTAINERS: Update AMD SPI driver maintainers
  spi: zynq-qspi: Simplify clock handling with devm_clk_get_enabled()
2026-04-11 12:49:21 -07:00
Linus Torvalds
e8ab311052 Merge tag 'regulator-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "One last fix for v7.0, the BD72720 incorrectly described which DCDC is
  tied to the LDO for its LDON-HEAD mode which automates using the DCDC
  to more efficiently drop a supply for delivery via the LDO"

* tag 'regulator-fix-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: bd71828-regulator.c: Fix LDON-HEAD mode
2026-04-11 12:35:16 -07:00
Linus Torvalds
086aca1030 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "s390:
   - vsie: Fix races with partial gmap invalidations

  x86:
   - Use __DECLARE_FLEX_ARRAY() for UAPI structures with VLAs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: vsie: Fix races with partial gmap invalidations
  KVM: x86: Use __DECLARE_FLEX_ARRAY() for UAPI structures with VLAs
2026-04-11 11:45:20 -07:00
Linus Torvalds
558b9206d5 Merge tag 'probes-fixes-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing probe fix from Masami Hiramatsu:
 "Reject non-closed empty immediate strings

  Fix a buffer index underflow bug that occurred when passing an
  non-closed empty immediate string to the probe event"

* tag 'probes-fixes-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/probe: reject non-closed empty immediate strings
2026-04-11 11:33:08 -07:00
Linus Torvalds
6b5199f4cf Merge tag 'usb-7.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fix from Greg KH:
 "Here is a single USB fix for a reported regression in a recent USB
  typec patch for 7.0-final. Sorry for the late submission, but it does
  fix a problem that people have been seeing with 7.0-rc7 and the stable
  kernels (due to a backported fix from there.)

  This has been in linux-next this week with no reported issues, and the
  reporter (Takashi), has said it resolves the problem they were seeing"

* tag 'usb-7.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: ucsi: skip connector validation before init
2026-04-11 11:30:18 -07:00
Linus Torvalds
778322a06e Merge tag 'input-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "Two fixes for force feedback handling in uinput driver:

   - fix circular locking dependency in uinput

   - fix potential corruption of uinput event queue"

* tag 'input-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: uinput - take event lock when submitting FF request "event"
  Input: uinput - fix circular locking dependency with ff-core
2026-04-11 11:12:38 -07:00
Paolo Bonzini
1fe7294dfb Merge tag 'kvm-s390-master-7.0-4' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: One very last second fix

Fix one more gmap-rewrite issue: races with partial gmap invalidations.
2026-04-11 14:11:23 +02:00
Paolo Bonzini
0e9b0e0124 Merge tag 'kvm-x86-fixes-7.1' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 7.1

Declare flexible arrays in uAPI structures using __DECLARE_FLEX_ARRAY() so
that KVM's uAPI headers can be included in C++ projects.
2026-04-11 14:10:44 +02:00
Linus Torvalds
e774d5f1bc Merge tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
 "Before v7.0 is released, fix a few issues with the CFI patchset,
  merged earlier in v7.0-rc, that primarily affect interfaces to
  non-kernel code:

   - Improve the prctl() interface for per-task indirect branch landing
     pad control to expand abbreviations and to resemble the speculation
     control prctl() interface

   - Expand the "LP" and "SS" abbreviations in the ptrace uapi header
     file to "branch landing pad" and "shadow stack", to improve
     readability

   - Fix a typo in a CFI-related macro name in the ptrace uapi header
     file

   - Ensure that the indirect branch tracking state and shadow stack
     state are unlocked immediately after an exec() on the new task so
     that libc subsequently can control it

   - While working in this area, clean up the kernel-internal,
     cross-architecture prctl() function names by expanding the
     abbreviations mentioned above"

* tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  prctl: cfi: change the branch landing pad prctl()s to be more descriptive
  riscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headers
  prctl: rename branch landing pad implementation functions to be more explicit
  riscv: ptrace: expand "LP" references to "branch landing pads" in uapi headers
  riscv: cfi: clear CFI lock status in start_thread()
  riscv: ptrace: cfi: fix "PRACE" typo in uapi header
2026-04-10 17:27:08 -07:00
Linus Torvalds
c43adb3613 Merge tag 'drm-fixes-2026-04-11' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Last set of fixes, a few vc4, and i915, one xe and one ethosu Kconfig
  fix.

  xe:
   - Fix HW engine idleness unit conversion

  i915:
   - Drop check for changed VM in EXECBUF
   - Fix refcount underflow race in intel_engine_park_heartbeat
   - Do not use pipe_src as borders for SU area in PSR

  vc4:
   - runtime pm reference fix
   - memory leak fixes
   - locking fix

  ethosu:
   - make ARM only"

* tag 'drm-fixes-2026-04-11' of https://gitlab.freedesktop.org/drm/kernel:
  drm/i915/gem: Drop check for changed VM in EXECBUF
  drm/i915/gt: fix refcount underflow in intel_engine_park_heartbeat
  drm/xe: Fix bug in idledly unit conversion
  drm/i915/psr: Do not use pipe_src as borders for SU area
  accel: ethosu: Add hardware dependency hint
  drm/vc4: Protect madv read in vc4_gem_object_mmap() with madv_lock
  drm/vc4: Fix a memory leak in hang state error path
  drm/vc4: Fix memory leak of BO array in hang state
  drm/vc4: Release runtime PM reference after binding V3D
2026-04-10 17:18:20 -07:00
Dave Airlie
b3be33f2c1 Merge tag 'drm-intel-fixes-2026-04-09' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Drop check for changed VM in EXECBUF
- Fix refcount underflow race in intel_engine_park_heartbeat
- Do not use pipe_src as borders for SU area in PSR

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patch.msgid.link/add6fPHRC7Bc8Uri@jlahtine-mobl
2026-04-11 07:35:22 +10:00
Thomas Gleixner
d6e152d905 clockevents: Prevent timer interrupt starvation
Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
up in user space. He provided a reproducer, which sets up a timerfd based
timer and then rearms it in a loop with an absolute expiry time of 1ns.

As the expiry time is in the past, the timer ends up as the first expiring
timer in the per CPU hrtimer base and the clockevent device is programmed
with the minimum delta value. If the machine is fast enough, this ends up
in a endless loop of programming the delta value to the minimum value
defined by the clock event device, before the timer interrupt can fire,
which starves the interrupt and consequently triggers the lockup detector
because the hrtimer callback of the lockup mechanism is never invoked.

As a first step to prevent this, avoid reprogramming the clock event device
when:
     - a forced minimum delta event is pending
     - the new expiry delta is less then or equal to the minimum delta

Thanks to Calvin for providing the reproducer and to Borislav for testing
and providing data from his Zen5 machine.

The problem is not limited to Zen5, but depending on the underlying
clock event device (e.g. TSC deadline timer on Intel) and the CPU speed
not necessarily observable.

This change serves only as the last resort and further changes will be made
to prevent this scenario earlier in the call chain as far as possible.

[ tglx: Updated to restore the old behaviour vs. !force and delta <= 0 and
  	fixed up the tick-broadcast handlers as pointed out by Borislav ]

Fixes: d316c57ff6 ("[PATCH] clockevents: add core functionality")
Reported-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Calvin Owens <calvin@wbinvd.org>
Tested-by: Borislav Petkov <bp@alien8.de>
Link: https://lore.kernel.org/lkml/acMe-QZUel-bBYUh@mozart.vkv.me/
Link: https://patch.msgid.link/20260407083247.562657657@kernel.org
2026-04-10 22:45:38 +02:00
Linus Torvalds
7c6c4ed80b Merge tag 'vfs-7.0-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
 "The kernfs rbtree is keyed by (hash, ns, name) where the hash
  is seeded with the raw namespace pointer via init_name_hash(ns).

  The resulting hash values are exposed to userspace through
  readdir seek positions, and the pointer-based ordering in
  kernfs_name_compare() is observable through entry order.

  Switch from raw pointers to ns_common::ns_id for both hashing
  and comparison.

  A preparatory commit first replaces all const void * namespace
  parameters with const struct ns_common * throughout kernfs, sysfs,
  and kobject so the code can access ns->ns_id. Also compare the
  ns_id when hashes match in the rbtree to handle crafted collisions.

  Also fix eventpoll RCU grace period issue and a cachefiles refcount
  problem"

* tag 'vfs-7.0-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  kernfs: make directory seek namespace-aware
  kernfs: use namespace id instead of pointer for hashing and comparison
  kernfs: pass struct ns_common instead of const void * for namespace tags
  eventpoll: defer struct eventpoll free to RCU grace period
  cachefiles: fix incorrect dentry refcount in cachefiles_cull()
2026-04-10 08:40:49 -07:00
Linus Torvalds
96463e4e02 Merge tag 'turbostat-fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat fixes from Len Brown:

 - Fix a memory allocation issue that could corrupt output values or
   SEGV

 - Fix a perf initilization issue that could exit on some HW + kernels

 - Minor fixes

* tag 'turbostat-fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: Allow execution to continue after perf_l2_init() failure
  tools/power turbostat: Fix delimiter bug in print functions
  tools/power turbostat: Fix --show/--hide for individual cpuidle counters
  tools/power turbostat: Fix incorrect format variable
  tools/power turbostat: Consistently use print_float_value()
  tools/power/turbostat: Fix microcode patch level output for AMD/Hygon
  tools/power turbostat: Eliminate unnecessary data structure allocation
  tools/power turbostat: Fix swidle header vs data display
  tools/power turbostat: Fix illegal memory access when SMT is present and disabled
2026-04-10 08:36:32 -07:00
Linus Torvalds
017102b40c Merge tag 'gpio-fixes-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - gracefully handle missing regmap in gpio-bd72720

 - fix IRQ resource release in gpio-tegra

 - return -ENOMEM on devm_kzalloc() failure instead of -ENODEV in
   gpio-tegra

* tag 'gpio-fixes-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: tegra: return -ENOMEM on allocation failure in probe
  gpio: tegra: fix irq_release_resources calling enable instead of disable
  gpio: bd72720: handle missing regmap
2026-04-10 08:32:30 -07:00
Linus Torvalds
77c3c619d2 Merge tag 'pinctrl-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "Some late pin control fixes. I'm not happy to have bugs so late in the
  kernel cycle, but they are all driver specifics so I guess it's how it
  is.

   - Three fixes for the Intel pin control driver fixing the feature set
     for the new silicon

   - One fix for an IRQ storm in the MCP23S08 pin controller/GPIO
     expander"

* tag 'pinctrl-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: mcp23s08: Disable all pin interrupts during probe
  pinctrl: intel: Enable 3-bit PAD_OWN feature
  pinctrl: intel: Fix the revision for new features (1kOhm PD, HW debouncer)
  pinctrl: intel: Improve capability support
2026-04-10 08:26:40 -07:00
David Arcari
ba893caead tools/power turbostat: Allow execution to continue after perf_l2_init() failure
Currently, if perf_l2_init() fails turbostat exits after issuing the
following error (which was encountered on AlderLake):

turbostat: perf_l2_init(cpu0, 0x0, 0xff24) REFS: Invalid argument

This occurs because perf_l2_init() calls err(). However, the code has been
written in such a manner that it is able to perform cleanup and continue.
Therefore, this issue can be addressed by changing the appropriate calls
to err() to warnx().

Additionally, correct the PMU type arguments passed to the warning strings
in the ecore and lcore blocks so the logs accurately reflect the failing
counter type.

Signed-off-by: David Arcari <darcari@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2026-04-10 09:04:32 -04:00
Samasth Norway Ananda
57df6923ca gpio: tegra: return -ENOMEM on allocation failure in probe
devm_kzalloc() failure in tegra_gpio_probe() returns -ENODEV, which
indicates "no such device". The correct error code for a memory
allocation failure is -ENOMEM.

Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Link: https://patch.msgid.link/20260409185853.2163034-1-samasth.norway.ananda@oracle.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2026-04-10 09:01:24 +02:00
Dave Airlie
93be8c74b6 Merge tag 'drm-misc-fixes-2026-04-09' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Several fixes for v3d about memory leak, runtime PM, and locking, and a
Kconfig improvement for ethosu.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patch.msgid.link/20260409-omniscient-tomato-coucal-edbadc@penduick
2026-04-10 14:25:57 +10:00
Linus Torvalds
9a9c8ce300 Merge tag 'kbuild-fixes-7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:

 - Make modules-cpio-pkg respect INSTALL_MOD_PATH so that it can be
   used with distribution initramfs files that have a merged /usr,
   such as Fedora

 - Silence an instance of -Wunused-but-set-global, a strengthening
   of -Wunused-but-set-variable in tip of tree Clang, in modpost,
   as the variable for extra warnings is currently unused

* tag 'kbuild-fixes-7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
  modpost: Declare extra_warn with unused attribute
  kbuild: modules-cpio-pkg: Respect INSTALL_MOD_PATH
2026-04-09 16:48:44 -07:00
Linus Torvalds
b42ed3bb88 Merge tag 'efi-fixes-for-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
 "Fix an incorrect preprocessor conditional that may result in duplicate
  instances of sysfb_primary_display on x86"

* tag 'efi-fixes-for-v7.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  firmware: efi: Never declare sysfb_primary_display on x86
2026-04-09 11:21:21 -07:00
Linus Torvalds
bb2ea74eeb Merge tag 'sound-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Still a bit higher amount than wished, but nothing looks really scary,
  and all changes are about nice and smooth device-specific fixes.

   - HD-audio quirks, one revert for a regression and another oneliner

   - AMD ACP quirks

   - Fixes for SDCA interrupt handling

   - A few Intel SOF, avs and NVL fixes

   - Fixes for TAS2552 DT, NAU8325, and STM32"

* tag 'sound-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: amd: acp: update DMI quirk and add ACP DMIC for Lenovo platforms
  ASoC: SDCA: Unregister IRQ handlers on module remove
  ASoC: SDCA: mask Function_Status value
  ASoC: SDCA: Fix overwritten var within for loop
  ASoC: stm32_sai: fix incorrect BCLK polarity for DSP_A/B, LEFT_J
  ASoC: SOF: Intel: hda: modify period size constraints for ACE4
  ALSA: hda/intel: enforce stricter period-size alignment for Intel NVL
  ASoC: nau8325: Add software reset during probe
  Revert "ALSA: hda/realtek: Add quirk for Gigabyte Technology to fix headphone"
  ASoC: Intel: avs: Fix memory leak in avs_register_i2s_test_boards()
  ASoC: SOF: Intel: fix iteration in is_endpoint_present()
  ASoC: SOF: Intel: Fix endpoint index if endpoints are missing
  ASoC: SDCA: Fix errors in IRQ cleanup
  ASoC: amd: acp: add Lenovo P16s G5 AMD quirk for legacy SDW machine
  ASoC: dt-bindings: ti,tas2552: Add sound-dai-cells
  ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14IAH10
2026-04-09 11:17:16 -07:00
Linus Torvalds
4e1538b1f1 Merge tag 'mmc-v7.0-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:

 - vub300: Fix use-after-free and NULL-deref on disconnect

* tag 'mmc-v7.0-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: vub300: fix use-after-free on disconnect
  mmc: vub300: fix NULL-deref on disconnect
2026-04-09 11:13:15 -07:00
Linus Torvalds
d58305b2db Merge tag 'pmdomain-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:

 - imx: Prevent hang at power down for imx8mp-blk-ctrl

 - thead: Fix buffer overflow for TH1520 AON driver

 - Change Ulf Hansson's email

* tag 'pmdomain-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  MAINTAINERS, mailmap: Change Ulf Hansson's email
  pmdomain: imx8mp-blk-ctrl: Keep the NOC_HDCP clock enabled
  firmware: thead: Fix buffer overflow and use standard endian macros
2026-04-09 11:09:12 -07:00
Linus Torvalds
3ffcd57823 Merge tag 'dma-mapping-7.0-2026-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fix from Marek Szyprowski:
 "A fix for DMA-mapping subsystem, which hides annoying, false-positive
  warnings from DMA-API debug on coherent platforms like x86_64 (Mikhail
  Gavrilov)"

* tag 'dma-mapping-7.0-2026-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  dma-debug: suppress cacheline overlap warning when arch has no DMA alignment requirement
2026-04-09 11:02:35 -07:00
Linus Torvalds
a55f7f5f29 Merge tag 'net-7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter, IPsec and wireless. This is again
  considerably bigger than the old average. No known outstanding
  regressions.

  Current release - regressions:

   - net: increase IP_TUNNEL_RECURSION_LIMIT to 5

   - eth: ice: fix PTP timestamping broken by SyncE code on E825C

  Current release - new code bugs:

   - eth: stmmac: dwmac-motorcomm: fix eFUSE MAC address read failure

  Previous releases - regressions:

   - core: fix cross-cache free of KFENCE-allocated skb head

   - sched: act_csum: validate nested VLAN headers

   - rxrpc: fix call removal to use RCU safe deletion

   - xfrm:
      - wait for RCU readers during policy netns exit
      - fix refcount leak in xfrm_migrate_policy_find

   - wifi: rt2x00usb: fix devres lifetime

   - mptcp: fix slab-use-after-free in __inet_lookup_established

   - ipvs: fix NULL deref in ip_vs_add_service error path

   - eth:
      - airoha: fix memory leak in airoha_qdma_rx_process()
      - lan966x: fix use-after-free and leak in lan966x_fdma_reload()

  Previous releases - always broken:

   - ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data()

   - ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group
     dump

   - bridge: guard local VLAN-0 FDB helpers against NULL vlan group

   - xsk: tailroom reservation and MTU validation

   - rxrpc:
      - fix to request an ack if window is limited
      - fix RESPONSE authenticator parser OOB read

   - netfilter: nft_ct: fix use-after-free in timeout object destroy

   - batman-adv: hold claim backbone gateways by reference

   - eth:
      - stmmac: fix PTP ref clock for Tegra234
      - idpf: fix PREEMPT_RT raw/bh spinlock nesting for async VC handling
      - ipa: fix GENERIC_CMD register field masks for IPA v5.0+"

* tag 'net-7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (104 commits)
  net: lan966x: fix use-after-free and leak in lan966x_fdma_reload()
  net: lan966x: fix page pool leak in error paths
  net: lan966x: fix page_pool error handling in lan966x_fdma_rx_alloc_page_pool()
  nfc: pn533: allocate rx skb before consuming bytes
  l2tp: Drop large packets with UDP encap
  net: ipa: fix event ring index not programmed for IPA v5.0+
  net: ipa: fix GENERIC_CMD register field masks for IPA v5.0+
  MAINTAINERS: Add Prashanth as additional maintainer for amd-xgbe driver
  devlink: Fix incorrect skb socket family dumping
  af_unix: read UNIX_DIAG_VFS data under unix_state_lock
  Revert "mptcp: add needs_id for netlink appending addr"
  mptcp: fix slab-use-after-free in __inet_lookup_established
  net: txgbe: leave space for null terminators on property_entry
  net: ioam6: fix OOB and missing lock
  rxrpc: proc: size address buffers for %pISpc output
  rxrpc: only handle RESPONSE during service challenge
  rxrpc: Fix buffer overread in rxgk_do_verify_authenticator()
  rxrpc: Fix leak of rxgk context in rxgk_verify_response()
  rxrpc: Fix integer overflow in rxgk_verify_response()
  rxrpc: Fix missing error checks for rxkad encryption/decryption failure
  ...
2026-04-09 08:39:25 -07:00
Linus Torvalds
8b02520ec5 Merge tag 'iommu-fixes-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull IOMMU fix from Will Deacon:

 - Fix regression introduced by the empty MMU gather fix in -rc7, where
   the ->iotlb_sync() callback can be elided incorrectly, resulting in
   boot failures (hangs), crashes and potential memory corruption.

* tag 'iommu-fixes-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommu: Ensure .iotlb_sync is called correctly
2026-04-09 08:36:31 -07:00