Commit Graph

1368927 Commits

Author SHA1 Message Date
Trond Myklebust
4ec752ce6d NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file
Use store_release_wake_up() instead of wake_up_var_locked(), because the
waiter cannot retake the nfs_uuid->lock.

Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Suggested-by: NeilBrown <neil@brown.name>
Link: https://lore.kernel.org/all/175262948827.2234665.1891349021754495573@noble.neil.brown.name/
Fixes: 21fb440346 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-05 16:45:40 -07:00
Trond Myklebust
fdd015de76 NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh()
In order for the wait in nfs_uuid_put() to be safe, it is necessary to
ensure that nfs_uuid_add_file() doesn't add a new entry once the
nfs_uuid->net has been NULLed out.

Also fix up the wake_up_var_locked() / wait_var_event_spinlock() to both
use the nfs_uuid address, since nfl, and &nfl->uuid could be used elsewhere.

Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/all/175262893035.2234665.1735173020338594784@noble.neil.brown.name/
Fixes: 21fb440346 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-05 16:45:40 -07:00
Trond Myklebust
e144d53cf2 NFS/localio: nfs_close_local_fh() fix check for file closed
If the struct nfs_file_localio is closed, its list entry will be empty,
but the nfs_uuid->files list might still contain other entries.

Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neil@brown.name>
Fixes: 21fb440346 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-05 16:45:39 -07:00
Trond Myklebust
b9defd611a NFSv4: Remove duplicate lookups, capability probes and fsinfo calls
When crossing into a new filesystem, the NFSv4 client will look up the
new directory, and then call nfs4_server_capabilities() as well as
nfs4_do_fsinfo() at least twice.

This patch removes the duplicate calls, and reduces the initial lookup
to retrieve just a minimal set of attributes.

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-04 09:48:16 -07:00
Trond Myklebust
b01f21cacd NFS: Fix the setting of capabilities when automounting a new filesystem
Capabilities cannot be inherited when we cross into a new filesystem.
They need to be reset to the minimal defaults, and then probed for
again.

Fixes: 54ceac4515 ("NFS: Share NFS superblocks per-protocol per-server per-FSID")
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-04 09:16:45 -07:00
Olga Kornievskaia
cc5d59081f sunrpc: fix client side handling of tls alerts
A security exploit was discovered in NFS over TLS in tls_alert_recv
due to its assumption that there is valid data in the msghdr's
iterator's kvec.

Instead, this patch proposes the rework how control messages are
setup and used by sock_recvmsg().

If no control message structure is setup, kTLS layer will read and
process TLS data record types. As soon as it encounters a TLS control
message, it would return an error. At that point, NFS can setup a kvec
backed control buffer and read in the control message such as a TLS
alert. Scott found that a msg iterator can advance the kvec pointer
as a part of the copy process thus we need to revert the iterator
before calling into the tls_alert_recv.

Fixes: dea034b963 ("SUNRPC: Capture CMSG metadata on client-side receive")
Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Suggested-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Link: https://lore.kernel.org/r/20250731180058.4669-3-okorniev@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-03 12:45:47 -07:00
Li RongQing
533210f239 nfs/localio: use read_seqbegin() rather than read_seqbegin_or_lock()
The usage of read_seqbegin_or_lock() in nfs_copy_boot_verifier()
is wrong. "seq" is always even and thus "or_lock" has no effect.

nfs_copy_boot_verifier() just copies 8 bytes and is supposed to be
very rare operation, so we do not need the adaptive locking in this case.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://lore.kernel.org/r/20250731081038.3478-1-lirongqing@baidu.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-08-03 12:45:47 -07:00
Benjamin Coddington
99765233ab NFS: Fixup allocation flags for nfsiod's __GFP_NORETRY
If the NFS client is doing writeback from a workqueue context, avoid using
__GFP_NORETRY for allocations if the task has set PF_MEMALLOC_NOIO or
PF_MEMALLOC_NOFS.  The combination of these flags makes memory allocation
failures much more likely.

We've seen those allocation failures show up when the loopback driver is
doing writeback from a workqueue to a file on NFS, where memory allocation
failure results in errors or corruption within the loopback device's
filesystem.

Suggested-by: Trond Myklebust <trondmy@kernel.org>
Fixes: 0bae835b63 ("NFS: Avoid writeback threads getting stuck in mempool_alloc()")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/f83ac1155a4bc670f2663959a7a068571e06afd9.1752111622.git.bcodding@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-28 12:01:21 -04:00
Olga Kornievskaia
9acb237def NFSv4.2: another fix for listxattr
Currently, when the server supports NFS4.1 security labels then
security.selinux label in included twice. Instead, only add it
when the server doesn't possess security label support.

Fixes: 243fea1346 ("NFSv4.2: fix listxattr to return selinux security label")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Link: https://lore.kernel.org/r/20250722205641.79394-1-okorniev@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-28 11:51:18 -04:00
Trond Myklebust
ef93a685e0 NFS: Fix filehandle bounds checking in nfs_fh_to_dentry()
The function needs to check the minimal filehandle length before it can
access the embedded filehandle.

Reported-by: zhangjian <zhangjian496@huawei.com>
Fixes: 20fa190272 ("nfs: add export operations")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 09:50:56 -04:00
Trond Myklebust
f66e6bffc5 SUNRPC: Silence warnings about parameters not being described
Warning: net/sunrpc/auth_gss/gss_krb5_crypto.c:902 function parameter
'len' not described in 'krb5_etm_decrypt'
Warning: net/sunrpc/auth_gss/gss_krb5_crypto.c:902 function parameter
'buf' not described in 'krb5_etm_decrypt'

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Trond Myklebust
ec0abdda89 NFS: Clean up pnfs_put_layout_hdr()/pnfs_destroy_layout_final()
Use the wake_up_var_locked() and wait_var_event_spinlock() helpers.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Trond Myklebust
1db3a48e83 NFS: Fix wakeup of __nfs_lookup_revalidate() in unblock_revalidate()
Use store_release_wake_up() to add the appropriate memory barrier before
calling wake_up_var(&dentry->d_fsdata).

Reported-by: Lukáš Hejtmánek<xhejtman@ics.muni.cz>
Suggested-by: Santosh Pradhan <santosh.pradhan@gmail.com>
Link: https://lore.kernel.org/all/18945D18-3EDB-4771-B019-0335CE671077@ics.muni.cz/
Fixes: 99bc9f2eb3 ("NFS: add barriers when testing for NFS_FSDATA_BLOCKED")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Christoph Hellwig
f5b3108e6a NFS: use a hash table for delegation lookup
nfs_delegation_find_inode currently has to walk the entire list of
delegations per inode, which can become pretty large, and can become even
larger when increasing the delegation watermark.

Add a hash table to speed up the delegation lookup, sized as a fraction
of the delegation watermark.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250718081509.2607553-6-hch@lst.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Christoph Hellwig
2fb4af5ea3 NFS: track active delegations per-server
The active delegation watermark was added to avoid overloading servers.
Track the active delegation per-server instead of globally so that clients
talking to multiple servers aren't limited by the global limit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250718081509.2607553-5-hch@lst.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Christoph Hellwig
aee077d8ed NFS: move the delegation_watermark module parameter
Keep the module_param_named next to the variable declaration instead of
somewhere unrelated, following the best practice in the rest of the
kernel.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250718081509.2607553-4-hch@lst.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Christoph Hellwig
7375bbad46 NFS: cleanup nfs_inode_reclaim_delegation
Reduce a level of indentation for most of the code in this function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250718081509.2607553-3-hch@lst.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Christoph Hellwig
67173860a7 NFS: cleanup error handling in nfs4_server_common_setup
Return error directly instead of using a goto label for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250718081509.2607553-2-hch@lst.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:41 -04:00
Tigran Mkrtchyan
f06bedfa62 pNFS/flexfiles: don't attempt pnfs on fatal DS errors
When an applications get killed (SIGTERM/SIGINT) while pNFS client performs a connection
to DS, client ends in an infinite loop of connect-disconnect. This
source of the issue, it that flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an error
on nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by rpc_signal_task, but
the error is treated as transient, thus retried.

The issue is reproducible with Ctrl+C the following script(there should be ~1000 files in
a directory, client should must not have any connections to DSes):

```
echo 3 > /proc/sys/vm/drop_caches

for i in *
do
   head -1 $i
done
```

The change aims to propagate the nfs4_ff_layout_prepare_ds error state
to the caller that can decide whatever this is a retryable error or not.

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Link: https://lore.kernel.org/r/20250627071751.189663-1-tigran.mkrtchyan@desy.de
Fixes: 260f32adb8 ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:17 -04:00
Christoph Hellwig
c262b444bd NFS: drop __exit from nfs_exit_keyring
Otherwise built-in NFS can lead to sectіon mismatches.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250714062450.1468117-1-hch@lst.de
Fixes: 87268f7a4f ("nfs: create a kernel keyring")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:17 -04:00
Christoph Hellwig
f3fc8f0649 NFS: pass struct nfs_client_initdata to nfs4_set_client
Passed the partially filled out structure to nfs4_set_client instead of
11 arguments that then get stashed into the structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-22 08:10:17 -04:00
Sergey Bashirov
7db6e66663 pNFS: Fix disk addr range check in block/scsi layout
At the end of the isect translation, disc_addr represents the physical
disk offset. Thus, end calculated from disk_addr is also a physical disk
offset. Therefore, range checking should be done using map->disk_offset,
not map->start.

Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250702133226.212537-1-sergeybashirov@gmail.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Sergey Bashirov
81438498a2 pNFS: Fix stripe mapping in block/scsi layout
Because of integer division, we need to carefully calculate the
disk offset. Consider the example below for a stripe of 6 volumes,
a chunk size of 4096, and an offset of 70000.

chunk = div_u64(offset, dev->chunk_size) = 70000 / 4096 = 17
offset = chunk * dev->chunk_size = 17 * 4096 = 69632
disk_offset_wrong = div_u64(offset, dev->nr_children) = 69632 / 6 = 11605
disk_chunk = div_u64(chunk, dev->nr_children) = 17 / 6 = 2
disk_offset = disk_chunk * dev->chunk_size = 2 * 4096 = 8192

Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250701122341.199112-1-sergeybashirov@gmail.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Sergey Bashirov
d897d81671 pNFS: Handle RPC size limit for layoutcommits
When there are too many block extents for a layoutcommit, they may not
all fit into the maximum-sized RPC. This patch allows the generic pnfs
code to properly handle -ENOSPC returned by the block/scsi layout driver
and trigger additional layoutcommits if necessary.

Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250630183537.196479-5-sergeybashirov@gmail.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Sergey Bashirov
66642bbee5 pNFS: Add prepare commit trace to block/scsi layout
Replace dprintk with trace event in ext_tree_prepare_commit() function.

Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250630183537.196479-4-sergeybashirov@gmail.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Sergey Bashirov
d84c4754f8 pNFS: Fix extent encoding in block/scsi layout
The ext_tree_encode_commit() function may be called multiple times for
the same file, layout, and last written byte if the provided buffer is
not large enough to encode all extents in it.

The first problem is that the last written byte field must be zeroed
only on a successful call, otherwise we will lose its actual value and
get an integer overflow on the next encoding attempt.

The second problem is that we can't count and encode in one pass. The
extent state changes during encoding, so if we return -ENOSPC but have
already encoded some extents into a small buffer, they will not be
re-encoded into a new larger buffer on the next try. As a result, the
client never commits these extents to the server.

Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250630183537.196479-3-sergeybashirov@gmail.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Sergey Bashirov
9768797c21 pNFS: Fix uninited ptr deref in block/scsi layout
The error occurs on the third attempt to encode extents. When function
ext_tree_prepare_commit() reallocates a larger buffer to retry encoding
extents, the "layoutupdate_pages" page array is initialized only after the
retry loop. But ext_tree_free_commitdata() is called on every iteration
and tries to put pages in the array, thus dereferencing uninitialized
pointers.

An additional problem is that there is no limit on the maximum possible
buffer_size. When there are too many extents, the client may create a
layoutcommit that is larger than the maximum possible RPC size accepted
by the server.

During testing, we observed two typical scenarios. First, one memory page
for extents is enough when we work with small files, append data to the
end of the file, or preallocate extents before writing. But when we fill
a new large file without preallocating, the number of extents can be huge,
and counting the number of written extents in ext_tree_encode_commit()
does not help much. Since this number increases even more between
unlocking and locking of ext_tree, the reallocated buffer may not be
large enough again and again.

Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250630183537.196479-2-sergeybashirov@gmail.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Dr. David Alan Gilbert
3b3bc9a1f7 NFS: Remove unused function nfs_umount
nfs_umount() has been unused since 2013's
commit 4580a92d44 ("NFS: Use server-recommended security flavor by
default (NFSv3)")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20250218215250.263709-1-linux@treblig.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Dr. David Alan Gilbert
48693d119b SUNRPC: Remove unused xdr functions
Remove a bunch of unused xdr_*decode* functions:
  The last use of xdr_decode_netobj() was removed in 2021 by:
commit 7cf96b6d01 ("lockd: Update the NLMv4 SHARE arguments decoder to
use struct xdr_stream")
  The last use of xdr_decode_string_inplace() was removed in 2021 by:
commit 3049e974a7 ("lockd: Update the NLMv4 FREE_ALL arguments decoder
to use struct xdr_stream")
  The last use of xdr_stream_decode_opaque() was removed in 2024 by:
commit fed8a17c61 ("xdrgen: typedefs should use the built-in string and
opaque functions")

  The functions xdr_stream_decode_string() and
xdr_stream_decode_opaque_dup() were both added in 2018 by the
commit 0e779aa703 ("SUNRPC: Add helpers for decoding opaque and string
types")
but never used.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20250712233006.403226-1-linux@treblig.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Christoph Hellwig
87268f7a4f nfs: create a kernel keyring
Create a kernel .nfs keyring similar to the nvme .nvme one.  Unlike for
a userspace-created keyrind, tlshd is a possesor of the keys with this
and thus the keys don't need user read permissions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20250515115107.33052-3-hch@lst.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:28 -07:00
Christoph Hellwig
90c9550a8d NFS: support the kernel keyring for TLS
Allow tlshd to use a per-mount key from the kernel keyring similar
to NVMe over TCP.

Note that tlshd expects keys and certificates stored in the kernel
keyring to be in DER format, not the PEM format used for file based keys
and certificates, so they need to be converted before they are added
to the keyring, which is a bit unexpected.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20250515115107.33052-2-hch@lst.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:27 -07:00
Trond Myklebust
72508db0fe NFS: Allow folio migration for the case of mode == MIGRATE_SYNC
When the mode is MIGRATE_SYNC, we are allowed to call nfs_wb_folio()
under the folio lock.

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:27 -07:00
Jeff Layton
b0b7cdc994 nfs: new tracepoint in match_stateid operation
Add new tracepoints in the NFSv4 match_stateid minorversion op that show
the info in both stateids.

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-4-540c9fb48da2@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:27 -07:00
Jeff Layton
5dd03d14b3 nfs: new tracepoint in nfs_delegation_need_return
Add a tracepoint in the function that decides whether to return a
delegation to the server.

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-3-540c9fb48da2@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:27 -07:00
Jeff Layton
0139a30ada nfs: add a tracepoint to nfs_inode_detach_delegation_locked
We have tracepoints for setting a delegation and reclaiming them. Add a
tracepoint for when the delegation is being detached from the inode.

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-2-540c9fb48da2@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:27 -07:00
Jeff Layton
8c206b0a12 nfs: add cache_validity to the nfs_inode_event tracepoints
Managing the cache_validity flags is the deep voodoo of NFS cache
coherency. Let's have a little extra visibility into that value via the
nfs_inode_event tracepoints.

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250618-nfs-tracepoints-v2-1-540c9fb48da2@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:27 -07:00
Anthony Iliopoulos
2c665d91c2 NFS: remove unused pnfs_ld_data field from struct nfs_server
The last code that was using this was removed via commit 20d655d619
("pnfs/blocklayout: use the device id cache") which was merged in
v3.18-rc1, so it can be removed completely.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Link: https://lore.kernel.org/r/20250613094439.82338-4-ailiop@suse.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:25 -07:00
Anthony Iliopoulos
74a33326cf NFS: remove unused time_delta field from struct nfs_server
The last code that was using this was removed via commit ca0daa277a
("NFS: Cache aggressively when file is open for writing") which was
merged in v4.8-rc1, so it can be removed completely.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Link: https://lore.kernel.org/r/20250613094439.82338-3-ailiop@suse.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:21 -07:00
Anthony Iliopoulos
0715a72ee9 NFS: remove unused wpages field from struct nfs_server
The wpages field is not serving any purpose since commit c63c7b0513
("NFS: Fix a race when doing NFS write coalescing") which was merged in
v2.6.22-rc1. Remove it completely.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Link: https://lore.kernel.org/r/20250613094439.82338-2-ailiop@suse.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:19 -07:00
Tigran Mkrtchyan
a9e2183720 pnfs: add pnfs_ds_connect trace point
This tracepoint aims to expose pnfs DS connect status

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/20250610151246.9147-1-tigran.mkrtchyan@desy.de
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:03 -07:00
NeilBrown
c1b0b9d79f nfs: use lock_two_nondirectories()
Rather than open-coding this function call it to make intention clear
and to use "correct" nesting levels (parent and child are for
directories).

This is purely cosmetic with no expected change in behaviour.

Signed-off-by: NeilBrown <neil@brown.name>
Link: https://lore.kernel.org/r/174942091741.608730.3327223511347232829@noble.neil.brown.name
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:02 -07:00
Trond Myklebust
4b54274147 NFS: Return the file btime in the statx results when appropriate
If the server supports the NFSv4.x "create_time" attribute, then return
it as part of the statx results.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/eae27d6467e08aaa67e0ac6ae7119263a0f83349.1748515333.git.bcodding@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:02 -07:00
Anne Marie Merritt
1c7ae2dd3f nfs: Add timecreate to nfs inode
Add tracking of the create time (a.k.a. btime) along with corresponding
bitfields, request, and decode xdr routines.

Signed-off-by: Anne Marie Merritt <annemarie.merritt@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/1e3677b0655fa2bbaba0817b41d111d94a06e5ee.1748515333.git.bcodding@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:02 -07:00
Trond Myklebust
ce60ab3964 Expand the type of nfs_fattr->valid
We need to be able to track more than 32 attributes per inode.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/1e3405fca54efd0be7c91c1da77917b94f5dfcc4.1748515333.git.bcodding@redhat.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-07-14 15:20:02 -07:00
Linus Torvalds
347e9f5043 Linux 6.16-rc6 v6.16-rc6 2025-07-13 14:25:58 -07:00
Linus Torvalds
3cd752194e Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Fixes for a few clk drivers and bindings:

 - Add a missing property to the Mediatek MT8188 clk binding to
   keep binding checks happy

 - Avoid an OOB by setting the correct number of parents in
   dispmix_csr_clk_dev_data

 - Allocate clk_hw structs early in probe to avoid an ordering
   issue where clk_parent_data points to an unallocated clk_hw
   when the child clk is registered before the parent clk in the
   SCMI clk driver

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  dt-bindings: clock: mediatek: Add #reset-cells property for MT8188
  clk: imx: Fix an out-of-bounds access in dispmix_csr_clk_dev_data
  clk: scmi: Handle case where child clocks are initialized before their parents
2025-07-13 11:37:35 -07:00
Linus Torvalds
5d5d62298b Merge tag 'x86_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Update Kirill's email address

 - Allow hugetlb PMD sharing only on 64-bit as it doesn't make a whole
   lotta sense on 32-bit

 - Add fixes for a misconfigured AMD Zen2 client which wasn't even
   supposed to run Linux

* tag 'x86_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Update Kirill Shutemov's email address for TDX
  x86/mm: Disable hugetlb page table sharing on 32-bit
  x86/CPU/AMD: Disable INVLPGB on Zen2
  x86/rdrand: Disable RDSEED on AMD Cyan Skillfish
2025-07-13 10:41:19 -07:00
Linus Torvalds
41998eeb29 Merge tag 'irq_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:

 - Fix a case of recursive locking in the MSI code

 - Fix a randconfig build failure in armada-370-xp irqchip

* tag 'irq_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/irq-msi-lib: Fix build with PCI disabled
  PCI/MSI: Prevent recursive locking in pci_msix_write_tph_tag()
2025-07-13 10:36:55 -07:00
Linus Torvalds
0a197b7576 Merge tag 'perf_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:

 - Prevent perf_sigtrap() from observing an exiting task and warning
   about it

* tag 'perf_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix WARN in perf_sigtrap()
2025-07-13 10:34:47 -07:00
Linus Torvalds
3f31a806a6 Merge tag 'mm-hotfixes-stable-2025-07-11-16-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "19 hotfixes. A whopping 16 are cc:stable and the remainder address
  post-6.15 issues or aren't considered necessary for -stable kernels.

  14 are for MM.  Three gdb-script fixes and a kallsyms build fix"

* tag 'mm-hotfixes-stable-2025-07-11-16-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  Revert "sched/numa: add statistics of numa balance task"
  mm: fix the inaccurate memory statistics issue for users
  mm/damon: fix divide by zero in damon_get_intervals_score()
  samples/damon: fix damon sample mtier for start failure
  samples/damon: fix damon sample wsse for start failure
  samples/damon: fix damon sample prcl for start failure
  kasan: remove kasan_find_vm_area() to prevent possible deadlock
  scripts: gdb: vfs: support external dentry names
  mm/migrate: fix do_pages_stat in compat mode
  mm/damon/core: handle damon_call_control as normal under kdmond deactivation
  mm/rmap: fix potential out-of-bounds page table access during batched unmap
  mm/hugetlb: don't crash when allocating a folio if there are no resv
  scripts/gdb: de-reference per-CPU MCE interrupts
  scripts/gdb: fix interrupts.py after maple tree conversion
  maple_tree: fix mt_destroy_walk() on root leaf node
  mm/vmalloc: leave lazy MMU mode on PTE mapping error
  scripts/gdb: fix interrupts display after MCP on x86
  lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users()
  kallsyms: fix build without execinfo
2025-07-12 10:30:47 -07:00