Commit Graph

1368207 Commits

Author SHA1 Message Date
Dave Airlie
58ce2aec57 Merge tag 'drm-intel-next-fixes-2025-05-28' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
- Fix the enabling/disabling of DP audio SDP splitting

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://lore.kernel.org/r/aDaztAmV_erxo1Am@jlahtine-mobl
2025-05-29 07:36:18 +10:00
Tigran Mkrtchyan
e3e3775392 flexfiles/pNFS: update stats on NFS4ERR_DELAY for v4.1 DSes
On NFS4ERR_DELAY the NFS client updates its stats, but fails to do so for
flexfiles v4.1 DSes.

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
NeilBrown
c25a89770d nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer
Instead of calling xchg() and unrcu_pointer() before
nfsd_file_put_local(), we now pass pointer to the __rcu pointer and call
xchg() and unrcu_pointer() inside that function.

Where unrcu_pointer() is currently called the internals of "struct
nfsd_file" are not known and that causes older compilers such as gcc-8
to complain.

In some cases we have a __kernel (aka normal) pointer not an __rcu
pointer so we need to cast it to __rcu first.  This is strictly a
weakening so no information is lost.  Somewhat surprisingly, this cast
is accepted by gcc-8.

This has the pleasing result that the cmpxchg() which sets ro_file and
rw_file, and also the xchg() which clears them, are both now in the nfsd
code.

Reported-by: Pali Rohár <pali@kernel.org>
Reported-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Fixes: 86e0041225 ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
NeilBrown
21fb440346 nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()
nfs_uuid_put() and nfs_close_local_fh() can race if a "struct
nfs_file_localio" is released at the same time that nfsd calls
nfs_localio_invalidate_clients().

It is important that neither of these functions completes after the
other has started looking at a given nfs_file_localio and before it
finishes.

If nfs_uuid_put() exits while nfs_close_local_fh() is closing ro_file
and rw_file it could return to __nfsd_file_cache_purge() while some files
are still referenced so the purge may not succeed.

If nfs_close_local_fh() exits while nfs_uuid_put() is still closing the
files then the "struct nfs_file_localio" could be freed while
nfs_uuid_put() is still looking at it.  This side is currently handled
by copying the pointers out of ro_file and rw_file before deleting from
the list in nfsd_uuid.  We need to preserve this while ensuring that
nfs_uuid_put() does wait for nfs_close_local_fh().

This patch uses nfl->uuid and nfl->list to provide the required
interlock.

nfs_uuid_put() removes the nfs_file_localio from the list, then drops
locks and puts the two files, then reclaims the spinlock and sets
->nfs_uuid to NULL.

nfs_close_local_fh() operates in the reverse order, setting ->nfs_uuid
to NULL, then closing the files, then unlinking from the list.

If nfs_uuid_put() finds that ->nfs_uuid is already NULL, it waits for
the nfs_file_localio to be removed from the list.  If
nfs_close_local_fh() finds that it has already been unlinked it waits for
->nfs_uuid to become NULL.  This ensures that one of the two tries to
close the files, but each waits for the other.

As nfs_uuid_put() is making the list empty, change from a
list_for_each_safe loop to a while that always takes the first entry.
This makes the intent more clear.
Also don't move the list to a temporary local list as this would defeat
the guarantees required for the interlock.

Fixes: 86e0041225 ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
NeilBrown
74fc55ab2a nfs_localio: duplicate nfs_close_local_fh()
nfs_close_local_fh() is called from two different places for quite
different use cases.

It is called from nfs_uuid_put() when the nfs_uuid is being detached -
possibly because the nfs server is no longer serving that filesystem.
In this case there will always be an nfs_uuid and so rcu_read_lock() is
not needed.

It is also called when the nfs_file_localio is no longer needed.  In
this case there may not be an active nfs_uuid.

These two can race, and handling the race properly while avoiding
excessive locking will require different handling on each side.

This patch prepares the way by opencoding nfs_close_local_fh() into
nfs_uuid_put(), then simplifying the code there as befits the context.

Fixes: 86e0041225 ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
NeilBrown
e6f7e1487a nfs_localio: simplify interface to nfsd for getting nfsd_file
The nfsd_localio_operations structure contains nfsd_file_get() to get a
reference to an nfsd_file.  This is only used in one place, where
nfsd_open_local_fh() is also used.

This patch combines the two, calling nfsd_open_local_fh() passing a
pointer to where the nfsd_file pointer might be stored.  If there is a
pointer there and nfsd_file_get() can get a reference, that reference is
returned.  If not, a new nfsd_file is acquired, stored at the pointer,
and returned.  When we store a reference we also increase the refcount
on the net, as that refcount is decremented when we clear the stored
pointer.

We now get an extra reference *before* storing the new nfsd_file at the
given location.  This avoids possible races with the nfsd_file being
freed before the final reference can be taken.

This patch moves the rcu_dereference() needed after fetching from
ro_file or rw_file into the nfsd code where the 'struct nfs_file' is
fully defined.  This avoids an error reported by older versions of gcc
such as gcc-8 which complain about rcu_dereference() use in contexts
where the structure (which will supposedly be accessed) is not fully
defined.

Reported-by: Pali Rohár <pali@kernel.org>
Reported-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Fixes: 86e0041225 ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
NeilBrown
77e82fb2c6 nfs_localio: always hold nfsd net ref with nfsd_file ref
Having separate nfsd_file_put and nfsd_file_put_local in struct
nfsd_localio_operations doesn't make much sense.  The difference is that
nfsd_file_put doesn't drop a reference to the nfs_net which is what
keeps nfsd from shutting down.

Currently, if nfsd tries to shut down it will invalidate the files stored
in the list from the nfs_uuid and this will drop all references to the
nfsd net that the client holds.  But the client could still hold some
references to nfsd_files for active IO.  So nfsd might think it has
completely shut down local IO, but hasn't and has no way to wait for
those active IO requests to complete.

So this patch changes nfsd_file_get to nfsd_file_get_local and has it
increase the ref count on the nfsd net and it replaces all calls to
->nfsd_put_file to ->nfsd_put_file_local.

It also changes ->nfsd_open_local_fh to return with the refcount on the
net elevated precisely when a valid nfsd_file is returned.

This means that whenever the client holds a valid nfsd_file, there will
be an associated count on the nfsd net, and so the count can only reach
zero when all nfsd_files have been returned.

nfs_local_file_put() is changed to call nfs_to_nfsd_file_put_local()
instead of replacing calls to one with calls to the other because this
will help a later patch which changes nfs_to_nfsd_file_put_local() to
take an __rcu pointer while nfs_local_file_put() doesn't.

Fixes: 86e0041225 ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
NeilBrown
ed9be31733 nfs_localio: use cmpxchg() to install new nfs_file_localio
Rather than using nfs_uuid.lock to protect installing
a new ro_file or rw_file, change to use cmpxchg().
Removing the file already uses xchg() so this improves symmetry
and also makes the code a little simpler.
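
The install/clear symmetry described above can be sketched in userspace with C11 atomics. This is only an illustration, not the kernel code: struct file_slot, slot_install() and slot_clear() are hypothetical names, and atomic_compare_exchange_strong() and atomic_exchange() stand in for the kernel's cmpxchg() and xchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached ->ro_file or ->rw_file pointer. */
struct file_slot {
	_Atomic(void *) file;
};

/*
 * Install 'new' only if the slot is still empty. Returns the pointer
 * that ends up stored: 'new' if we won, the other one if we lost.
 */
static void *slot_install(struct file_slot *slot, void *new)
{
	void *expected = NULL;

	if (atomic_compare_exchange_strong(&slot->file, &expected, new))
		return new;
	return expected; /* a racing installer got there first */
}

/* Clear the slot and hand back whatever was stored there. */
static void *slot_clear(struct file_slot *slot)
{
	return atomic_exchange(&slot->file, (void *)NULL);
}

/* Single-threaded walk through the install/lose/clear cases. */
static void slot_self_check(void)
{
	struct file_slot s;
	int a, b;

	atomic_init(&s.file, (void *)NULL);
	assert(slot_install(&s, &a) == &a); /* empty slot: we win */
	assert(slot_install(&s, &b) == &a); /* occupied: the winner is returned */
	assert(slot_clear(&s) == &a);
	assert(slot_clear(&s) == NULL);
}
```

Neither path takes a lock, and a caller that loses the install race learns which pointer won, so it can drop its own copy.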

Also remove the optimisation of not taking the lock, and not removing
the nfs_file_localio from the linked list, when both ->ro_file and
->rw_file are already NULL.  Given that ->nfs_uuid was not NULL, it is
extremely unlikely that both ->ro_file and ->rw_file are NULL, so
this optimisation can be of little value and it complicates
understanding of the code - why can the list_del_init() be skipped?

Finally, move the assignment of NULL to ->nfs_uuid until after
the last action on the nfs_file_localio (the list_del_init).  As soon as
this is NULL a racing nfs_close_local_fh() can bypass all the locking
and go on to free the nfs_file_localio, so we must be certain to be
finished with it first.

Fixes: 86e0041225 ("nfs: cache all open LOCALIO nfsd_file(s) in client")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
Chuck Lever
111f9e4b0d SUNRPC: Remove dead code from xs_tcp_tls_setup_socket()
xs_tcp_tls_finish_connecting() already marks the upper xprt
connected, so the same code in xs_tcp_tls_setup_socket() is
never executed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
Chuck Lever
0bd2f6b899 SUNRPC: Prevent hang on NFS mount with xprtsec=[m]tls
Engineers at Hammerspace noticed that sometimes mounting with
"xprtsec=tls" hangs for a minute or so, and then times out, even
when the NFS server is reachable and responsive.

kTLS shuts off data_ready callbacks if strp->msg_ready is set to
mitigate data_ready callbacks when a full TLS record is not yet
ready to be read from the socket.

Normally msg_ready is clear when the first TLS record arrives on
a socket. However, I observed that sometimes tls_setsockopt() sets
strp->msg_ready, and that prevents forward progress because
tls_data_ready() becomes a no-op.

Moreover, Jakub says: "If there's a full record queued at the time
when [tlshd] passes the socket back to the kernel, it's up to the
reader to read the already queued data out." So SunRPC cannot
expect a data_ready call when ingress data is already waiting.

Add an explicit poll after SunRPC's upper transport is set up to
pick up any data that arrived after the TLS handshake but before
transport set-up is complete.

Reported-by: Steve Sears <sjs@hammerspace.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 75eb6af7ac ("SUNRPC: Add a TCP-with-TLS RPC transport class")
Tested-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
NeilBrown
dd862da61e nfs: fix incorrect handling of large-number NFS errors in nfs4_do_mkdir()
A recent commit introduced nfs4_do_mkdir() which reports an error from
nfs4_call_sync() by returning it with ERR_PTR().

This is a problem as nfs4_call_sync() can return negative NFS-specific
errors with values larger than MAX_ERRNO (4095).  One example is
NFS4ERR_DELAY which has value 10008.

This "pointer" gets to PTR_ERR_OR_ZERO() in nfs4_proc_mkdir() which
chooses ZERO because it isn't in the range of valid errors.  Ultimately
the pointer is dereferenced.
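
The range problem can be reproduced in userspace. The sketch below re-creates the error-pointer convention with local helpers (is_err_value(), err_ptr() and ptr_err_or_zero() are stand-ins written for this example, not the kernel macros themselves) and assumes pointers and long have the same width, as on Linux.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO	4095
#define NFS4ERR_DELAY	10008	/* value quoted in the commit message */

/* Only the top MAX_ERRNO addresses are treated as encoded errors. */
static int is_err_value(uintptr_t x)
{
	return x >= (uintptr_t)-MAX_ERRNO;
}

/* Encode a negative error code as a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode: the error code if the pointer is in the error window, else 0. */
static long ptr_err_or_zero(const void *ptr)
{
	if (is_err_value((uintptr_t)ptr))
		return (long)(intptr_t)ptr;
	return 0;
}

static void err_ptr_self_check(void)
{
	/* An ordinary errno such as EIO (5) survives the round trip. */
	assert(ptr_err_or_zero(err_ptr(-5)) == -5);
	/* -NFS4ERR_DELAY falls outside the window and reads as "no error". */
	assert(ptr_err_or_zero(err_ptr(-NFS4ERR_DELAY)) == 0);
}
```

With the error silently decoded as zero, the caller treats the bogus value as a valid pointer, which is how it ends up being dereferenced.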

This patch changes nfs4_do_mkdir() to report the dentry pointer and
status separately - pointer as a return value, status in an "int *"
parameter.

The same separation is used for _nfs4_proc_mkdir() and the two are
combined only in nfs4_proc_mkdir() after the status has passed through
nfs4_handle_exception(), which ensures the error code does not exceed
MAX_ERRNO.

It also fixes a problem in that even when nfs4_handle_exception() updated
the error value, the original 'alias' was still returned.

Reported-by: Anna Schumaker <anna@kernel.org>
Fixes: 8376583b84 ("nfs: change mkdir inode_operation to return alternate dentry if needed.")
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
Li Lingfeng
80c4de6ab4 nfs: ignore SB_RDONLY when remounting nfs
In some scenarios, when mounting NFS, more than one superblock may be
created. The final superblock used is the last one created, but only the
first superblock carries the ro flag passed from user space. If a ro flag
is added to the superblock via remount, it will trigger the issue
described in Link[1].

Link[2] attempted to address this by marking the superblock as ro during
the initial mount. However, this introduced a new problem in scenarios
where multiple mount points share the same superblock:
[root@a ~]# mount /dev/sdb /mnt/sdb
[root@a ~]# echo "/mnt/sdb *(rw,no_root_squash)" > /etc/exports
[root@a ~]# echo "/mnt/sdb/test_dir2 *(ro,no_root_squash)" >> /etc/exports
[root@a ~]# systemctl restart nfs-server
[root@a ~]# mount -t nfs -o rw 127.0.0.1:/mnt/sdb/test_dir1 /mnt/test_mp1
[root@a ~]# mount | grep nfs4
127.0.0.1:/mnt/sdb/test_dir1 on /mnt/test_mp1 type nfs4 (rw,relatime,...
[root@a ~]# mount -t nfs -o ro 127.0.0.1:/mnt/sdb/test_dir2 /mnt/test_mp2
[root@a ~]# mount | grep nfs4
127.0.0.1:/mnt/sdb/test_dir1 on /mnt/test_mp1 type nfs4 (ro,relatime,...
127.0.0.1:/mnt/sdb/test_dir2 on /mnt/test_mp2 type nfs4 (ro,relatime,...
[root@a ~]#

When mounting the second NFS, the shared superblock is marked as ro,
causing the previous NFS mount to become read-only.

To resolve both issues, the ro flag is no longer applied to the superblock
during remount. Instead, the ro flag on the mount is used to control
whether the mount point is read-only.

Fixes: 281cad46b3 ("NFS: Create a submount rpc_op")
Link[1]: https://lore.kernel.org/all/20240604112636.236517-3-lilingfeng@huaweicloud.com/
Link[2]: https://lore.kernel.org/all/20241130035818.1459775-1-lilingfeng3@huawei.com/
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
Li Lingfeng
8cd9b78594 nfs: clear SB_RDONLY before getting superblock
As described in the link, commit 52cb7f8f17 ("nfs: ignore SB_RDONLY when
mounting nfs") removed the check for the ro flag when determining whether
to share the superblock, which caused issues when mounting different
subdirectories under the same export directory via NFSv3. However, this
change did not affect NFSv4.

For NFSv3:
1) A single superblock is created for the initial mount.
2) When mounted read-only, this superblock carries the SB_RDONLY flag.
3) Before commit 52cb7f8f17 ("nfs: ignore SB_RDONLY when mounting nfs"):
   Subsequent rw mounts would not share the existing ro superblock due to
   flag mismatch, creating a new superblock without SB_RDONLY.
   After the commit:
   The SB_RDONLY flag is ignored during superblock comparison, and this leads
   to sharing the existing superblock even for rw mounts.
   This ultimately results in write operations being rejected at the VFS layer.

For NFSv4:
1) Multiple superblocks are created and the last one will be kept.
2) The actually used superblock for ro mounts doesn't carry SB_RDONLY flag.
Therefore, commit 52cb7f8f17 doesn't affect NFSv4 mounts.

Clear SB_RDONLY before getting superblock when NFS_MOUNT_UNSHARED is not
set to fix it.

Fixes: 52cb7f8f17 ("nfs: ignore SB_RDONLY when mounting nfs")
Closes: https://lore.kernel.org/all/12d7ea53-1202-4e21-a7ef-431c94758ce5@app.fastmail.com/T/
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:14 -04:00
Mike Snitzer
1ff4716f42 NFS: always probe for LOCALIO support asynchronously
It was reported that NFS client mounts of AWS Elastic File System
(EFS) volumes are slow. This is because the AWS firewall disallows
LOCALIO (because it doesn't consider the use of NFS_LOCALIO_PROGRAM
valid), see: https://bugzilla.redhat.com/show_bug.cgi?id=2335129

Switch to performing the LOCALIO probe asynchronously to address the
potential for the NFS LOCALIO protocol being disallowed and/or slowed
by the remote server's response.

While at it, fix nfs_local_probe_async() to always take/put a
reference on the nfs_client that is using the LOCALIO protocol.
Also, unexport the nfs_local_probe() symbol and make it private to
fs/nfs/localio.c

This change has the side-effect of initially issuing reads, writes and
commits over the wire via SUNRPC until the LOCALIO probe completes.

Suggested-by: Jeff Layton <jlayton@kernel.org> # to always probe async
Fixes: 76d4cb6345 ("nfs: probe for LOCALIO when v4 client reconnects to server")
Cc: stable@vger.kernel.org # 6.14+
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Mike Snitzer
04a1526366 pnfs/flexfiles: connect to NFSv3 DS using TLS if MDS connection uses TLS
Implementation follows bones of the pattern that was established in
commit a35518cae4 ("NFSv4.1/pnfs: fix NFS with TLS in pnfs").

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Mike Snitzer
62d2cde203 NFS: add localio to sysfs
The Linux NFS client and server added support for LOCALIO in Linux
v6.12. It is useful to know whether a client and server negotiated the
use of LOCALIO, so expose it through the 'localio' attribute.

Suggested-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Christoph Hellwig
f72a67598c nfs: use writeback_iter directly
Stop using write_cache_pages and use writeback_iter directly.  This
removes an indirect call per written folio and makes the code easier
to follow.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Christoph Hellwig
66a4981350 nfs: refactor nfs_do_writepage
Use early returns wherever possible to simplify the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Christoph Hellwig
66beed5aca nfs: don't return AOP_WRITEPAGE_ACTIVATE from nfs_do_writepage
AOP_WRITEPAGE_ACTIVATE is a successful return that requires the caller to
unlock the folio.  Using it here requires special casing both in
nfs_do_writepage and nfs_writepages_callback and leaves a land mine in
nfs_wb_folio in case it ever sets the flag.  Remove it and just
unconditionally unlock in nfs_writepages_callback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Christoph Hellwig
b6354e60dd nfs: fold nfs_page_async_flush into nfs_do_writepage
Fold nfs_page_async_flush into its only caller to clean up the code a
bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Han Young
3a3065352f NFSv4: Always set NLINK even if the server doesn't support it
fattr4_numlinks is a recommended attribute, so the client should emulate
it even if the server doesn't support it. In the decode_attr_nlink function
in nfs4xdr.c, nlink is initialized to 1. However, this default value
isn't set on the inode due to the check in nfs_fhget.

So if the server doesn't support numlinks, the inode's nlink will be zero
and the mount will fail with error "Stale file handle". Set the nlink to 1
if the server doesn't support it.

Signed-off-by: Han Young <hanyang.tony@bytedance.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Benjamin Coddington
77be29b7a3 NFSv4: Allow FREE_STATEID to clean up delegations
The NFS client's list of delegations can grow quite large (well beyond the
delegation watermark) if the server is revoking or there are repeated
events that expire state.  Once this happens, the revoked delegations can
cause a performance problem for subsequent walks of the
servers->delegations list when the client tries to test and free state.

If we can determine that the FREE_STATEID operation has completed without
error, we can prune the delegation from the list.

Since the NFS client combines TEST_STATEID with FREE_STATEID in its minor
version operations, there isn't an easy way to communicate success of
FREE_STATEID.  Rather than re-arrange quite a number of calling paths to
break out the separate procedures, let's signal the success of FREE_STATEID
by setting the stateid's type.

Set NFS4_FREED_STATEID_TYPE for stateids that have been successfully
discarded from the server, and use that type to signal that the delegation
can be cleaned up.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Scott Mayhew
4d4832ed13 NFSv4: Don't check for OPEN feature support in v4.1
fattr4_open_arguments is a v4.2 recommended attribute, so we shouldn't
be sending it to v4.1 servers.

Fixes: cb78f9b7d0 ("nfs: fix the fetch of FATTR4_OPEN_ARGUMENTS")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org # 6.11+
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Olga Kornievskaia
243fea1346 NFSv4.2: fix listxattr to return selinux security label
Currently, when NFS is queried for all the labels present on the
file via a command such as "getfattr -d -m . /mnt/testfile", it
does not return the security label. Yet when asked specifically for
the label (getfattr -n security.selinux) it will be returned.
Include the security label when all attributes are queried.

Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Sagi Grimberg
aba41e90aa NFSv4.2: fix setattr caching of TIME_[MODIFY|ACCESS]_SET when timestamps are delegated
nfs_setattr will flush all pending writes before updating a file's time
attributes. However, when the client holds delegated timestamps, it can
update its timestamps locally as it is the authority for the file's
time attributes. The client will later set the file attributes by
adding a setattr to the delegreturn compound, updating the server's time
attributes.

Fix nfs_setattr to avoid flushing pending writes when the file time
attributes are delegated and the mtime/atime are set to a fixed
timestamp (ATTR_[MODIFY|ACCESS]_SET). Also, when sending the setattr
procedure over the wire, we need to clear the correct attribute bits
from the bitmask.

I was able to measure a noticeable speedup when measuring untar performance.
Test: $ time tar xzf ~/dir.tgz
Baseline: 1m13.072s
Patched: 0m49.038s

That is more than a 30% latency improvement.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Anna Schumaker
d2e1d783f2 NFS: Add support for fallocate(FALLOC_FL_ZERO_RANGE)
This implements a suggestion from Trond that we can mimic
FALLOC_FL_ZERO_RANGE by sending a compound that first does a DEALLOCATE
to punch a hole in a file, and then an ALLOCATE to fill the hole with
zeroes. There might technically be a race here, but once the DEALLOCATE
finishes any reads from the region would return zeroes anyway, so I
don't expect it to cause problems.
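
From an application's point of view, the new support is reached through the ordinary fallocate(2) call. The sketch below shows a caller's side only, not the NFS client implementation: zero_range() and zero_range_demo() are hypothetical helpers, the scratch path under /tmp is an assumption, and the write-zeroes fallback is an assumption added so the example also works on filesystems that reject FALLOC_FL_ZERO_RANGE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Zero the byte range [off, off + len) of fd, leaving the data around
 * it untouched. Returns 0 on success, -1 on failure.
 */
static int zero_range(int fd, off_t off, off_t len)
{
	char zeros[4096];

	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
		return 0;

	/* Assumed fallback: the request was rejected, so write zeroes by hand. */
	memset(zeros, 0, sizeof(zeros));
	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(zeros) ? (size_t)len : sizeof(zeros);
		ssize_t n = pwrite(fd, zeros, chunk, off);

		if (n <= 0)
			return -1;
		off += n;
		len -= n;
	}
	return 0;
}

/* Round trip on a scratch file: only the middle four bytes become zero. */
static void zero_range_demo(void)
{
	char path[] = "/tmp/zero_range_XXXXXX";
	char buf[8];
	int fd = mkstemp(path);
	ssize_t n;
	int ret;

	assert(fd >= 0);
	n = write(fd, "AAAAAAAA", 8);
	assert(n == 8);
	ret = zero_range(fd, 2, 4);
	assert(ret == 0);
	n = pread(fd, buf, 8, 0);
	assert(n == 8);
	assert(memcmp(buf, "AA\0\0\0\0AA", 8) == 0);
	close(fd);
	unlink(path);
}
```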

Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Max Kellermann
4c10fa44bc fs/nfs/read: fix double-unlock bug in nfs_return_empty_folio()
Sometimes, when a file was read while it was being truncated by
another NFS client, the kernel could deadlock because folio_unlock()
was called twice, and the second call would XOR back the `PG_locked`
flag.

Most of the time (depending on the timing of the truncation), nobody
notices the problem because folio_unlock() gets called three times,
which flips `PG_locked` back off:

 1. vfs_read, nfs_read_folio, ... nfs_read_add_folio,
    nfs_return_empty_folio
 2. vfs_read, nfs_read_folio, ... netfs_read_collection,
    netfs_unlock_abandoned_read_pages
 3. vfs_read, ... nfs_do_read_folio, nfs_read_add_folio,
    nfs_return_empty_folio

The problem is that nfs_read_add_folio() is not supposed to unlock the
folio if fscache is enabled, and a nfs_netfs_folio_unlock() check is
missing in nfs_return_empty_folio().
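
The parity effect described above can be modelled with a toy flag word. This is a deliberately simplified sketch: struct toy_folio and the toy_* helpers are hypothetical, and the only property carried over from the description is that an unlock flips the lock bit instead of clearing it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PG_LOCKED 0x1u

/* Hypothetical miniature of a folio: just a flags word. */
struct toy_folio {
	unsigned int flags;
};

static void toy_lock(struct toy_folio *f)
{
	f->flags |= TOY_PG_LOCKED;
}

/* Model of an unlock that XORs the bit rather than clearing it. */
static void toy_unlock(struct toy_folio *f)
{
	f->flags ^= TOY_PG_LOCKED;
}

static bool toy_is_locked(const struct toy_folio *f)
{
	return (f->flags & TOY_PG_LOCKED) != 0;
}

static void toy_folio_self_check(void)
{
	struct toy_folio f = { 0 };

	toy_lock(&f);
	toy_unlock(&f);			/* the legitimate unlock */
	assert(!toy_is_locked(&f));
	toy_unlock(&f);			/* second unlock sets the bit again */
	assert(toy_is_locked(&f));	/* looks locked, but nobody holds it */
	toy_unlock(&f);			/* a third unlock hides the bug */
	assert(!toy_is_locked(&f));
}
```

An even number of unlocks leaves the bit set with no owner, which matches the stuck waiters described in the message.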

Rarely this leads to a warning in netfs_read_collection():

 ------------[ cut here ]------------
 R=0000031c: folio 10 is not locked
 WARNING: CPU: 0 PID: 29 at fs/netfs/read_collect.c:133 netfs_read_collection+0x7c0/0xf00
 [...]
 Workqueue: events_unbound netfs_read_collection_worker
 RIP: 0010:netfs_read_collection+0x7c0/0xf00
 [...]
 Call Trace:
  <TASK>
  netfs_read_collection_worker+0x67/0x80
  process_one_work+0x12e/0x2c0
  worker_thread+0x295/0x3a0

Most of the time, however, processes just get stuck forever in
folio_wait_bit_common(), waiting for `PG_locked` to disappear, which
never happens because nobody is really holding the folio lock.

Fixes: 000dbe0bec ("NFS: Convert buffered read paths to use netfs when fscache is enabled")
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28 17:17:13 -04:00
Jerome Brunet
793908d60b PCI: endpoint: Retain fixed-size BAR size as well as aligned size
When allocating space for an endpoint function on a BAR with a fixed size,
the size saved in 'struct pci_epf_bar.size' should be the fixed size as
expected by pci_epc_set_bar().

However, if pci_epf_alloc_space() increased the allocation size to
accommodate iATU alignment requirements, it previously saved the larger
aligned size in .size, which broke pci_epc_set_bar().

To solve this, keep the fixed BAR size in .size and save the aligned size
in a new .aligned_size for use when deallocating it.
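
A minimal sketch of the bookkeeping, assuming a power-of-two alignment. The struct and helpers here (toy_epf_bar, align_up(), toy_alloc_space()) are hypothetical illustrations, not the pci_epf API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the two sizes tracked per BAR. */
struct toy_epf_bar {
	size_t size;		/* what the BAR advertises: the fixed size */
	size_t aligned_size;	/* what was actually allocated */
};

/* Round n up to a multiple of align; align must be a power of two. */
static size_t align_up(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/* Record both sizes; an align of 0 means "no alignment requirement". */
static void toy_alloc_space(struct toy_epf_bar *bar, size_t fixed_size,
			    size_t align)
{
	bar->size = fixed_size;
	bar->aligned_size = align ? align_up(fixed_size, align) : fixed_size;
}

static void toy_bar_self_check(void)
{
	struct toy_epf_bar bar;

	/* A 4 KiB fixed BAR with a 64 KiB alignment requirement. */
	toy_alloc_space(&bar, 4096, 65536);
	assert(bar.size == 4096);		/* used when programming the BAR */
	assert(bar.aligned_size == 65536);	/* used when freeing the buffer */
}
```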

Fixes: 2a9a801620 ("PCI: endpoint: Add support to specify alignment for buffers allocated to BARs")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
[mani: commit message fixup]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: more specific subject, commit log, wrap comment to match file]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250424-pci-ep-size-alignment-v5-1-2d4ec2af23f5@baylibre.com
2025-05-28 16:15:40 -05:00
Linus Torvalds
1cbf99e47f Merge tag 'jfs-6.16' of github.com:kleikamp/linux-shaggy
Pull jfs updates from David Kleikamp:
 "A few small fixes for jfs"

* tag 'jfs-6.16' of github.com:kleikamp/linux-shaggy:
  jfs: fix array-index-out-of-bounds read in add_missing_indices
  jfs: Fix null-ptr-deref in jfs_ioc_trim
  jfs: validate AG parameters in dbMount() to prevent crashes
2025-05-28 13:36:38 -07:00
Pan Taixi
2fbdb6d8e0 tracing: Fix compilation warning on arm32
On arm32, size_t is defined to be unsigned int, while PAGE_SIZE is
unsigned long. This hence triggers a compilation warning as min()
asserts the type of two operands to be equal. Casting PAGE_SIZE to size_t
solves this issue and works on other target architectures as well.
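
The effect of the cast can be sketched with a strict min() written for this example. STRICT_MIN and SAME_TYPE are hypothetical macros built on GCC/Clang builtins, not the kernel's min(), and TOY_PAGE_SIZE stands in for PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's PAGE_SIZE, which has type unsigned long. */
#define TOY_PAGE_SIZE 4096UL

/* 1 if the two expressions have compatible types (GCC/Clang builtin). */
#define SAME_TYPE(x, y) \
	__builtin_types_compatible_p(__typeof__(x), __typeof__(y))

/* A strict min(): refuses to compile when the operand types differ. */
#define STRICT_MIN(x, y) \
	(sizeof(char[SAME_TYPE(x, y) ? 1 : -1]) ? ((x) < (y) ? (x) : (y)) : 0)

/*
 * Without the cast this only builds where size_t happens to be
 * unsigned long; with it, both operands are size_t everywhere.
 */
static size_t clamp_to_page(size_t used)
{
	return STRICT_MIN(used, (size_t)TOY_PAGE_SIZE);
}

static void strict_min_self_check(void)
{
	assert(clamp_to_page(100) == 100);
	assert(clamp_to_page(10000) == 4096);
}
```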

Compilation warning details:

kernel/trace/trace.c: In function 'tracing_splice_read_pipe':
./include/linux/minmax.h:20:28: warning: comparison of distinct pointer types lacks a cast
  (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
                            ^
./include/linux/minmax.h:26:4: note: in expansion of macro '__typecheck'
   (__typecheck(x, y) && __no_side_effects(x, y))
    ^~~~~~~~~~~

...

kernel/trace/trace.c:6771:8: note: in expansion of macro 'min'
        min((size_t)trace_seq_used(&iter->seq),
        ^~~

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250526013731.1198030-1-pantaixi@huaweicloud.com
Fixes: f5178c41bb ("tracing: Fix oob write in trace_seq_to_buffer()")
Reviewed-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Pan Taixi <pantaixi@huaweicloud.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-28 16:10:43 -04:00
Aurabindo Pillai
d78eb800f8 drm/amd/display: Add some missing register headers for DCN401
Add some HDCP related register headers for future use.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-28 16:01:50 -04:00
Pratap Nirujogi
3e9d9df850 drm/amd/amdgpu: Add GPIO resources required for amdisp
ISP is a child device of GFX, and its device-specific information
is not available in ACPI. Add the 2 GPIO resources required for
ISP_v4_1_1 in the amdgpu_isp driver.

- GPIO 0 to allow sensor driver to enable and disable sensor module.
- GPIO 85 to allow ISP driver to enable and disable ISP RGB streaming mode.

Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-28 16:01:39 -04:00
Jens Axboe
2c7f023219 io_uring/net: only consider msg_inq if larger than 1
Currently retry and general validity of msg_inq is gated on it being
larger than zero, but it's entirely possible for this to be slightly
inaccurate. In particular, if FIN is received, it'll return 1.

Just use larger than 1 as the check. This covers both the FIN case, and
at the same time, it doesn't make much sense to retry a recv immediately
if there's even just a single byte of valid data in the socket.

Leave the SOCK_NONEMPTY flagging when larger than 0 still, as an app may
use that for the final receive.
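
The two thresholds can be written down as a pair of predicates. This is a sketch of the rule as described, not the io_uring code; inq_worth_retry() and inq_flag_nonempty() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: is msg_inq large enough to justify a retry? */
static bool inq_worth_retry(int msg_inq)
{
	return msg_inq > 1;
}

/* Hypothetical helper: should the "socket non-empty" flag be reported? */
static bool inq_flag_nonempty(int msg_inq)
{
	return msg_inq > 0;
}

static void inq_self_check(void)
{
	/* A pending FIN shows up as 1: flag it, but don't retry. */
	assert(inq_flag_nonempty(1));
	assert(!inq_worth_retry(1));
	/* Two or more bytes: both apply. */
	assert(inq_flag_nonempty(2));
	assert(inq_worth_retry(2));
}
```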

Cc: stable@vger.kernel.org
Reported-by: Christian Mazakas <christian.mazakas@gmail.com>
Fixes: 7c71a0af81 ("io_uring/net: improve recv bundles")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-28 13:50:19 -06:00
Linus Torvalds
b1fd8bd0cc Merge tag 'dlm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
 "This fixes delays when shutting down SCTP connections, and updates dlm
  Kconfig for SCTP"

* tag 'dlm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: drop SCTP Kconfig dependency
  dlm: reject SCTP configuration if not enabled
  dlm: use SHUT_RDWR for SCTP shutdown
  dlm: mask sk_shutdown value
2025-05-28 12:40:39 -07:00
Linus Torvalds
2c26b68cd5 Merge tag 'nfsd-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
 "The marquee feature for this release is that the limit on the maximum
  rsize and wsize has been raised to 4MB. The default remains at 1MB,
  but risk-seeking administrators now have the ability to try larger I/O
  sizes with NFS clients that support them. Eventually the default
  setting will be increased when we have confidence that this change
  will not have negative impact.

  With v6.16, NFSD now has its own debugfs file system where we can add
  experimental features and make them available outside of our
  development community without impacting production deployments. The
  first experimental setting added is one that makes all NFS READ
  operations use vfs_iter_read() instead of the NFSD splice actor. The
  plan is to eventually retire the splice actor, as that will enable a
  number of new capabilities such as the use of struct bio_vec from the
  top to the bottom of the NFSD stack.

  Jeff Layton contributed a number of observability improvements. The
  use of dprintk() in a number of high-traffic code paths has been
  replaced with static trace points.

  This release sees the continuation of efforts to harden the NFSv4.2
  COPY operation. Soon, the restriction on async COPY operations can be
  lifted.

  Many thanks to the contributors, reviewers, testers, and bug reporters
  who participated during the v6.16 development cycle"

* tag 'nfsd-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (60 commits)
  xdrgen: Fix code generated for counted arrays
  SUNRPC: Bump the maximum payload size for the server
  NFSD: Add a "default" block size
  NFSD: Remove NFSSVC_MAXBLKSIZE_V2 macro
  NFSD: Remove NFSD_BUFSIZE
  sunrpc: Remove the RPCSVC_MAXPAGES macro
  svcrdma: Adjust the number of entries in svc_rdma_send_ctxt::sc_pages
  svcrdma: Adjust the number of entries in svc_rdma_recv_ctxt::rc_pages
  sunrpc: Adjust size of socket's receive page array dynamically
  SUNRPC: Remove svc_rqst :: rq_vec
  SUNRPC: Remove svc_fill_write_vector()
  NFSD: Use rqstp->rq_bvec in nfsd_iter_write()
  SUNRPC: Export xdr_buf_to_bvec()
  NFSD: De-duplicate the svc_fill_write_vector() call sites
  NFSD: Use rqstp->rq_bvec in nfsd_iter_read()
  sunrpc: Replace the rq_bvec array with dynamically-allocated memory
  sunrpc: Replace the rq_pages array with dynamically-allocated memory
  sunrpc: Remove backchannel check in svc_init_buffer()
  sunrpc: Add a helper to derive maxpages from sv_max_mesg
  svcrdma: Reduce the number of rdma_rw contexts per-QP
  ...
2025-05-28 12:21:12 -07:00
Linus Torvalds
d87d73895f Merge tag 'ext4_for_linus-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
 "New ext4 features and performance improvements:

   - Fast commit performance improvements

   - Multi-fsblock atomic write support for bigalloc file systems

   - Large folio support for regular files

  This last can result in really stupendous performance for the right
  workloads. For example, see [1] where the Kernel Test Robot reported
  over 37% improvement on a large sequential I/O workload.

  There are also the usual bug fixes and cleanups. Of note are cleanups
  of the extent status tree to fix potential races that could result in
  the extent status tree getting corrupted under heavy simultaneous
  allocation and deallocation to a single file"

Link: https://lore.kernel.org/all/202505161418.ec0d753f-lkp@intel.com/ [1]

* tag 'ext4_for_linus-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (52 commits)
  ext4: Add a WARN_ON_ONCE for querying LAST_IN_LEAF instead
  ext4: Simplify flags in ext4_map_query_blocks()
  ext4: Rename and document EXT4_EX_FILTER to EXT4_EX_QUERY_FILTER
  ext4: Simplify last in leaf check in ext4_map_query_blocks
  ext4: Unwritten to written conversion requires EXT4_EX_NOCACHE
  ext4: only dirty folios when data journaling regular files
  ext4: Add atomic block write documentation
  ext4: Enable support for ext4 multi-fsblock atomic write using bigalloc
  ext4: Add multi-fsblock atomic write support with bigalloc
  ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS
  ext4: Make ext4_meta_trans_blocks() non-static for later use
  ext4: Check if inode uses extents in ext4_inode_can_atomic_write()
  ext4: Document an edge case for overwrites
  jbd2: remove journal_t argument from jbd2_superblock_csum()
  jbd2: remove journal_t argument from jbd2_chksum()
  ext4: remove sb argument from ext4_superblock_csum()
  ext4: remove sbi argument from ext4_chksum()
  ext4: enable large folio for regular file
  ext4: make online defragmentation support large folios
  ext4: make the writeback path support large folios
  ...
2025-05-28 12:12:08 -07:00
Linus Torvalds
e9d7126536 Merge tag 'ntfs3_for_6.16' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs updates from Konstantin Komarov:
 "Added:
   - missing direct_IO in ntfs_aops_cmpr
   - handling of hdr_first_de() return value

  Fixed:
   - handling of InitializeFileRecordSegment operation.

  Removed:
   - ability to change compression on mounted volume
   - redundant NULL check"

* tag 'ntfs3_for_6.16' of https://github.com/Paragon-Software-Group/linux-ntfs3:
  fs/ntfs3: remove ability to change compression on mounted volume
  fs/ntfs3: Fix handling of InitializeFileRecordSegment
  fs/ntfs3: Add missing direct_IO in ntfs_aops_cmpr
  fs/ntfs3: handle hdr_first_de() return value
  fs/ntfs3: Drop redundant NULL check
2025-05-28 12:08:26 -07:00
Linus Torvalds
b04f9f8893 Merge tag 'for-linus-6.16-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs update from Mike Marshall:
 "Convert to use the new mount API.

  Code from Eric Sandeen at redhat that converts orangefs over to the
  new mount API"

* tag 'for-linus-6.16-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: Convert to use the new mount API
2025-05-28 12:05:30 -07:00
Linus Torvalds
c69d8e9de0 Merge tag 'exfat-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:

 - Fix xfstests generic/482 test failure

 - Fix double free in delayed_free

* tag 'exfat-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: do not clear volume dirty flag during sync
  exfat: fix double free in delayed_free
2025-05-28 12:02:04 -07:00
Linus Torvalds
a56baa2253 Merge tag 'for-6.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "A fixup to the xarray conversion sent in the main 6.16 batch. It was
  not included because it would cause rebase/refresh of like 80 patches,
  right before sending the early pull request last week.

  It's fixing a bug when zoned mode is enabled on btrfs so it's not
  affecting most people"

* tag 'for-6.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: don't drop a reference if btrfs_check_write_meta_pointer() fails
2025-05-28 11:59:25 -07:00
Anubhav Shelat
c7a48ea9b9 perf trace: Always print return value for syscalls returning a pid
The syscalls that were consistently observed were set_robust_list and
rseq. This is because perf cannot find their child process.

This change ensures that the return value is always printed.

Before:
     0.256 ( 0.001 ms): set_robust_list(head: 0x7f09c77dba20, len: 24)                        =
     0.259 ( 0.001 ms): rseq(rseq: 0x7f09c77dc0e0, rseq_len: 32, sig: 1392848979)             =
After:
     0.270 ( 0.002 ms): set_robust_list(head: 0x7f0bb14a6a20, len: 24)                        = 0
     0.273 ( 0.002 ms): rseq(rseq: 0x7f0bb14a70e0, rseq_len: 32, sig: 1392848979)             = 0

Committer notes:

As discussed in the thread in the Link: tag below, these two don't
return a pid, but for syscalls returning one, we need to print the
result and if we manage to find the children in 'perf trace' data
structures, then print its name as well.

Fixes: 11c8e39f51 ("perf trace: Infrastructure to show COMM strings for syscalls returning PIDs")
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403160411.159238-2-ashelat@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 15:39:29 -03:00
Michal Wajdeczko
2cb38bb0ad drm/xe: Allow to trigger GT resets using debugfs writes
Today we allow triggering GT resets by reading the dedicated debugfs
files "force_reset" and "force_reset_sync", which we expose
using drm_info_list[] and drm_debugfs_create_files().

To avoid triggering potentially disruptive actions during otherwise
"safe" read operations, expose those two attributes using a debugfs
function where we can specify file permissions and provide a custom
"write" handler to trigger the GT resets also from there.

This step would allow us to drop triggering GT resets during read
operations, which we leave just to give users more time to switch.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250519200914.216-1-michal.wajdeczko@intel.com
2025-05-28 20:13:18 +02:00
meowmeowbeanz
df7996076b ASoC: amd: yc: Add support for Lenovo Yoga 7 16ARP8
Add DMI quirk entry for Lenovo Yoga 7 16ARP8 (83BS) to enable
digital microphone support via ACP driver.

Fixes microphone detection on this specific model which was
previously falling back to non-functional generic audio paths.

Tested-by: meowmeowbeanz <meowmeowbeanz@gmx.com>
Signed-off-by: meowmeowbeanz <meowmeowbeanz@gmx.com>
Link: https://patch.msgid.link/20250528-yoga-7-16arp8-microphone-fix-v1-1-bfeed2ecd0c2@gmx.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-28 19:11:18 +01:00
Leo Yan
e8718f9866 perf script: Print PERF_AUX_FLAG_COLLISION flag
Print out the collision flag for AUX trace data. This is helpful for
inspecting sample collisions.

After:

  0x217b60@/data_nvme1n1/niayan01/upstream/perf.data [0x40]: event: 11
  .
  . ... raw event: size 64 bytes
  .  0000:  0b 00 00 00 00 00 40 00 d2 ef 3f 00 00 00 00 00  ......@...?.....
  .  0010:  ff 0f 00 00 00 00 00 00 08 00 00 00 00 00 00 00  ................
  .  0020:  1c 01 00 00 1c 01 00 00 10 bf 38 d6 11 01 00 00  ..........8.....
  .  0030:  03 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00  ................

  3 1176120114960 0x217b60 [0x40]: PERF_RECORD_AUX offset: 0x3fefd2 size: 0xfff flags: 0x8 [C]

The added character '[C]' indicates the collision.
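
As a rough illustration of the output above, the first 32 bytes of the raw
event dump can be decoded by hand. This is only a sketch, not the perf
implementation: it assumes the little-endian PERF_RECORD_AUX layout (event
header, then aux_offset, aux_size and flags as 64-bit fields) and the
PERF_AUX_FLAG_* bit values from the perf_event uapi header. Only the 'C'
letter is confirmed by this commit message; the letters used for the other
flags, and the helper names, are assumptions made for the example.

```python
import struct

# Flag bits as defined in include/uapi/linux/perf_event.h.
PERF_AUX_FLAG_TRUNCATED = 0x01
PERF_AUX_FLAG_OVERWRITE = 0x02
PERF_AUX_FLAG_PARTIAL = 0x04
PERF_AUX_FLAG_COLLISION = 0x08

# 'C' comes from the commit message; the other letters are assumed here.
FLAG_LETTERS = [
    (PERF_AUX_FLAG_TRUNCATED, "T"),
    (PERF_AUX_FLAG_OVERWRITE, "O"),
    (PERF_AUX_FLAG_PARTIAL, "P"),
    (PERF_AUX_FLAG_COLLISION, "C"),
]

# First two rows (32 bytes) of the hex dump shown in the commit message.
RAW_EVENT = bytes.fromhex(
    "0b 00 00 00 00 00 40 00 d2 ef 3f 00 00 00 00 00"
    "ff 0f 00 00 00 00 00 00 08 00 00 00 00 00 00 00"
)


def aux_flag_letters(flags):
    """Return one letter per flag bit that is set in `flags`."""
    return "".join(letter for bit, letter in FLAG_LETTERS if flags & bit)


def parse_aux_record(raw):
    """Decode header (u32 type, u16 misc, u16 size) and three u64 fields."""
    ev_type, misc, size, aux_offset, aux_size, flags = struct.unpack_from(
        "<IHHQQQ", raw
    )
    return {
        "type": ev_type,
        "misc": misc,
        "size": size,
        "aux_offset": aux_offset,
        "aux_size": aux_size,
        "flags": flags,
    }


if __name__ == "__main__":
    rec = parse_aux_record(RAW_EVENT)
    print(
        "PERF_RECORD_AUX offset: %#x size: %#x flags: %#x [%s]"
        % (rec["aux_offset"], rec["aux_size"], rec["flags"],
           aux_flag_letters(rec["flags"]))
    )
```

The decoded fields match the values in the summary line of the dump: event
type 11, offset 0x3fefd2, size 0xfff, and flags 0x8 with only the collision
bit set.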

Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250528153519.188644-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 15:08:25 -03:00
Dr. David Alan Gilbert
55423e9c53 smb: client: Remove an unused function and variable
SMB2_QFS_info() has been unused since 2018's
commit 730928c8f4 ("cifs: update smb2_queryfs() to use compounding")

sign_CIFS_PDUs has been unused since 2009's
commit 2edd6c5b05 ("[CIFS] NTLMSSP support moving into new file, old dead
code removed")

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-28 12:42:40 -05:00
Namhyung Kim
0dad79cf81 perf mem: Show absolute percent in mem_stat output
Currently the output sums up to 100% for each entry.  But it can be
confusing when it's displayed with 'overhead'.

Before:

  $ perf mem report -F overhead,sample,cache,comm
  ...
  #                         -------------- Cache --------------
  # Overhead       Samples       L1     L2     L3 L1-buf  Other  Command
  # ........  ............  ...................................  ...............
  #
      25.38%           517    34.6%   0.0%  15.8%  23.3%  26.2%  swapper
       9.03%           239    35.4%   0.8%   9.1%  22.1%  32.6%  chrome
       8.61%           233    45.3%   1.2%   8.9%  22.7%  21.9%  Chrome_ChildIOT
       7.81%           189    33.6%   0.4%   5.5%  35.9%  24.6%  Isolated Web Co
       3.73%           103    40.4%   0.3%   2.7%  39.4%  17.2%  gnome-shell

Let's convert it to use absolute percent value so that it can add up to
the overhead for that entry.

After:
  #                         -------------- Cache --------------
  # Overhead       Samples       L1     L2     L3 L1-buf  Other  Command
  # ........  ............  ...................................  ...............
  #
      25.38%           517     8.8%   0.0%   4.0%   5.9%   6.7%  swapper
       9.03%           239     3.2%   0.1%   0.8%   2.0%   2.9%  chrome
       8.61%           233     3.9%   0.1%   0.8%   2.0%   1.9%  Chrome_ChildIOT
       7.81%           189     2.6%   0.0%   0.4%   2.8%   1.9%  Isolated Web Co
       3.73%           103     1.5%   0.0%   0.1%   1.5%   0.6%  gnome-shell

This aligns well with the existing 'mem' sort key.

  $ perf mem report -s comm,mem -H
  ...
  #
  #    Overhead       Samples  Command / Memory access
  # .........................  ..........................................
  #
      25.38%           517     swapper
          8.78%           150     L1 hit
          6.66%            72     RAM hit
          5.92%           137     LFB/MAB hit
          4.02%           157     L3 hit
          0.00%             1     L3 miss
       9.03%           239     chrome
          3.19%           117     L1 hit
          2.94%            35     RAM hit
          1.99%            48     LFB/MAB hit
          0.82%            32     L3 hit
          0.08%             5     L2 hit
          0.00%             2     L3 miss

We can add an option or a config to change the setting later.
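
The relationship between the two tables above can be checked with a little
arithmetic: each absolute value is roughly the entry's overhead multiplied by
the old per-entry percentage. The sketch below uses the swapper row from the
"Before" table. The function name is made up for illustration, and because
the inputs are already rounded, a recomputed value may differ from perf's
own output in the last digit.

```python
def to_absolute(overhead_pct, relative_pcts):
    """Scale per-entry percentages (summing to ~100) by the entry overhead."""
    return {
        name: overhead_pct * rel / 100.0
        for name, rel in relative_pcts.items()
    }


# swapper row from the "Before" table in the commit message.
SWAPPER_OVERHEAD = 25.38
SWAPPER_RELATIVE = {
    "L1": 34.6,
    "L2": 0.0,
    "L3": 15.8,
    "L1-buf": 23.3,
    "Other": 26.2,
}

if __name__ == "__main__":
    absolute = to_absolute(SWAPPER_OVERHEAD, SWAPPER_RELATIVE)
    for name, value in absolute.items():
        print("%-7s %5.1f%%" % (name, value))
    # The absolute columns now add up to (approximately) the overhead.
    print("sum     %5.2f%%" % sum(absolute.values()))
```

For example, L1 comes out as 25.38 * 34.6 / 100, which is about 8.8%, the
value shown for swapper in the "After" table.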

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250523222157.1259998-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:42:20 -03:00
Namhyung Kim
7a6710d015 perf mem: Display sort order only if it's available
IOW it's not used when -F option is used alone.  Let's make it
conditional to skip printing incorrect information.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250523222157.1259998-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:41:42 -03:00
Ravi Bangoria
00a23c000e perf mem: Describe overhead calculation in brief
Unlike perf-report which uses sample period for overhead calculation,
perf-mem overhead is calculated using sample weight. Describe perf-mem
overhead calculation method in its man page.
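
To make the distinction concrete, here is a minimal sketch of how the same
samples can produce different overhead figures depending on whether the
period or the weight is summed. The sample values, field names and helper
function are hypothetical and chosen only for illustration; they are not
taken from perf's data structures.

```python
from collections import defaultdict


def overhead_by_comm(samples, field):
    """Percent of the summed `field` attributed to each command."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample["comm"]] += sample[field]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {comm: 0.0 for comm in totals}
    return {comm: 100.0 * v / grand_total for comm, v in totals.items()}


# Hypothetical samples: equal periods, very different weights.
SAMPLES = [
    {"comm": "swapper", "period": 1000, "weight": 30},
    {"comm": "swapper", "period": 1000, "weight": 10},
    {"comm": "chrome", "period": 1000, "weight": 100},
    {"comm": "chrome", "period": 1000, "weight": 60},
]

if __name__ == "__main__":
    # Period-based, in the style described for perf-report.
    print(overhead_by_comm(SAMPLES, "period"))
    # Weight-based, in the style described for perf-mem.
    print(overhead_by_comm(SAMPLES, "weight"))
```

With these made-up numbers the period-based split is even, while the
weight-based split attributes most of the overhead to the command whose
samples carry the larger weights.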

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250523222157.1259998-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:41:10 -03:00
Paolo Bonzini
e9ba21fb5d Merge tag 'kvm-s390-next-6.16-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
* Fix interaction between some filesystems and Secure Execution
* Some cleanups and refactorings, preparing for an upcoming big series
2025-05-28 13:21:16 -04:00
Dapeng Mi
a4a859eb67 perf record: Fix incorrect --user-regs comments
The comment for the "--user-regs" option is incorrect. Fix it:

"on interrupt," -> "in user space,"

Fixes: 84c4174227 ("perf record: Support direct --user-regs arguments")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403060810.196028-1-dapeng1.mi@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:10:56 -03:00