(At least for me) the logic maintaining the count of posted
recv_io messages and the count of granted credits is much easier
to understand.
From there we can easily calculate the number of new_credits we'll
grant to the peer in outgoing send_io messages.
This will simplify the move to common logic that can be
shared between client and server in the following patches.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
We already call ib_drain_qp() before, which is an sync operation
that only returns in the queues are fully drained.
ib_drain_qp() completes pending requests with IB_WC_WR_FLUSH_ERR
so we have already called put_receive_buffer().
So all smbdirect_recv_io objects are either in the
smbdirect_socket.recv_io.free.list or
smbdirect_socket.recv_io.reassembly.list.
Then we explicitly iterate smbdirect_socket.recv_io.reassembly.list
and call put_receive_buffer(), so every object is
in smbdirect_socket.recv_io.free.list.
It means info->count_receive_queue == sp->recv_credit_max was
already true and calling wait_event(info->wait_receive_queues...
is pointless.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
If we're already disconnecting we don't need to queue the
disconnect_work again. disable_work() turns the next queue_work()
into a no-op.
Also let smbd_destroy() cancel(and disable) queued disconnect_work and
call smbd_disconnect_rdma_work() inline.
The makes it more obvious that disconnect_work is never queued
again after smbd_destroy() called smbd_disconnect_rdma_work().
It also means we have a single place to call rdma_disconnect().
While there we better also disable all other [delayed_]work.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
wake_up_interruptible[_all]() doesn't wake up tasks waiting
with wait_event().
So we better wake_up[_all]() in order to wake up all tasks in order
to simplify the logic.
As we currently don't use any wait_event_*_exclusive() it
doesn't really matter if we use wake_up() or wake_up_all().
But in this patch I try to use wake_up() for expected situations
and wake_up_all() for situations of a broken connection.
So don't need to adjust things in future when we
may use wait_event_*_exclusive() in order to wake up
only one process that should make progress.
Changing the wait_event_*() code in order to keep
wait_event(), wait_event_interruptible() and
wait_event_interruptible_timeout() or
changing them to wait_event_killable(),
wait_event_killable_timeout(),
wait_event_killable_exclusive()
is something to think about in a future patch.
The goal here is to avoid that some tasks are not
woken and freeze forever.
Also note that this patch only changes the existing
wake_up*() calls. Adding more wake_up*() calls for
other wait queues is also deferred to a future patch.
Link: https://lore.kernel.org/linux-cifs/13851363-0dc9-465c-9ced-3ede4904eef0@samba.org/T/#t
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
For the reader it is not obvious that the counter values are displayed
in hex as there's no leading '0x' before '%x'.
So changed them to %u instead and added '0x' for non-counter values
and also a string for the status.
Note this generates some checkpatch warnings like this:
WARNING: quoted string split across lines
#45: FILE: fs/smb/client/cifs_debug.c:460:
+ seq_printf(m, "\nSMBDirect protocol version: 0x%x "
+ "transport status: %s (%u)",
But I'll leave this is the current style in the
related code...
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Do a real negotiation and check the servers initiator_depth and
responder_resources.
This should use big endian in order to be useful.
I have captures of windows clients showing this.
The fact that we used little endian up to now
means that we sent very large numbers and the
negotiation with the server truncated them to the
server limits.
Note the reason why this uses u8 for
initiator_depth and responder_resources is
that the rdma layer also uses it.
The inconsitency regarding the initiator_depth
and responder_resources values being reversed
for iwarp devices in RDMA_CM_EVENT_ESTABLISHED
should also be fixed later, but for now we should
fix it.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: linux-rdma@vger.kernel.org
Fixes: c739858334 ("CIFS: SMBD: Implement RDMA memory registration")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
This safer to start with and allows common code not care about if the
caller uses these or not. E.g. sc->mr_io.recovery_work is only used
on the client...
Note disable_[delayed_]work_sync() requires a valid function pointer
in INIT_[DELAYED_]WORK(). The non _sync() version don't require it,
but as we need to use the _sync() version on cleanup we better use
it here too, it won't block anyway here...
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
In f2fs_precache_extents(), For large files, It requires reading many
node blocks. Instead of reading each node block with synchronous I/O,
this patch applies readahead so that node blocks can be fetched in
advance.
It reduces the overhead of repeated sync reads and improves efficiency
when precaching extents of large files.
I created a file with the same largest extent and executed the test.
For this experiment, I set the file's largest extent with an offset of 0
and a size of 1GB. I configured the remaining area with 100MB extents.
5GB test file:
dd if=/dev/urandom of=test1 bs=1m count=5120
cp test1 test2
fsync test1
dd if=test1 of=test2 bs=1m skip=1024 seek=1024 count=100 conv=notrunc
dd if=test1 of=test2 bs=1m skip=1224 seek=1224 count=100 conv=notrunc
...
dd if=test1 of=test2 bs=1m skip=5024 seek=5024 count=100 conv=notrunc
reboot
I also created 10GB and 20GB files with large extents using the same
method.
ioctl(F2FS_IOC_PRECACHE_EXTENTS) test results are as follows:
+-----------+---------+---------+-----------+
| File size | Before | After | Reduction |
+-----------+---------+---------+-----------+
| 5GB | 101.8ms | 37.0ms | 72.1% |
| 10GB | 222.9ms | 56.0ms | 74.9% |
| 20GB | 446.2ms | 116.4ms | 73.9% |
+-----------+---------+---------+-----------+
Tested on a 256GB mobile device with an SM8750 chipset.
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com>
Signed-off-by: Yunji Kang <yunji0.kang@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Add a sanity check in __update_extent_tree_range() to detect any
zero-sized extent update.
Signed-off-by: wangzijie <wangzijie1@honor.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
hugetlb_vmdelete_list() uses trylock to acquire VMA locks during truncate
operations. As per the original design in commit 40549ba8f8 ("hugetlb:
use new vma_lock for pmd sharing synchronization"), if the trylock fails
or the VMA has no lock, it should skip that VMA. Any remaining mapped
pages are handled by remove_inode_hugepages() which is called after
hugetlb_vmdelete_list() and uses proper lock ordering to guarantee
unmapping success.
Currently, when hugetlb_vma_trylock_write() returns success (1) for VMAs
without shareable locks, the code proceeds to call unmap_hugepage_range().
This causes assertion failures in huge_pmd_unshare() →
hugetlb_vma_assert_locked() because no lock is actually held:
WARNING: CPU: 1 PID: 6594 Comm: syz.0.28 Not tainted
Call Trace:
hugetlb_vma_assert_locked+0x1dd/0x250
huge_pmd_unshare+0x2c8/0x540
__unmap_hugepage_range+0x6e3/0x1aa0
unmap_hugepage_range+0x32e/0x410
hugetlb_vmdelete_list+0x189/0x1f0
Fix by using goto to ensure locks acquired by trylock are always released,
even when skipping VMAs without shareable locks.
Link: https://lkml.kernel.org/r/20250926033255.10930-1-kartikey406@gmail.com
Fixes: 40549ba8f8 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reported-by: syzbot+f26d7c75c26ec19790e7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f26d7c75c26ec19790e7
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add support for SEEK_DATA and SEEK_HOLE lseek() whence values.
These allow much faster searches for holes and data in sparse files, which
can significantly speed up file copying, e.g.
before (GNU coreutils, Debian 13):
cp --sparse=always big-file /
took real 11m58s, user 0m5.764s, sys 11m48s
after:
real 0.047s, user 0.000s, sys 0.027s
Where big-file has a 256 GB hole followed by 47 KB of data.
Link: https://lkml.kernel.org/r/20250923220652.568416-3-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Squashfs: performance improvement and a sanity check".
This patchset adds an additional sanity check when reading regular file
inodes, and adds support for SEEK_DATA/SEEK_HOLE lseek() whence values.
This patch (of 2):
Add an additional sanity check when reading regular file inodes.
A regular file if the file size is an exact multiple of the filesystem
block size cannot have a fragment. This is because by definition a
fragment block stores tailends which are not a whole block in size.
Link: https://lkml.kernel.org/r/20250923220652.568416-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20250923220652.568416-2-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Syzkaller reports a "KMSAN: uninit-value in squashfs_get_parent" bug.
This is caused by open_by_handle_at() being called with a file handle
containing an invalid parent inode number. In particular the inode number
is that of a symbolic link, rather than a directory.
Squashfs_get_parent() gets called with that symbolic link inode, and
accesses the parent member field.
unsigned int parent_ino = squashfs_i(inode)->parent;
Because non-directory inodes in Squashfs do not have a parent value, this
is uninitialised, and this causes an uninitialised value access.
The fix is to initialise parent with the invalid inode 0, which will cause
an EINVAL error to be returned.
Regular inodes used to share the parent field with the block_list_start
field. This is removed in this commit to enable the parent field to
contain the invalid inode number 0.
Link: https://lkml.kernel.org/r/20250918233308.293861-1-phillip@squashfs.org.uk
Fixes: 122601408d ("Squashfs: export operations")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: syzbot+157bdef5cf596ad0da2c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68cc2431.050a0220.139b6.0001.GAE@google.com/
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>