This will allow running vDPA for virtio block protocol.
It's a preliminary implementation with a simple request handling:
for each request, only the status (last byte) is set.
It's always set to VIRTIO_BLK_S_OK.
Also input validation is missing and will be added in the next commits.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
[sgarzare: various cleanups/fixes]
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210315163450.254396-12-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vringh_getdesc_iotlb() allocates memory to store the kvec, that
is freed with vringh_kiov_cleanup().
vringh_getdesc_iotlb() is able to reuse a kvec previously allocated,
so in order to avoid to allocate the kvec for each request, we are
not calling vringh_kiov_cleanup() when we finished to handle a
request, but we should call it when we free the entire device.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210315163450.254396-8-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Usually iotlb accesses are synchronized with a spinlock.
Let's request it as a new parameter in vringh_set_iotlb() and
hold it when we navigate the iotlb in iotlb_translate() to avoid
race conditions with any new additions/deletions of ranges from
the ioltb.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210315163450.254396-3-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The identical mapping used until now created issues when mapping
different virtual pages with the same physical address.
To solve this issue, we can use the iova module, to handle the IOVA
allocation.
For simplicity we use an IOVA allocator with byte granularity.
We add two new functions, vdpasim_map_range() and vdpasim_unmap_range(),
to handle the IOVA allocation and the registration into the IOMMU/IOTLB.
These functions are used by dma_map_ops callbacks.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210315163450.254396-2-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtqueue doorbell is usually implemented via registeres but we
don't provide the necessary vma->flags like VM_PFNMAP. This may cause
several issues e.g when userspace tries to map the doorbell via vhost
IOTLB, kernel may panic due to the page is not backed by page
structure. This patch fixes this by setting the necessary
vm_flags. With this patch, try to map doorbell via IOTLB will fail
with bad address.
Cc: stable@vger.kernel.org
Fixes: ddd89d0a05 ("vhost_vdpa: support doorbell mapping via mmap")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210413091557.29008-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Follow comment style mentioned in the Writing kernel-doc document [1].
Following warnings are fixed.
$ scripts/kernel-doc -v -none drivers/vdpa/vdpa.c
drivers/vdpa/vdpa.c:67: info: Scanning doc for __vdpa_alloc_device
drivers/vdpa/vdpa.c:84: warning: No description found for return value of '__vdpa_alloc_device'
drivers/vdpa/vdpa.c:153: info: Scanning doc for _vdpa_register_device
drivers/vdpa/vdpa.c:163: warning: No description found for return value of '_vdpa_register_device'
drivers/vdpa/vdpa.c:172: info: Scanning doc for vdpa_register_device
drivers/vdpa/vdpa.c:180: warning: No description found for return value of 'vdpa_register_device'
drivers/vdpa/vdpa.c:191: info: Scanning doc for _vdpa_unregister_device
drivers/vdpa/vdpa.c:205: info: Scanning doc for vdpa_unregister_device
drivers/vdpa/vdpa.c:217: info: Scanning doc for __vdpa_register_driver
drivers/vdpa/vdpa.c:224: warning: No description found for return value of '__vdpa_register_driver'
drivers/vdpa/vdpa.c:233: info: Scanning doc for vdpa_unregister_driver
drivers/vdpa/vdpa.c:243: info: Scanning doc for vdpa_mgmtdev_register
drivers/vdpa/vdpa.c:250: warning: No description found for return value of 'vdpa_mgmtdev_register'
After the fix:
scripts/kernel-doc -v -none drivers/vdpa/vdpa.c
drivers/vdpa/vdpa.c:67: info: Scanning doc for __vdpa_alloc_device
drivers/vdpa/vdpa.c:153: info: Scanning doc for _vdpa_register_device
drivers/vdpa/vdpa.c:172: info: Scanning doc for vdpa_register_device
drivers/vdpa/vdpa.c:191: info: Scanning doc for _vdpa_unregister_device
drivers/vdpa/vdpa.c:205: info: Scanning doc for vdpa_unregister_device
drivers/vdpa/vdpa.c:217: info: Scanning doc for __vdpa_register_driver
drivers/vdpa/vdpa.c:233: info: Scanning doc for vdpa_unregister_driver
drivers/vdpa/vdpa.c:243: info: Scanning doc for vdpa_mgmtdev_register
[1] https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210406170457.98481-3-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Allow to control vdpa device creation and destruction using the vdpa
management tool.
Examples:
1. List the management devices
$ vdpa mgmtdev show
pci/0000:3b:00.1:
supported_classes net
2. Create vdpa instance
$ vdpa dev add mgmtdev pci/0000:3b:00.1 name vdpa0
3. Show vdpa devices
$ vdpa dev show
vdpa0: type network mgmtdev pci/0000:3b:00.1 vendor_id 5555 max_vqs 16 \
max_vq_size 256
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210408091320.4600-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduce a vDPA driver for virtio-pci device. It bridges
the virtio-pci control command to the vDPA bus. This will be used for
features prototyping and testing.
Note that get/restore virtqueue state is not supported which needs
extension on the virtio specification.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210223061905.422659-4-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix potential NULL pointer dereference in the auxtrace option parser
- Fix access to PID in an array when setting a PID filter in 'perf ftrace'
- Fix error return code in the 'perf data' tool and in maps__clone(),
found using a static analysis tool from Huawei
* tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf map: Fix error return code in maps__clone()
perf ftrace: Fix access to pid in array when setting a pid filter
perf auxtrace: Fix potential NULL pointer dereference
perf data: Fix error return code in perf_data__create_dir()
Pull x86 perf fixes from Borislav Petkov:
- Fix Broadwell Xeon's stepping in the PEBS isolation table of CPUs
- Fix a panic when initializing perf uncore machinery on Haswell and
Broadwell servers
* tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3
Pull locking fix from Borislav Petkov:
"Fix ordering in the queued writer lock's slowpath"
* tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
Pull scheduler fix from Borislav Petkov:
"Fix a typo in a macro ifdeffery"
* tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
preempt/dynamic: Fix typo in macro conditional statement
Pull x86 fix from Borislav Petkov:
"Fix an out-of-bounds memory access when setting up a crash kernel with
kexec"
* tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access
Pull kvm fix from Paolo Bonzini:
"Fix SRCU bug introduced in the merge window"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/xen: Take srcu lock when accessing kvm_memslots()
This reverts commit 0c85a7e874.
The games with 'rm' are on (two separate instances) of a local variable,
and make no difference.
Quoting Aditya Pakki:
"I was the author of the patch and it was the cause of the giant UMN
revert.
The patch is garbage and I was unaware of the steps involved in
retracting it. I *believed* the maintainers would pull it, given it
was already under Greg's list. The patch does not introduce any bugs
but is pointless and is stupid. I accept my incompetence and for not
requesting a revert earlier."
Link: https://lwn.net/Articles/854319/
Requested-by: Aditya Pakki <pakki001@umn.edu>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull pin control fixes from Linus Walleij:
"Late pin control fixes, would have been in the main pull request
normally but hey I got lucky and we got another week to polish up
v5.12 so here we go.
One driver fix and one making the core debugfs work:
- Fix the number of pins in the community of the Intel Lewisburg SoC
- Show pin numbers for controllers with base = 0 in the new debugfs
feature"
* tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: core: Show pin numbers for the controllers with base = 0
pinctrl: lewisburg: Update number of pins in community
Merge misc fixes from Andrew Morton:
"5 patches.
Subsystems affected by this patch series: coda, overlayfs, and
mm (pagecache and memcg)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
tools/cgroup/slabinfo.py: updated to work on current kernel
mm/filemap: fix mapping_seek_hole_data on THP & 32-bit
mm/filemap: fix find_lock_entries hang on 32-bit THP
ovl: fix reference counting in ovl_mmap error path
coda: fix reference counting in coda_file_mmap error path
Pull block fix from Jens Axboe:
"A single fix for a behavioral regression in this series, when
re-reading the partition table with partitions open"
* tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block:
block: return -EBUSY when there are open partitions in blkdev_reread_part
slabinfo.py script does not work with actual kernel version.
First, it was unable to recognise SLUB susbsytem, and when I specified
it manually it failed again with
AttributeError: 'struct page' has no member 'obj_cgroups'
.. and then again with
File "tools/cgroup/memcg_slabinfo.py", line 221, in main
memcg.kmem_caches.address_of_(),
AttributeError: 'struct mem_cgroup' has no member 'kmem_caches'
Link: https://lkml.kernel.org/r/cec1a75e-43b4-3d64-2084-d9f98fda037f@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Roman Gushchin <guro@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No problem on 64-bit, or without huge pages, but xfstests generic/285
and other SEEK_HOLE/SEEK_DATA tests have regressed on huge tmpfs, and on
32-bit architectures, with the new mapping_seek_hole_data(). Several
different bugs turned out to need fixing.
u64 cast to stop losing bits when converting unsigned long to loff_t
(and let's use shifts throughout, rather than mixed with * and /).
Use round_up() when advancing pos, to stop assuming that pos was already
THP-aligned when advancing it by THP-size. (This use of round_up()
assumes that any THP has THP-aligned index: true at present and true
going forward, but could be recoded to avoid the assumption.)
Use xas_set() when iterating away from a THP, so that xa_index stays in
synch with start, instead of drifting away to return bogus offset.
Check start against end to avoid wrapping 32-bit xa_index to 0 (and to
handle these additional cases, seek_data or not, it's easier to break
the loop than goto: so rearrange exit from the function).
[hughd@google.com: remove unneeded u64 casts, per Matthew]
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104221347240.1170@eggly.anvils
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211737410.3299@eggly.anvils
Fixes: 41139aa4c3 ("mm/filemap: add mapping_seek_hole_data")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kvm_memslots() will be called by kvm_write_guest_offset_cached() so we should
take the srcu lock. Let's pull the srcu lock operation from kvm_steal_time_set_preempted()
again to fix xen part.
Fixes: 30b5c851af ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1619166200-9215-1-git-send-email-wanpengli@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>