linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-02 14:02:00 -04:00

Author	SHA1	Message	Date
Michael S. Tsirkin	d08fda2cf2	virtio_input: use virtqueue_add_inbuf_cache_clean for events The evts array contains 64 small (8-byte) input events that share cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled, this can trigger warnings about overlapping DMA mappings within the same cacheline. Previous patch isolated the array in its own cachelines, so the warnings are now spurious. Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not write into these cache lines, suppressing these warnings. Message-ID: <4c885b4046323f68cf5cadc7fbfb00216b11dd20.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-08 09:54:27 -05:00
Michael S. Tsirkin	bd2b617c49	virtio-rng: fix DMA alignment for data buffer The data buffer in struct virtrng_info is used for DMA_FROM_DEVICE via virtqueue_add_inbuf() and shares cachelines with the adjacent CPU-written fields (data_avail, data_idx). The device writing to the DMA buffer and the CPU writing to adjacent fields could corrupt each other's data on non-cache-coherent platforms. Add __dma_from_device_group_begin()/end() annotations to place these in distinct cache lines. Message-ID: <157a63b6324d1f1307ddd4faa3b62a8b90a79423.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-08 09:54:27 -05:00
Michael S. Tsirkin	2678369e8e	virtio_scsi: fix DMA cacheline issues for events Current struct virtio_scsi_event_node layout has two problems: The event (DMA_FROM_DEVICE) and work (CPU-written via INIT_WORK/queue_work) fields share a cacheline. On non-cache-coherent platforms, CPU writes to work can corrupt device-written event data. If ARCH_DMA_MINALIGN is large enough, the 8 events in event_list share cachelines, triggering CONFIG_DMA_API_DEBUG warnings. Fix the corruption by moving event buffers to a separate array and aligning using __dma_from_device_group_begin()/end(). Suppress the (now spurious) DMA debug warnings using virtqueue_add_inbuf_cache_clean(). Message-ID: <8801aeef7576a155299f19b6887682dd3a272aba.1767601130.git.mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-08 09:54:27 -05:00
Michael S. Tsirkin	95c7b0ad6c	virtio_input: fix DMA alignment for evts On non-cache-coherent platforms, when a structure contains a buffer used for DMA alongside fields that the CPU writes to, cacheline sharing can cause data corruption. The evts array is used for DMA_FROM_DEVICE operations via virtqueue_add_inbuf(). The adjacent lock and ready fields are written by the CPU during normal operation. If these share cachelines with evts, CPU writes can corrupt DMA data. Add __dma_from_device_group_begin()/end() annotations to ensure evts is isolated in its own cachelines. Message-ID: <cd328233198a76618809bb5cd9a6ddcaa603a8a1.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-08 09:54:26 -05:00
Michael S. Tsirkin	db191ba0c8	vsock/virtio: use virtqueue_add_inbuf_cache_clean for events The event_list array contains 8 small (4-byte) events that share cachelines with each other. When CONFIG_DMA_API_DEBUG is enabled, this can trigger warnings about overlapping DMA mappings within the same cacheline. The previous patch isolated event_list in its own cache lines so the warnings are spurious. Use virtqueue_add_inbuf_cache_clean() to indicate that the CPU does not write into these fields, suppressing the warnings. Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Message-ID: <4b5bf63a7ebb782d87f643466b3669df567c9fe1.1767601130.git.mst@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-08 09:54:26 -05:00
Michael S. Tsirkin	63dfad0517	vsock/virtio: fix DMA alignment for event_list On non-cache-coherent platforms, when a structure contains a buffer used for DMA alongside fields that the CPU writes to, cacheline sharing can cause data corruption. The event_list array is used for DMA_FROM_DEVICE operations via virtqueue_add_inbuf(). The adjacent event_run and guest_cid fields are written by the CPU while the buffer is available, so mapped for the device. If these share cachelines with event_list, CPU writes can corrupt DMA data. Add __dma_from_device_group_begin()/end() annotations to ensure event_list is isolated in its own cachelines. Message-ID: <f19ebd74f70c91cab4b0178df78cf6a6e107a96b.1767601130.git.mst@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-08 09:54:19 -05:00
Michael S. Tsirkin	5fc6dd158e	virtio: add virtqueue_add_inbuf_cache_clean API Add virtqueue_add_inbuf_cache_clean() for passing DMA_ATTR_CPU_CACHE_CLEAN to virtqueue operations. This suppresses DMA debug cacheline overlap warnings for buffers where proper cache management is ensured by the caller. Message-ID: <e50d38c974859e731e50bda7a0ee5691debf5bc4.1767601130.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-02 06:22:49 -05:00
Michael S. Tsirkin	d5d8465131	dma-debug: track cache clean flag in entries If a driver is buggy and has 2 overlapping mappings but only sets cache clean flag on the 1st one of them, we warn. But if it only does it for the 2nd one, we don't. Fix by tracking cache clean flag in the entry. Message-ID: <0ffb3513d18614539c108b4548cdfbc64274a7d1.1767601130.git.mst@redhat.com> Reviewed-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2026-01-02 06:22:49 -05:00
Michael S. Tsirkin	e21dd666e4	docs: dma-api: document DMA_ATTR_CPU_CACHE_CLEAN Document DMA_ATTR_CPU_CACHE_CLEAN as implemented in the previous patch. Message-ID: <0720b4be31c1b7a38edca67fd0c97983d2a56936.1767601130.git.mst@redhat.com> Reviewed-by: Petr Tesarik <ptesarik@suse.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-31 19:30:17 -05:00
Michael S. Tsirkin	61868dc55a	dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN When multiple small DMA_FROM_DEVICE or DMA_BIDIRECTIONAL buffers share a cacheline, and DMA_API_DEBUG is enabled, we get this warning: cacheline tracking EEXIST, overlapping mappings aren't supported. This is because when one of the mappings is removed, while another one is active, CPU might write into the buffer. Add an attribute for the driver to promise not to do this, making the overlapping safe, and suppressing the warning. Message-ID: <2d5d091f9d84b68ea96abd545b365dd1d00bbf48.1767601130.git.mst@redhat.com> Reviewed-by: Petr Tesarik <ptesarik@suse.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-31 19:30:02 -05:00
Michael S. Tsirkin	1e8b5d8555	docs: dma-api: document __dma_from_device_group_begin()/end() Document the __dma_from_device_group_begin()/end() annotations. Message-ID: <01ea88055ded4d70cac70ba557680fd5fa7d9ff5.1767601130.git.mst@redhat.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-31 19:28:12 -05:00
Michael S. Tsirkin	ca085faabb	dma-mapping: add __dma_from_device_group_begin()/end() When a structure contains a buffer that DMA writes to alongside fields that the CPU writes to, cache line sharing between the DMA buffer and CPU-written fields can cause data corruption on non-cache-coherent platforms. Add __dma_from_device_group_begin()/end() annotations to ensure proper alignment to prevent this: struct my_device { spinlock_t lock1; __dma_from_device_group_begin(); char dma_buffer1[16]; char dma_buffer2[16]; __dma_from_device_group_end(); spinlock_t lock2; }; Message-ID: <19163086d5e4704c316f18f6da06bc1c72968904.1767601130.git.mst@redhat.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-31 19:27:45 -05:00
Jason Wang	f6a15d8549	virtio_ring: add in order support This patch implements in order support for both split virtqueue and packed virtqueue. Performance could be gained for the device where the memory access could be expensive (e.g vhost-net or a real PCI device): Benchmark with KVM guest: Vhost-net on the host: (pktgen + XDP_DROP): in_order=off \| in_order=on \| +% TX: 4.51Mpps \| 5.30Mpps \| +17% RX: 3.47Mpps \| 3.61Mpps \| + 4% Vhost-user(testpmd) on the host: (pktgen/XDP_DROP): For split virtqueue: in_order=off \| in_order=on \| +% TX: 5.60Mpps \| 5.60Mpps \| +0.0% RX: 9.16Mpps \| 9.61Mpps \| +4.9% For packed virtqueue: in_order=off \| in_order=on \| +% TX: 5.60Mpps \| 5.70Mpps \| +1.7% RX: 10.6Mpps \| 10.8Mpps \| +1.8% Benchmark also shows no performance impact for in_order=off for queue size with 256 and 1024. Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-20-jasowang@redhat.com>	2025-12-31 05:47:50 -05:00
Jason Wang	519b206e30	virtio_ring: factor out split detaching logic This patch factors out the split core detaching logic that could be reused by in order feature into a dedicated function. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-19-jasowang@redhat.com>	2025-12-31 05:47:50 -05:00
Jason Wang	9dc6b944f1	virtio_ring: factor out split indirect detaching logic Factor out the split indirect descriptor detaching logic in order to allow it to be reused by the in order support. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-18-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	fa56d17b92	virtio_ring: factor out core logic for updating last_used_idx Factor out the core logic for updating last_used_idx to be reused by the packed in order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-17-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	c623106c79	virtio_ring: factor out core logic of buffer detaching Factor out core logic of buffer detaching and leave the free list management to the caller so in_order can just call the core logic. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-16-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	03f05c4eeb	virtio_ring: determine descriptor flags at one time Let's determine the last descriptor by counting the number of sg. This would be consistent with packed virtqueue implementation and ease the future in-order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-15-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	1208473f9b	virtio_ring: introduce virtqueue ops This patch introduces virtqueue ops which is a set of callbacks that will be called for different queue layout or features. This would help to avoid branches for split/packed and will ease the future implementation like in order. Note that in order to eliminate the indirect calls this patch uses global array of const ops to allow compiler to avoid indirect branches. Tested with CONFIG_MITIGATION_RETPOLINE, no performance differences were noticed. Acked-by: Eugenio Pérez <eperezma@redhat.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-14-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	eff8b47d28	virtio_ring: switch to use unsigned int for virtqueue_poll_packed() Switch to use unsigned int for virtqueue_poll_packed() to match virtqueue_poll() and virtqueue_poll_split() and to ease the abstraction of the virtqueue ops. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-13-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	f2ad9d6b4e	virtio_ring: switch to use vring_virtqueue for detach_unused_buf variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-12-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	7e81017673	virtio_ring: switch to use vring_virtqueue for disable_cb variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-11-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	62fa22cdab	virtio_ring: use vring_virtqueue for enable_cb_delayed variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-10-jasowang@redhat.com>	2025-12-31 05:39:18 -05:00
Jason Wang	74847cb573	virtio_ring: switch to use vring_virtqueue for enable_cb_prepare variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-9-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jason Wang	ceea1cd0ae	virtio: switch to use vring_virtqueue for virtqueue_get variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-8-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jason Wang	4a0fa90b10	virtio_ring: switch to use vring_virtqueue for virtqueue_add variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-7-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jason Wang	8b8590b708	virtio_ring: switch to use vring_virtqueue for virtqueue_kick_prepare variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-6-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jason Wang	9552bc0581	virtio_ring: switch to use vring_virtqueue for virtqueue resize variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-5-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jason Wang	40da006f13	virtio_ring: unify logic of virtqueue_poll() and more_used() This patch unifies the logic of virtqueue_poll() and more_used() for better code reusing and ease the future in order implementation. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-4-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jason Wang	79f6d68293	virtio_ring: switch to use vring_virtqueue in virtqueue_poll variants Those variants are used internally so let's switch to use vring_virtqueue as parameter to be consistent with other internal virtqueue helpers. Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-3-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jason Wang	8ce8e3e558	virtio_ring: rename virtqueue_reinit_xxx to virtqueue_reset_xxx() To be consistent with virtqueue_reset(). Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251230064649.55597-2-jasowang@redhat.com>	2025-12-31 05:39:17 -05:00
Jon Kohler	3b34d6324d	vhost: use "checked" versions of get_user() and put_user() vhost_get_user and vhost_put_user leverage __get_user and __put_user, respectively, which were both added in 2016 by commit `6b1e6cc785` ("vhost: new device IOTLB API"). In a heavy UDP transmit workload on a vhost-net backed tap device, these functions showed up as ~11.6% of samples in a flamegraph of the underlying vhost worker thread. Quoting Linus from [1]: Anyway, every single __get_user() call I looked at looked like historical garbage. [...] End result: I get the feeling that we should just do a global search-and-replace of the __get_user/ __put_user users, replace them with plain get_user/put_user instead, and then fix up any fallout (eg the coco code). Switch to plain get_user/put_user in vhost, which results in a slight throughput speedup. get_user now about ~8.4% of samples in flamegraph. Basic iperf3 test on a Intel 5416S CPU with Ubuntu 25.10 guest: TX: taskset -c 2 iperf3 -c <rx_ip> -t 60 -p 5200 -b 0 -u -i 5 RX: taskset -c 2 iperf3 -s -p 5200 -D Before: 6.08 Gbits/sec After: 6.32 Gbits/sec As to what drives the speedup, Sean's patch [2] explains: Use the normal, checked versions for get_user() and put_user() instead of the double-underscore versions that omit range checks, as the checked versions are actually measurably faster on modern CPUs (12%+ on Intel, 25%+ on AMD). The performance hit on the unchecked versions is almost entirely due to the added LFENCE on CPUs where LFENCE is serializing (which is effectively all modern CPUs), which was added by commit `304ec1b050` ("x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec"). The small optimizations done by commit `b19b74bc99` ("x86/mm: Rework address range check in get_user() and put_user()") likely shave a few cycles off, but the bulk of the extra latency comes from the LFENCE. [1] https://lore.kernel.org/all/CAHk-=wiJiDSPZJTV7z3Q-u4DfLgQTNWqUqqrwSBHp0+Dh016FA@mail.gmail.com/ [2] https://lore.kernel.org/all/20251106210206.221558-1-seanjc@google.com/ Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Jon Kohler <jon@nutanix.com> Message-Id: <20251113005529.2494066-1-jon@nutanix.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-26 15:00:01 -05:00
zhangdongchuan@eswincomputing.com	4b7bf8d550	virtio_ring: code cleanup in detach_buf_split Since the return value of vring_unmap_one_split() is exactly vq->split.desc_extra[i].next, 'i = vq->split.desc_extra[i].next' is redundant. Assign vring_unmap_one_split() to i instead. Since vq->split.desc_extra is assigned to extra, use extra[i].next instead of vq->split.desc_extra[i].next to improve readability. No change in functionality. Signed-off-by: zhangdongchuan <zhangdongchuan@eswincomputing.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <202511261140162936986@eswincomputing.com>	2025-12-26 15:00:00 -05:00
Thomas Weißschuh	3c4629b68d	virtio: uapi: avoid usage of libc types Using libc types and headers from the UAPI headers is problematic as it introduces a dependency on a full C toolchain. On Linux 'unsigned long' works as a replacement for 'uintptr_t' and does not depend on libc. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20251222-uapi-virtio-v1-1-29390f87bcad@linutronix.de>	2025-12-26 15:00:00 -05:00
Stefano Garzarella	d8ee3cfdc8	vhost/vsock: improve RCU read sections around vhost_vsock_get() vhost_vsock_get() uses hash_for_each_possible_rcu() to find the `vhost_vsock` associated with the `guest_cid`. hash_for_each_possible_rcu() should only be called within an RCU read section, as mentioned in the following comment in include/linux/rculist.h: /** * hlist_for_each_entry_rcu - iterate over rcu list of given type * @pos: the type * to use as a loop cursor. * @head: the head for your list. * @member: the name of the hlist_node within the struct. * @cond: optional lockdep expression if called from non-RCU protection. * * This list-traversal primitive may safely run concurrently with * the _rcu list-mutation primitives such as hlist_add_head_rcu() * as long as the traversal is guarded by rcu_read_lock(). */ Currently, all calls to vhost_vsock_get() are between rcu_read_lock() and rcu_read_unlock() except for calls in vhost_vsock_set_cid() and vhost_vsock_reset_orphans(). In both cases, the current code is safe, but we can make improvements to make it more robust. About vhost_vsock_set_cid(), when building the kernel with CONFIG_PROVE_RCU_LIST enabled, we get the following RCU warning when the user space issues `ioctl(dev, VHOST_VSOCK_SET_GUEST_CID, ...)` : WARNING: suspicious RCU usage 6.18.0-rc7 #62 Not tainted ----------------------------- drivers/vhost/vsock.c:74 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by rpc-libvirtd/3443: #0: ffffffffc05032a8 (vhost_vsock_mutex){+.+.}-{4:4}, at: vhost_vsock_dev_ioctl+0x2ff/0x530 [vhost_vsock] stack backtrace: CPU: 2 UID: 0 PID: 3443 Comm: rpc-libvirtd Not tainted 6.18.0-rc7 #62 PREEMPT(none) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-7.fc42 06/10/2025 Call Trace: <TASK> dump_stack_lvl+0x75/0xb0 dump_stack+0x14/0x1a lockdep_rcu_suspicious.cold+0x4e/0x97 vhost_vsock_get+0x8f/0xa0 [vhost_vsock] vhost_vsock_dev_ioctl+0x307/0x530 [vhost_vsock] __x64_sys_ioctl+0x4f2/0xa00 x64_sys_call+0xed0/0x1da0 do_syscall_64+0x73/0xfa0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... </TASK> This is not a real problem, because the vhost_vsock_get() caller, i.e. vhost_vsock_set_cid(), holds the `vhost_vsock_mutex` used by the hash table writers. Anyway, to prevent that warning, add lockdep_is_held() condition to hash_for_each_possible_rcu() to verify that either the caller is in an RCU read section or `vhost_vsock_mutex` is held when CONFIG_PROVE_RCU_LIST is enabled; and also clarify the comment for vhost_vsock_get() to better describe the locking requirements and the scope of the returned pointer validity. About vhost_vsock_reset_orphans(), currently this function is only called via vsock_for_each_connected_socket(), which holds the `vsock_table_lock` spinlock (which is also an RCU read-side critical section). However, add an explicit RCU read lock there to make the code more robust and explicit about the RCU requirements, and to prevent issues if the calling context changes in the future or if vhost_vsock_reset_orphans() is called from other contexts. Fixes: `834e772c8d` ("vhost/vsock: fix use-after-free in network stack callers") Cc: stefanha@redhat.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20251126133826.142496-1-sgarzare@redhat.com> Message-ID: <20251126210313.GA499503@fedora> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:57 -05:00
Michael S. Tsirkin	7f81878b04	tools/virtio: add device, device_driver stubs Add stubs needed by virtio.h Message-ID: <0fabf13f6ea812ebc73b1c919fb17d4dec1545db.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	39cfe193f3	tools/virtio: fix up oot build oot build tends to help uncover bugs so it's worth keeping around, as long as it's low effort. add stubs for a couple of macros virtio gained recently, and disable vdpa in the test build. Message-ID: <33968faa7994b86d1f78057358a50b8f460c7a23.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	e88dfb9331	virtio_features: make it self-contained virtio_features.h uses WARN_ON_ONCE and memset so it must include linux/bug.h and linux/string.h Message-ID: <579986aa9b8d023844990d2a0e267382f8ad85d5.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	cec9c5e385	tools/virtio: switch to kernel's virtio_config.h Drops stubs in virtio_config.h, use the kernel's version instead - we are now activly developing it, so the stub became too hard to maintain. Message-ID: <8e5c85dc8aad001f161f7e2d8799ffbccfc31381.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	b0fe545b3c	tools/virtio: stub might_sleep and synchronize_rcu Add might_sleep() and synchronize_rcu() stubs needed by virtio_config.h. might_sleep() is a no-op, synchronize_rcu doesn't work but we don't need it to. Created using Cursor CLI. Message-ID: <5557e026335d808acd7b890693ee1382e73dd33a.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	a2f964c45b	tools/virtio: add struct cpumask to cpumask.h Add struct cpumask stub used by virtio_config.h. Created using Cursor CLI. Message-ID: <eacf56399ba220513ebcd610f4a5115dc768db80.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	4e949e77fa	tools/virtio: pass KCFLAGS to module build Update the mod target to pass KCFLAGS with the in-tree vhost driver include path. This way vhost_test can find vhost headers. Created using Cursor CLI. Message-ID: <5473e5a5dfd2fcd261a778f2017cac669c031f23.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	b6600eff05	tools/virtio: add ucopysize.h stub Add ucopysize.h with stub implementations of check_object_size, copy_overflow, and check_copy_size. Created using Cursor CLI. Message-ID: <5046df90002bb744609248404b81d33b559fe813.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	c53ad75c62	tools/virtio: add dev_WARN_ONCE and is_vmalloc_addr stubs Add dev_WARN_ONCE and is_vmalloc_addr stubs needed by virtio_ring.c. is_vmalloc_addr stub always returns false - that's fine since it's merely a sanity check. Created using Cursor CLI. Message-ID: <749e7a03b7cd56baf50a27efc3b05e50cf8f36b6.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	03d768a38c	tools/virtio: stub DMA mapping functions Add dma_map_page_attrs and dma_unmap_page_attrs stubs. Follow the same pattern as existing DMA mapping stubs. Created using Cursor CLI. Message-ID: <3512df1fe0e2129ea493434a21c940c50381cc93.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	42059e68ea	tools/virtio: add struct module forward declaration Declarate struct module in our linux/module.h stub. Created using Cursor CLI. Message-ID: <c01b8d24159664cc8c49354088efa342ae9e7321.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:56 -05:00
Michael S. Tsirkin	16fe720f1d	tools/virtio: use kernel's virtio.h Replace virtio stubs with an include of the kernel header. Message-ID: <33daf1033fc447eb8e3e54d21013ccfd99550e37.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:55 -05:00
Michael S. Tsirkin	f059588c55	virtio: make it self-contained virtio.h uses struct module, add a forward declaration to make the header self-contained. Message-ID: <9171b5cac60793eb59ab044c96ee038bf1363bee.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:55 -05:00
Michael S. Tsirkin	94fb5e796a	tools/virtio: fix up compiler.h stub Add #undef __user before and after including compiler_types.h to avoid redefinition warnings when compiling with system headers that also define __user. This allows tools/virtio to build without warnings. Additionally, stub out __must_check Created using Cursor CLI. Message-ID: <56424ce95c72cb4957070a7cd3c3c40ad5addaee.1764873799.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-12-24 08:02:55 -05:00
Linus Torvalds	9448598b22	Linux 6.19-rc2 v6.19-rc2	2025-12-21 15:52:04 -08:00

1 2 3 4 5 ...

1412393 Commits