linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 11:25:22 -04:00

Author	SHA1	Message	Date
Akiva Goldberger	e5eba42f01	mlx5: Fix default values in create CQ Currently, CQs without a completion function are assigned the mlx5_add_cq_to_tasklet function by default. This is problematic since only user CQs created through the mlx5_ib driver are intended to use this function. Additionally, all CQs that will use doorbells instead of polling for completions must call mlx5_cq_arm. However, the default CQ creation flow leaves a valid value in the CQ's arm_db field, allowing FW to send interrupts to polling-only CQs in certain corner cases. These two factors would allow a polling-only kernel CQ to be triggered by an EQ interrupt and call a completion function intended only for user CQs, causing a null pointer exception. Some areas in the driver have prevented this issue with one-off fixes but did not address the root cause. This patch fixes the described issue by adding defaults to the create CQ flow. It adds a default dummy completion function to protect against null pointer exceptions, and it sets an invalid command sequence number by default in kernel CQs to prevent the FW from sending an interrupt to the CQ until it is armed. User CQs are responsible for their own initialization values. Callers of mlx5_core_create_cq are responsible for changing the completion function and arming the CQ per their needs. Fixes: `cdd04f4d4d` ("net/mlx5: Add support to create SQ and CQ for ASO") Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Leon Romanovsky <leon@kernel.org> Link: https://patch.msgid.link/1762681743-1084694-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-11-11 15:12:18 +01:00
Jason Wang	0d16cc439f	vdpa: introduce map ops Virtio core allows the transport to provide device or transport specific mapping functions. This patch adds this support to vDPA. We can simply do this by allowing the vDPA parent to register a virtio_map_ops. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250924070045.10361-2-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>	2025-10-01 07:24:55 -04:00
Jason Wang	58aca3dbc7	vdpa: support virtio_map Virtio core switches from DMA device to virtio_map, let's do that as well for vDPA. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250821064641.5025-8-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>	2025-10-01 07:24:43 -04:00
Dragos Tatulea	cc51a66815	vdpa/mlx5: Fix release of uninitialized resources on error path The commit in the fixes tag made sure that mlx5_vdpa_free() is the single entrypoint for removing the vdpa device resources added in mlx5_vdpa_dev_add(), even in the cleanup path of mlx5_vdpa_dev_add(). This means that all functions from mlx5_vdpa_free() should be able to handle uninitialized resources. This was not the case though: mlx5_vdpa_destroy_mr_resources() and mlx5_cmd_cleanup_async_ctx() were not able to do so. This caused the splat below when adding a vdpa device without a MAC address. This patch fixes these remaining issues: - Makes mlx5_vdpa_destroy_mr_resources() return early if called on uninitialized resources. - Moves mlx5_cmd_init_async_ctx() early on during device addition because it can't fail. This means that mlx5_cmd_cleanup_async_ctx() also can't fail. To mirror this, move the call site of mlx5_cmd_cleanup_async_ctx() in mlx5_vdpa_free(). An additional comment was added in mlx5_vdpa_free() to document the expectations of functions called from this context. Splat: mlx5_core 0000:b5:03.2: mlx5_vdpa_dev_add:3950:(pid 2306) warning: No mac address provisioned? ------------[ cut here ]------------ WARNING: CPU: 13 PID: 2306 at kernel/workqueue.c:4207 __flush_work+0x9a/0xb0 [...] Call Trace: <TASK> ? __try_to_del_timer_sync+0x61/0x90 ? __timer_delete_sync+0x2b/0x40 mlx5_vdpa_destroy_mr_resources+0x1c/0x40 [mlx5_vdpa] mlx5_vdpa_free+0x45/0x160 [mlx5_vdpa] vdpa_release_dev+0x1e/0x50 [vdpa] device_release+0x31/0x90 kobject_cleanup+0x37/0x130 mlx5_vdpa_dev_add+0x327/0x890 [mlx5_vdpa] vdpa_nl_cmd_dev_add_set_doit+0x2c1/0x4d0 [vdpa] genl_family_rcv_msg_doit+0xd8/0x130 genl_family_rcv_msg+0x14b/0x220 ? __pfx_vdpa_nl_cmd_dev_add_set_doit+0x10/0x10 [vdpa] genl_rcv_msg+0x47/0xa0 ? __pfx_genl_rcv_msg+0x10/0x10 netlink_rcv_skb+0x53/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x27b/0x3b0 netlink_sendmsg+0x1f7/0x430 __sys_sendto+0x1fa/0x210 ? ___pte_offset_map+0x17/0x160 ? next_uptodate_folio+0x85/0x2b0 ? percpu_counter_add_batch+0x51/0x90 ? filemap_map_pages+0x515/0x660 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x7b/0x2c0 ? do_read_fault+0x108/0x220 ? do_pte_missing+0x14a/0x3e0 ? __handle_mm_fault+0x321/0x730 ? count_memcg_events+0x13f/0x180 ? handle_mm_fault+0x1fb/0x2d0 ? do_user_addr_fault+0x20c/0x700 ? syscall_exit_work+0x104/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f0c25b0feca [...] ---[ end trace 0000000000000000 ]--- Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Fixes: `83e445e64f` ("vdpa/mlx5: Fix error path during device add") Reported-by: Wenli Quan <wquan@redhat.com> Closes: https://lore.kernel.org/virtualization/CADZSLS0r78HhZAStBaN1evCSoPqRJU95Lt8AqZNJ6+wwYQ6vPQ@mail.gmail.com/ Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20250708120424.2363354-2-dtatulea@nvidia.com> Tested-by: Wenli Quan <wquan@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2025-08-01 09:11:08 -04:00
Dragos Tatulea	6f0f3d7fc4	vdpa/mlx5: Fix needs_teardown flag calculation needs_teardown is a device flag that indicates when virtual queues need to be recreated. This happens for certain configuration changes: queue size and some specific features. Currently, the needs_teardown state can be incorrectly reset by subsequent .set_vq_num() calls. For example, for 1 rx VQ with size 512 and 1 tx VQ with size 256: .set_vq_num(0, 512) -> sets needs_teardown to true (rx queue has a non-default size) .set_vq_num(1, 256) -> sets needs_teardown to false (tx queue has a default size) This change takes into account the previous value of the needs_teardown flag when re-calculating it during VQ size configuration. Fixes: `0fe963d6fc` ("vdpa/mlx5: Re-create HW VQs under certain conditions") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Si-Wei Liu<si-wei.liu@oracle.com> Message-Id: <20250604184802.2625300-1-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2025-08-01 09:11:08 -04:00
Si-Wei Liu	a6097e0a54	vdpa/mlx5: Fix oversized null mkey longer than 32bit create_user_mr() has correct code to count the number of null keys used to fill in a hole for the memory map. However, fill_indir() does not follow the same to cap the range up to the 1GB limit correspondingly. Fill in more null keys for the gaps in between, so that null keys are correctly populated. Fixes: `94abbccdf2` ("vdpa/mlx5: Add shared memory registration code") Cc: stable@vger.kernel.org Reported-by: Cong Meng <cong.meng@oracle.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20250220193732.521462-2-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2025-02-25 07:10:45 -05:00
Konstantin Shkolnyy	439252e167	vdpa/mlx5: Fix mlx5_vdpa_get_config() endianness on big-endian machines mlx5_vdpa_dev_add() doesn’t initialize mvdev->actual_features. It’s initialized later by mlx5_vdpa_set_driver_features(). However, mlx5_vdpa_get_config() depends on the VIRTIO_F_VERSION_1 flag in actual_features, to return data with correct endianness. When it’s called before mlx5_vdpa_set_driver_features(), the data are incorrectly returned as big-endian on big-endian machines, while QEMU then interprets them as little-endian. The fix is to initialize this VIRTIO_F_VERSION_1 as early as possible, especially considering that mlx5_vdpa_dev_add() insists on this flag to always be set anyway. Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Message-Id: <20250204173127.166673-1-kshk@linux.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>	2025-02-25 07:10:45 -05:00
Moshe Shemesh	95f68e06b4	net/mlx5: fs, add counter object to flow destination Currently mlx5_flow_destination includes counter_id which is assigned in case we use flow counter on the flow steering rule. However, counter_id is not enough data in case of using HW Steering. Thus, have mlx5_fc object as part of mlx5_flow_destination instead of counter_id and assign it where needed. In case counter_id is received from user space, create a local counter object to represent it. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:34:45 -08:00
Si-Wei Liu	3502596332	vdpa/mlx5: Fix suboptimal range on iotlb iteration The starting iova address to iterate iotlb map entry within a range was set to an irrelevant value when passing to the itree_next() iterator, although luckily it doesn't affect the outcome of finding out the granule of the smallest iotlb map size. Fix the code to make it consistent with the following for-loop. Fixes: `94abbccdf2` ("vdpa/mlx5: Add shared memory registration code") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20241021134040.975221-3-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2024-11-12 18:05:05 -05:00
Si-Wei Liu	29ce8b8a4f	vdpa/mlx5: Fix PA offset with unaligned starting iotlb map When calculating the physical address range based on the iotlb and mr [start,end) ranges, the offset of mr->start relative to map->start is not taken into account. This leads to some incorrect and duplicate mappings. For the case when mr->start < map->start the code is already correct: the range in [mr->start, map->start) was handled by a different iteration. Fixes: `94abbccdf2` ("vdpa/mlx5: Add shared memory registration code") Cc: stable@vger.kernel.org Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20241021134040.975221-2-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2024-11-12 18:05:04 -05:00
Dragos Tatulea	83e445e64f	vdpa/mlx5: Fix error path during device add In the error recovery path of mlx5_vdpa_dev_add(), the cleanup is executed and at the end put_device() is called which ends up calling mlx5_vdpa_free(). This function will execute the same cleanup all over again. Most resources support being cleaned up twice, but the recent mlx5_vdpa_destroy_mr_resources() doesn't. This change drops the explicit cleanup from within the mlx5_vdpa_dev_add() and lets mlx5_vdpa_free() do its work. This issue was discovered while trying to add 2 vdpa devices with the same name: $> vdpa dev add name vdpa-0 mgmtdev auxiliary/mlx5_core.sf.2 $> vdpa dev add name vdpa-0 mgmtdev auxiliary/mlx5_core.sf.3 ... yields the following dump: BUG: kernel NULL pointer dereference, address: 00000000000000b8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP CPU: 4 UID: 0 PID: 2811 Comm: vdpa Not tainted 6.12.0-rc6 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:destroy_workqueue+0xe/0x2a0 Code: ... RSP: 0018:ffff88814920b9a8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff888105c10000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff888100400168 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff888100120c00 R09: ffffffff828578c0 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff888131fd99a0 R14: 0000000000000000 R15: ffff888105c10580 FS: 00007fdfa6b4f740(0000) GS:ffff88852ca00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b8 CR3: 000000018db09006 CR4: 0000000000372eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die+0x20/0x60 ? page_fault_oops+0x150/0x3e0 ? exc_page_fault+0x74/0x130 ? asm_exc_page_fault+0x22/0x30 ? destroy_workqueue+0xe/0x2a0 mlx5_vdpa_destroy_mr_resources+0x2b/0x40 [mlx5_vdpa] mlx5_vdpa_free+0x45/0x150 [mlx5_vdpa] vdpa_release_dev+0x1e/0x50 [vdpa] device_release+0x31/0x90 kobject_put+0x8d/0x230 mlx5_vdpa_dev_add+0x328/0x8b0 [mlx5_vdpa] vdpa_nl_cmd_dev_add_set_doit+0x2b8/0x4c0 [vdpa] genl_family_rcv_msg_doit+0xd0/0x120 genl_rcv_msg+0x180/0x2b0 ? __vdpa_alloc_device+0x1b0/0x1b0 [vdpa] ? genl_family_rcv_msg_dumpit+0xf0/0xf0 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x1fc/0x2d0 netlink_sendmsg+0x1e4/0x410 __sock_sendmsg+0x38/0x60 ? sockfd_lookup_light+0x12/0x60 __sys_sendto+0x105/0x160 ? __count_memcg_events+0x53/0xe0 ? handle_mm_fault+0x100/0x220 ? do_user_addr_fault+0x40d/0x620 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x4c/0x100 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fdfa6c66b57 Code: ... RSP: 002b:00007ffeace22998 EFLAGS: 00000202 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055a498608350 RCX: 00007fdfa6c66b57 RDX: 000000000000006c RSI: 000055a498608350 RDI: 0000000000000003 RBP: 00007ffeace229c0 R08: 00007fdfa6d35200 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000202 R12: 000055a4986082a0 R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffeace233f3 </TASK> Modules linked in: ... CR2: 00000000000000b8 Fixes: `6211165448` ("vdpa/mlx5: Postpone MR deletion") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20241105185101.1323272-2-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2024-11-07 16:51:16 -05:00
Dragos Tatulea	6211165448	vdpa/mlx5: Postpone MR deletion Currently, when a new MR is set up, the old MR is deleted. MR deletion is about 30-40% the time of MR creation. As deleting the old MR is not important for the process of setting up the new MR, this operation can be postponed. This series adds a workqueue that does MR garbage collection at a later point. If the MR lock is taken, the handler will back off and reschedule. The exception during shutdown: then the handler must not postpone the work. Note that this is only a speculative optimization: if there is some mapping operation that is triggered while the garbage collector handler has the lock taken, this operation it will have to wait for the handler to finish. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-9-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Dragos Tatulea	f30a1232b6	vdpa/mlx5: Introduce init/destroy for MR resources There's currently not a lot of action happening during the init/destroy of MR resources. But more will be added in the upcoming patches. As the mr mutex lock init/destroy has been moved to these new functions, the lifetime has now shifted away from mlx5_vdpa_alloc_resources() / mlx5_vdpa_free_resources() into these new functions. However, the lifetime at the outer scope remains the same: mlx5_vdpa_dev_add() / mlx5_vdpa_dev_free() Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-8-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Dragos Tatulea	58d4d50e75	vdpa/mlx5: Rename mr_mtx -> lock Now that the mr resources have their own namespace in the struct, give the lock a clearer name. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240830105838.2666587-7-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:44 -04:00
Dragos Tatulea	5fc8567907	vdpa/mlx5: Extract mr members in own resource struct Group all mapping related resources into their own structure. Upcoming patches will add more members in this new structure. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240830105838.2666587-6-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	0b916a9c45	vdpa/mlx5: Rename function A followup patch will use this name for something else. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-5-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	e1ba5c947e	vdpa/mlx5: Delete direct MKEYs in parallel Use the async interface to issue MTT MKEY deletion. This makes destroy_user_mr() on average 8x times faster. This number is also dependent on the size of the MR being deleted. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240830105838.2666587-4-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	0071b138d4	vdpa/mlx5: Create direct MKEYs in parallel Use the async interface to issue MTT MKEY creation. Extra care is taken at the allocation of FW input commands due to the MTT tables having variable sizes depending on MR. The indirect MKEY is still created synchronously at the end as the direct MKEYs need to be filled in. This makes create_user_mr() 3-5x faster, depending on the size of the MR. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240830105838.2666587-3-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	9dba41951a	vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command change_num_qps() is still suspending/resuming VQs one by one. This change switches to parallel suspend/resume. When increasing the number of queues the flow has changed a bit for simplicity: the setup_vq() function will always be called before resume_vqs(). If the VQ is initialized, setup_vq() will exit early. If the VQ is not initialized, setup_vq() will create it and resume_vqs() will resume it. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-11-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	74c89072f2	vdpa/mlx5: Small improvement for change_num_qps() change_num_qps() has a lot of multiplications by 2 to convert the number of VQ pairs to number of VQs. This patch simplifies the code by doing the VQP -> VQ count conversion at the beginning in a variable. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-10-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:43 -04:00
Dragos Tatulea	55a7cb05b0	vdpa/mlx5: Keep notifiers during suspend but ignore Unregistering notifiers is a costly operation. Instead of removing the notifiers during device suspend and adding them back at resume, simply ignore the call when the device is suspended. At resume time call queue_link_work() to make sure that the device state is propagated in case there were changes. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~13 ms to ~2.5 ms. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-9-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	5eb8c7eb1e	vdpa/mlx5: Parallelize device resume Currently device resume works on vqs serially. Building up on previous changes that converted vq operations to the async api, this patch parallelizes the device resume. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device resume time is reduced from ~16 ms to ~4.5 ms. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-8-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	dcf3eac01f	vdpa/mlx5: Parallelize device suspend Currently device suspend works on vqs serially. Building up on previous changes that converted vq operations to the async api, this patch parallelizes the device suspend: 1) Suspend all active vqs parallel. 2) Query suspended vqs in parallel. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~37 ms to ~13 ms. A later patch will remove the link unregister operation which will make it even faster. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-7-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	61674c154b	vdpa/mlx5: Use async API for vq modify commands Switch firmware vq modify command to be issued via the async API to allow future parallelization. The new refactored function applies the modify on a range of vqs and waits for their execution to complete. For now the command is still used in a serial fashion. A later patch will switch to modifying multiple vqs in parallel. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-6-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	1fcdf43ea6	vdpa/mlx5: Use async API for vq query command Switch firmware vq query command to be issued via the async API to allow future parallelization. For now the command is still serial but the infrastructure is there to issue commands in parallel, including ratelimiting the number of issued async commands to firmware. A later patch will switch to issuing more commands at a time. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-5-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	d89d58f488	vdpa/mlx5: Introduce async fw command wrapper Introduce a new function mlx5_vdpa_exec_async_cmds() which wraps the mlx5_core async firmware command API in a way that will be used to parallelize certain operation in this driver. The wrapper deals with the case when mlx5_cmd_exec_cb() returns EBUSY due to the command being throttled. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Message-Id: <20240816090159.1967650-4-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Dragos Tatulea	de2cd39fc1	vdpa/mlx5: Introduce error logging function mlx5_vdpa_err() was missing. This patch adds it and uses it in the necessary places. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20240816090159.1967650-3-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>	2024-09-25 07:07:42 -04:00
Cindy Lu	6d17035a74	vdpa/mlx5: Add the support of set mac address Add the function to support setting the MAC address. For vdpa/mlx5, the function will use mlx5_mpfs_add_mac to set the mac address Tested in ConnectX-6 Dx device Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20240731031653.1047692-4-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2024-09-10 02:51:48 -04:00
Dragos Tatulea	dc12502905	vdpa/mlx5: Fix invalid mr resource destroy Certain error paths from mlx5_vdpa_dev_add() can end up releasing mr resources which never got initialized in the first place. This patch adds the missing check in mlx5_vdpa_destroy_mr_resources() to block releasing non-initialized mr resources. Reference trace: mlx5_core 0000:08:00.2: mlx5_vdpa_dev_add:3274:(pid 2700) warning: No mac address provisioned? BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 140216067 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 8 PID: 2700 Comm: vdpa Kdump: loaded Not tainted 5.14.0-496.el9.x86_64 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:vhost_iotlb_del_range+0xf/0xe0 [vhost_iotlb] Code: [...] RSP: 0018:ff1c823ac23077f0 EFLAGS: 00010246 RAX: ffffffffc1a21a60 RBX: ffffffff899567a0 RCX: 0000000000000000 RDX: ffffffffffffffff RSI: 0000000000000000 RDI: 0000000000000000 RBP: ff1bda1f7c21e800 R08: 0000000000000000 R09: ff1c823ac2307670 R10: ff1c823ac2307668 R11: ffffffff8a9e7b68 R12: 0000000000000000 R13: 0000000000000000 R14: ff1bda1f43e341a0 R15: 00000000ffffffea FS: 00007f56eba7c740(0000) GS:ff1bda269f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000104d90001 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? mlx5_vdpa_free+0x3d/0x150 [mlx5_vdpa] ? __die_body.cold+0x8/0xd ? page_fault_oops+0x134/0x170 ? __irq_work_queue_local+0x2b/0xc0 ? irq_work_queue+0x2c/0x50 ? exc_page_fault+0x62/0x150 ? asm_exc_page_fault+0x22/0x30 ? __pfx_mlx5_vdpa_free+0x10/0x10 [mlx5_vdpa] ? vhost_iotlb_del_range+0xf/0xe0 [vhost_iotlb] mlx5_vdpa_free+0x3d/0x150 [mlx5_vdpa] vdpa_release_dev+0x1e/0x50 [vdpa] device_release+0x31/0x90 kobject_cleanup+0x37/0x130 mlx5_vdpa_dev_add+0x2d2/0x7a0 [mlx5_vdpa] vdpa_nl_cmd_dev_add_set_doit+0x277/0x4c0 [vdpa] genl_family_rcv_msg_doit+0xd9/0x130 genl_family_rcv_msg+0x14d/0x220 ? __pfx_vdpa_nl_cmd_dev_add_set_doit+0x10/0x10 [vdpa] ? _copy_to_user+0x1a/0x30 ? move_addr_to_user+0x4b/0xe0 genl_rcv_msg+0x47/0xa0 ? __import_iovec+0x46/0x150 ? __pfx_genl_rcv_msg+0x10/0x10 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x245/0x370 netlink_sendmsg+0x206/0x440 __sys_sendto+0x1dc/0x1f0 ? do_read_fault+0x10c/0x1d0 ? do_pte_missing+0x10d/0x190 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x5c/0xf0 ? __count_memcg_events+0x4f/0xb0 ? mm_account_fault+0x6c/0x100 ? handle_mm_fault+0x116/0x270 ? do_user_addr_fault+0x1d6/0x6a0 ? do_syscall_64+0x6b/0xf0 ? clear_bhb_loop+0x25/0x80 ? clear_bhb_loop+0x25/0x80 ? clear_bhb_loop+0x25/0x80 ? clear_bhb_loop+0x25/0x80 ? clear_bhb_loop+0x25/0x80 entry_SYSCALL_64_after_hwframe+0x78/0x80 Fixes: `512c0cdd80` ("vdpa/mlx5: Decouple cvq iotlb handling from hw mapping code") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240827160808.2448017-2-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>	2024-09-10 02:51:47 -04:00
Dragos Tatulea	8e0751af1b	vdpa/mlx5: Don't enable non-active VQs in .set_vq_ready() VQ indices in the range [cur_num_qps, max_vqs) represent queues that have not yet been activated. .set_vq_ready should not activate these VQs. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-24-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:51 -04:00
Dragos Tatulea	2638134f71	vdpa/mlx5: Don't reset VQs more than necessary The vdpa device can be reset many times in sequence without any significant state changes in between. Previously this was not a problem: VQs were torn down only on first reset. But after VQ pre-creation was introduced, each reset will delete and re-create the hardware VQs and their associated resources. To solve this problem, avoid resetting hardware VQs if the VQs are still in a blank state. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-23-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:50 -04:00
Dragos Tatulea	0fe963d6fc	vdpa/mlx5: Re-create HW VQs under certain conditions There are a few conditions under which the hardware VQs need a full teardown and setup: - VQ size changed to something else than default value. Hardware VQ size modification is not supported. - User turns off certain device features: mergeable buffers, checksum virtio 1.0 compliance. In these cases, the TIR and RQT need to be re-created. Add a needs_teardown configuration variable and set it when detecting the above scenarios. On next DRIVER_OK, the resources will be torn down first. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-22-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:50 -04:00
Dragos Tatulea	ffb1aae43e	vdpa/mlx5: Pre-create hardware VQs at vdpa .dev_add time Currently, hardware VQs are created right when the vdpa device gets into DRIVER_OK state. That is easier because most of the VQ state is known by then. This patch switches to creating all VQs and their associated resources at device creation time. The motivation is to reduce the vdpa device live migration downtime by moving the expensive operation of creating all the hardware VQs and their associated resources out of downtime on the destination VM. The VQs are now created in a blank state. The VQ configuration will happen later, on DRIVER_OK. Then the configuration will be applied when the VQs are moved to the Ready state. When .set_vq_ready() is called on a VQ before DRIVER_OK, special care is needed: now that the VQ is already created a resume_vq() will be triggered too early when no mr has been configured yet. Skip calling resume_vq() in this case, let it be handled during DRIVER_OK. For virtio-vdpa, the device configuration is done earlier during .vdpa_dev_add() by vdpa_register_device(). Avoid calling setup_vq_resources() a second time in that case. On a 64 CPU, 256 GB VM with 1 vDPA device of 16 VQps, the full VQ resource creation + resume time was ~370ms. Now it's down to 60 ms (only VQ config and resume). The measurements were done on a ConnectX6DX based vDPA device. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-21-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:49 -04:00
Dragos Tatulea	3b3adb3bbf	vdpa/mlx5: Use suspend/resume during VQP change Resume a VQ if it is already created when the number of VQ pairs increases. This is done in preparation for VQ pre-creation which is coming in a later patch. It is necessary because calling setup_vq() on an already created VQ will return early and will not enable the queue. For symmetry, suspend a VQ instead of tearing it down when the number of VQ pairs decreases. But only if the resume operation is supported. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-20-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:49 -04:00
Dragos Tatulea	ac85cd904d	vdpa/mlx5: Forward error in suspend/resume device Start using the suspend/resume_vq() error return codes previously added. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-19-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>	2024-07-09 08:42:49 -04:00
Dragos Tatulea	843250271b	vdpa/mlx5: Consolidate all VQ modify to Ready to use resume_vq() There are a few more places modifying the VQ to Ready directly. Let's consolidate them into resume_vq(). The redundant warnings for resume_vq() errors can also be dropped. There is one special case that needs to be handled for virtio-vdpa: the initialized flag must be set to true earlier in setup_vq() so that resume_vq() doesn't return early. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-18-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:48 -04:00
Dragos Tatulea	b89bb349f2	vdpa/mlx5: Add error code for suspend/resume VQ Instead of blindly calling suspend/resume_vqs(), make then return error codes. To keep compatibility, keep suspending or resuming VQs on error and return the last error code. The assumption here is that the error code would be the same. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-17-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:48 -04:00
Dragos Tatulea	fc9af25d04	vdpa/mlx5: Accept Init -> Ready VQ transition in resume_vq() Until now resume_vq() was used only for the suspend/resume scenario. This change also allows calling resume_vq() to bring it from Init to Ready state (VQ initialization). Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-16-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2024-07-09 08:42:48 -04:00
Dragos Tatulea	e60e9eeb36	vdpa/mlx5: Allow creation of blank VQs Based on the filled flag, create VQs that are filled or blank. Blank VQs will be filled in later through VQ modify. Downstream patches will make use of this to pre-create blank VQs at vdpa device creation. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-15-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2024-07-09 08:42:47 -04:00
Dragos Tatulea	ebebaf45e8	vdpa/mlx5: Set mkey modified flags on all VQs Otherwise, when virtqueues are moved from INIT to READY the latest mkey will not be set appropriately. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-14-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:47 -04:00
Dragos Tatulea	1e8dac7bb6	vdpa/mlx5: Start off rqt_size with max VQPs Currently rqt_size is initialized during device flag configuration. That's because it is the earliest moment when device knows if MQ (multi queue) is on or off. Shift this configuration earlier to device creation time. This implies that non-MQ devices will have a larger RQT size. But the configuration will still be correct. This is done in preparation for the pre-creation of hardware virtqueues at device add time. When that change will be added, RQT will be created at device creation time so it needs to be initialized to its max size. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-13-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:46 -04:00
Dragos Tatulea	ad9758fdaf	vdpa/mlx5: Set an initial size on the VQ The virtqueue size is a pre-requisite for setting up any virtqueue resources. For the upcoming optimization of creating virtqueues at device add, the virtqueue size has to be configured. The queue size check in setup_vq() will always be false. So remove it. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-12-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:46 -04:00
Dragos Tatulea	cdc3c7eaae	vdpa/mlx5: Add support for modifying the VQ features field This is done in preparation for the pre-creation of hardware virtqueues at device add time. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-11-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:46 -04:00
Dragos Tatulea	f70080c5bc	vdpa/mlx5: Add support for modifying the virtio_version VQ field This is done in preparation for the pre-creation of hardware virtqueues at device add time. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-10-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:45 -04:00
Dragos Tatulea	4a19f2942a	vdpa/mlx5: Rename init_mvqs Function is used to set default values, so name it accordingly. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-9-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com>	2024-07-09 08:42:45 -04:00
Dragos Tatulea	e5bcbd1de6	vdpa/mlx5: Clear and reinitialize software VQ data on reset The hardware VQ configuration is mirrored by data in struct mlx5_vdpa_virtqueue . Instead of clearing just a few fields at reset, fully clear the struct and initialize with the appropriate default values. As clear_vqs_ready() is used only during reset, get rid of it. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-8-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:45 -04:00
Dragos Tatulea	1835ed4a5d	vdpa/mlx5: Initialize and reset device with one queue pair The virtio spec says that a vdpa device should start off with one queue pair. The driver is already compliant. This patch moves the initialization to device add and reset times. This is done in preparation for the pre-creation of hardware virtqueues at device add time. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-7-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2024-07-09 08:42:44 -04:00
Dragos Tatulea	a366465b48	vdpa/mlx5: Remove duplicate suspend code Use the dedicated suspend_vqs() function instead. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-6-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:44 -04:00
Dragos Tatulea	34bd86c720	vdpa/mlx5: Iterate over active VQs during suspend/resume No need to iterate over max number of VQs. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-5-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:44 -04:00
Dragos Tatulea	ad80739262	vdpa/mlx5: Drop redundant check in teardown_virtqueues() The check is done inside teardown_vq(). Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20240626-stage-vdpa-vq-precreate-v2-4-560c491078df@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 08:42:43 -04:00

1 2 3 4 5

214 Commits