linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-06-08 00:29:33 -04:00

Author	SHA1	Message	Date
Michal Wajdeczko	430d328877	drm/xe: Update MEMIRQ to use tile-based printk macros We already have tile-based printk macros, there is no need to manually prepare MEMIRQ specific messages to include tile id. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251005133641.2651-5-michal.wajdeczko@intel.com	2025-10-06 19:39:26 +02:00
Michal Wajdeczko	cd11babcd0	drm/xe/pf: Update LMTT to use tile-based messages Since now we have tile-based SR-IOV printk macros, there is no need to manually prepare the LMTT specific warning message (that is now upgraded to proper error level message) nor to use generic debug message without tile/LMTT identification. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251005133641.2651-4-michal.wajdeczko@intel.com	2025-10-06 19:39:25 +02:00
Michal Wajdeczko	c66e4b6cae	drm/xe: Add tile-based SRIOV printk macros We already have device and GT level SR-IOV specific macros, but unlike native case, we don't have yet tile-based ones. Add macros to match native use case and also update GT-based macros to rely on those new tile-based SR-IOV macros. This will slightly rearrange the output of the GT logs and instead: [...] Tile0: GT0: PF: pushed VF1 config with 2 KLVs... we might see: [...] PF: Tile0: GT0: pushed VF1 config with 2 KLVs... but that's even better. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251005133641.2651-3-michal.wajdeczko@intel.com	2025-10-06 19:39:23 +02:00
Michal Wajdeczko	c95f180207	drm/xe: Update SRIOV printk macros Recently we introduced xe-based printk macros, use them instead of plain drm-based ones. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251005133641.2651-2-michal.wajdeczko@intel.com	2025-10-06 19:39:22 +02:00
Michal Wajdeczko	9a54b5127f	drm/xe/pf: Make the late-initialization really late While the late PF per-GT initialization is done quite late in the single GT initialization flow, in case of multi-GT platforms, it may still be done before other GT early initialization. That leads to some issues during unwind, when there are cross-GT dependencies, like resource cleanup that is shared by both GTs, but the other GT may already be sanitized or disabled. The following errors could be observed when trying to unload the PF driver with some LMEM/VRAM already provisioned for few VFs: [ ] xe 0000:03:00.0: DEVRES REL ffff88814708f240 fini_config (16 bytes) [ ] xe 0000:03:00.0: [drm:lmtt_write_pte [xe]] PF: LMTT: WRITE level=2 index=1 pte=0x0 [ ] xe 0000:03:00.0: [drm:lmtt_invalidate_hw [xe]] PF: LMTT: num_fences=2 err=-19 [ ] xe 0000:03:00.0: [drm:lmtt_pt_free [xe]] PF: LMTT: level=0 addr=53a470000 [ ] xe 0000:03:00.0: [drm:lmtt_pt_free [xe]] PF: LMTT: level=1 addr=53a4b0000 [ ] xe 0000:03:00.0: [drm:lmtt_invalidate_hw [xe]] PF: LMTT: num_fences=2 err=-19 [ ] xe 0000:03:00.0: [drm] PF: LMTT0 invalidation failed (-ENODEV) [ ] xe 0000:03:00.0: [drm:lmtt_write_pte [xe]] PF: LMTT: WRITE level=2 index=2 pte=0x0 [ ] xe 0000:03:00.0: [drm:lmtt_invalidate_hw [xe]] PF: LMTT: num_fences=2 err=-19 [ ] xe 0000:03:00.0: [drm:lmtt_pt_free [xe]] PF: LMTT: level=0 addr=539b70000 [ ] xe 0000:03:00.0: [drm:lmtt_pt_free [xe]] PF: LMTT: level=1 addr=539bf0000 [ ] xe 0000:03:00.0: [drm:lmtt_invalidate_hw [xe]] PF: LMTT: num_fences=2 err=-19 [ ] xe 0000:03:00.0: [drm] PF: LMTT0 invalidation failed (-ENODEV) Move all PF per-GT late initialization to the already defined late SR-IOV initialization function to allow proper order of the cleanup actions. While around, format all PF function stubs as one-liners, like many other stubs are defined in the Xe driver. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251004162008.1782-1-michal.wajdeczko@intel.com	2025-10-06 19:30:17 +02:00
Michal Wajdeczko	71f1939e0d	drm/xe/xe_late_bind_fw: Fix and simplify parsing user input Code was wrongly passing sizeof(uval) as the number base to use, and unlike other debugfs entries that represent bool data, it wasn't using the dedicated function to parse user input as bool. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20251002192736.203186-1-michal.wajdeczko@intel.com	2025-10-06 19:24:15 +02:00
Michal Wajdeczko	869580c415	drm/xe: Don't force DRM_XE_DEBUG_MEMIRQ for SR-IOV debug For pure SR-IOV debugging there is no need to select already separated config for the debugging of the memory based interrupts, as the latter is also very noisy on its own. Change config order and use a weak reverse dependency instead. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20251002171308.203127-1-michal.wajdeczko@intel.com	2025-10-06 19:11:30 +02:00
Shuicheng Lin	a908de69ce	drm/xe: Fix copyright and function naming in xe_ttm_vram_mgr - Correct copyright year from "2002" to "2022". - Rename ttm_vram_mgr_fini() to xe_ttm_vram_mgr_fini() to avoid confusion with generic TTM helpers. No functional changes intended. Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://lore.kernel.org/r/20251004000425.2489291-2-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-10-06 12:46:28 -04:00
Piotr Piórkowski	8462d16d1b	drm/xe: Combine userspace context check Both vm->xef and XE_LRC_CREATE_USER_CTX indicate in xe_lrc_init that the context originates from userspace. However, XE_LRC_CREATE_USER_CTX has a broader scope as it may be set even when no vm->xef is present. The XE_BO_FLAG_PINNED_LATE_RESTORE flag can be extended to both cases, so there is no point in handling the two cases separately. Let's combine vm->xef and XE_LRC_CREATE_USER_CTX checks to detect userspace context. Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Suggested-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251003162619.1984236-6-piotr.piorkowski@intel.com	2025-10-06 08:33:52 +02:00
Piotr Piórkowski	b48140f446	drm/xe/pf: Force use user VRAM for LMEM provisioning The LMEM assigned to VFs should be allocated from the general-purpose VRAM pool, not from the kernel-reserved region. Let's force the use of general-purpose VRAM for BOs intended for VFs. Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251003162619.1984236-5-piotr.piorkowski@intel.com	2025-10-06 08:33:51 +02:00
Piotr Piórkowski	3f6cd669d5	drm/xe: Force user context allocations in user VRAM In general, kernel structures should be allocated in the kernel-dedicated VRAM region. However, userspace context data - while used by the kernel - does not need to reside there. Let's force the allocation of such data in the general-purpose VRAM region accessible to userspace. Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251003162619.1984236-4-piotr.piorkowski@intel.com	2025-10-06 08:33:49 +02:00
Piotr Piórkowski	9d290ab0b5	drm/xe: Introduce new BO flag XE_BO_FLAG_FORCE_USER_VRAM When using a separate VRAM region for kernel allocations, some kernel structures, such as context userspace data, should not reside in the VRAM region dedicated to the kernel. The VRAM kernel region is intended only for allocations necessary for driver operation. Allocations created via ioctl are long-lived and not easily evictable. If this region runs out of space, there may not be a fallback, which could cause failures. To prevent this, add a new BO flag that explicitly forces the BO to be allocated in the general-purpose VRAM region accessible to userspace, avoiding the kernel-only VRAM region. v2: - update commit message (Matthew) Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251003162619.1984236-3-piotr.piorkowski@intel.com	2025-10-06 08:33:48 +02:00
Piotr Piórkowski	db7dde9904	drm/xe: Add initial support for separate kernel VRAM region on the tile So far, kernel and userspace allocations have shared the same VRAM region. However, in some scenarios, it may be necessary to reserve a separate VRAM area exclusively for kernel allocations. Let's add preliminary support for such a configuration. v2: - replaced for_each_bo_flag_vram with the improved for_each_set_bo_vram_flag helper (Matthew) - moved the VRAM flag iteration macro definition into xe_bo.c (Matthew) - drop unused bo_flgas from bo_vram_flags_to_vram_placement (Matthew) - use hweight32 helper in __xe_bo_fixed_placement for readability (Matthew) v3: remove unnecessary VRAM fixup id Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20251003162619.1984236-2-piotr.piorkowski@intel.com	2025-10-06 08:33:46 +02:00
Matthew Brost	bdc2fb17ae	Revert "drm/xe/vf: Fixup CTB send buffer messages after migration" This reverts commit `cef88d1265`. Due to change in the VF migration recovery design this code is not needed any more. v3: - Add commit message (Michal / Lucas) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251002233824.203417-4-michal.wajdeczko@intel.com	2025-10-03 20:36:26 -07:00
Matthew Brost	6c640592e8	Revert "drm/xe/vf: Post migration, repopulate ring area for pending request" This reverts commit `a0dda25d24`. Due to change in the VF migration recovery design this code is not needed any more. v3: - Add commit message (Michal / Lucas) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251002233824.203417-3-michal.wajdeczko@intel.com	2025-10-03 20:36:24 -07:00
Matthew Brost	08c98f3f2b	Revert "drm/xe/vf: Rebase exec queue parallel commands during migration recovery" This reverts commit `ba180a3621`. Due to change in the VF migration recovery design this code is not needed any more. v3: - Add commit message (Michal / Lucas) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251002233824.203417-2-michal.wajdeczko@intel.com	2025-10-03 20:36:23 -07:00
Michal Wajdeczko	2a8fcf7cc9	drm/xe/pf: Synchronize VF FLR between all GTs The PF part of the VF FLR processing shall be done after all GuCs confirm that they finished their part of the VF FLR processing, otherwise the PF may start clearing the VF's GGTT while another GuC may still be accessing it. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250930233525.201263-7-michal.wajdeczko@intel.com	2025-10-02 23:58:35 +02:00
Michal Wajdeczko	03dc00c782	drm/xe/pf: Split VF FLR processing function On multi-GT platforms (like PTL) we may want to run VF FLR on each GuC (render and media) in parallel. Split our FLR function to allow to wait for GT VF FLR completion separately. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250930233525.201263-6-michal.wajdeczko@intel.com	2025-10-02 23:58:33 +02:00
Michal Wajdeczko	1f018c8496	drm/xe/pf: Unify VF state tracking log By using single function that dumps VF state transition, final logs are easier to analyze as there is always the same call site in every debug message. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250930233525.201263-5-michal.wajdeczko@intel.com	2025-10-02 23:58:32 +02:00
Michal Wajdeczko	5b7451fdd7	drm/xe/pf: Expose VF control operations over debugfs To allow the user to control the activity of individual VFs, expose basic VF control operations (pause, resume, stop, reset) over the debugfs as write-only files: /sys/kernel/debug/dri/BDF/sriov/ ├── vf1 │ ├── pause │ ├── reset │ ├── resume │ ├── stop │ : ├── vf2 : : Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250930233525.201263-4-michal.wajdeczko@intel.com	2025-10-02 23:58:31 +02:00
Michal Wajdeczko	ac43294e8e	drm/xe/pf: Log only top level VF state changes The user likely only cares about top level VF state changes, so any VF state logs on the per-GT basis can be demoted to the debug level. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250930233525.201263-3-michal.wajdeczko@intel.com	2025-10-02 23:58:30 +02:00
Michal Wajdeczko	c97cdf7686	drm/xe/pf: Add top level functions to control VFs We already have control functions that we use to control the VF state on the per-GT basis, but that is low level detail from the user point of view, who rather expects VF-level functions. For now add simple functions that just iterate over all GTs and call per-GT control function. We will soon allow to use some of them from the user facing interfaces like debugfs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20250930233525.201263-2-michal.wajdeczko@intel.com	2025-10-02 23:58:28 +02:00
Michal Wajdeczko	846a81abbe	drm/xe: Detect GT workqueue allocation failure The allocation of the per-GT workqueue may fail and we shouldn't ignore that. While around use drm managed allocation function to drop our custom fini action. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20251001144051.202040-1-michal.wajdeczko@intel.com	2025-10-02 18:48:10 +02:00
Niranjana Vishwanathapura	b56bc81078	drm/xe/doc: Add documentation for Execution Queues Add documentation for Xe Execution Queues and add xe_exec_queue.rst file. v2: Add info about how Execution queue interfaces with other components in the driver (Matt Brost) Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251002044319.450181-2-niranjana.vishwanathapura@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-10-02 08:43:07 -07:00
Raag Jadav	e4863f1159	drm/xe/i2c: Don't rely on d3cold.allowed flag in system PM path In S3 and above sleep states, the device can lose power regardless of d3cold.allowed flag. Bring up I2C controller explicitly in system PM path to ensure its normal operation after losing power. v2: Cover S3 and above states (Rodrigo) Fixes: `0ea07b6951` ("drm/xe/pm: Wire up suspend/resume for I2C controller") Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250918103200.2952576-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-10-02 10:27:44 -04:00
Mallesh Koujalagi	07abc16c14	drm/xe/xe_late_bind_fw: Initialize uval variable in xe_late_bind_fw_num_fans() Initialize the uval variable to 0 in xe_late_bind_fw_num_fans() to fix a potential use of uninitialized variable warning and ensure predictable behavior. The variable is passed by reference to xe_pcode_read() which should populate it on success, but initializing it to 0 provides a safe default value and follows kernel coding best practices. v2: - uval = 0 which serves as both a safe default and the fallback value when the pcode read operation fails. v3: - Handle MMIO failure (Rodrigo) - The function should probably return the error and make the uval as pointer-argument, like the pcode_read. - Change the caller of this function to propagate the error upwards if mmio failed. Fixes: `45832bf9c1` ("drm/xe/xe_late_bind_fw: Initialize late binding firmware") Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com> Link: https://lore.kernel.org/r/20251002005648.3185636-1-mallesh.koujalagi@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-10-02 10:11:07 -04:00
Thomas Hellström	ad298d9ec9	drm/gpusvm, drm/xe: Fix userptr to not allow device private pages When userptr is used on SVM-enabled VMs, a non-NULL hmm_range::dev_private_owner value might mean that hmm_range_fault() attempts to return device private pages. Either that will fail, or the userptr code will not know how to handle those. Use NULL for hmm_range::dev_private_owner to migrate such pages to system. In order to do that, move the struct drm_gpusvm::device_private_page_owner field to struct drm_gpusvm_ctx::device_private_page_owner so that it doesn't remain immutable over the drm_gpusvm lifetime. v2: - Don't conditionally compile xe_svm_devm_owner(). - Kerneldoc xe_svm_devm_owner(). Fixes: `9e97874148` ("drm/xe/userptr: replace xe_hmm with gpusvm") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/20250930122752.96034-1-thomas.hellstrom@linux.intel.com	2025-10-02 11:50:12 +02:00
Raag Jadav	d9c401d8f3	drm/xe/sysfs: Drop redundant runtime PM usage The device is expected to be in D0 state during driver probe. No need to resume it in ->is_visible() callbacks or non I/O operations. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250918114804.2957177-3-raag.jadav@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-30 07:52:45 -07:00
Raag Jadav	5a856e277b	drm/xe/hwmon: Drop redundant runtime PM usage The device is expected to be in D0 state during driver probe. No need to resume it in ->is_visible() callbacks. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250918114804.2957177-2-raag.jadav@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-30 07:52:39 -07:00
Colin Ian King	20f3b28e2e	drm/xe/xe_late_bind_fw: Fix missing initialization of variable offset The variable offset is not being initialized, and it is only set inside a for-loop if entry->name is the same as manifest_entry. In the case where it is not initialized a non-zero check on offset is potentially checking a bogus uninitialized value. Fix this by initializing offset to zero. Fixes: `efa29317a5` ("drm/xe/xe_late_bind_fw: Extract and print version info") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250924102208.9216-1-colin.i.king@gmail.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-30 10:31:44 -04:00
Thomas Hellström	8f1756a7ea	drm/xe/bo: Fix an idle assertion for local bos Before calling ttm_bo_populate() in the CPU fault path of a bo, we assert that the bo is not being migrated. However, for local bos we share the reservation object with other local bos that might be in the process of being migrated. Also some VM operations may attach USAGE_KERNEL fences to the common reservation object and trigger false positives from the assert. So remove the assert and instead wait for bo idle. This may unnecessarily wait for idle in some cases but since we're doing this wait later in the fault path anyway we might as well do it here as well. This fixes warnings like: Sep 25 14:56:23 desky kernel: ------------[ cut here ]------------ Sep 25 14:56:23 desky kernel: xe 0000:03:00.0: [drm] Assertion `dma_resv_test_signaled(tbo->base.resv, DMA_RESV_USAGE_KERNEL) || (tbo->ttm && ttm_tt_is_populated(tbo->ttm))` failed! platform: BATTLEMAGE subplatform: 1 graphics: Xe2_HPG 20.01 step A0 media: Xe2_HPM 13.01 step A1 Sep 25 14:56:23 desky kernel: WARNING: CPU: 6 PID: 24767 at drivers/gpu/drm/xe/xe_bo.c:1748 xe_bo_fault_migrate+0x1bb/0x300 [xe] Sep 25 14:56:23 desky kernel: Modules linked in: cpuid dm_crypt xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc xfrm_user xfr> Sep 25 14:56:23 desky kernel: snd_soc_sdca snd_seq_midi prime_numbers coretemp snd_seq_midi_event drm_ttm_helper snd_hda_codec drm_buddy drm_exec snd_rawmidi snd_soc_core snd_hda_cor> Sep 25 14:56:23 desky kernel: CPU: 6 UID: 1000 PID: 24767 Comm: steamwebhelper Tainted: G U W 6.17.0-rc7+ #32 PREEMPT(voluntary) Sep 25 14:56:23 desky kernel: Tainted: [U]=USER, [W]=WARN Sep 25 14:56:23 desky kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D36/PRO Z690-P DDR4 (MS-7D36), BIOS A.A1 10/18/2022 Sep 25 14:56:23 desky kernel: RIP: 0010:xe_bo_fault_migrate+0x1bb/0x300 [xe] Sep 25 14:56:23 desky kernel: Code: fa 64 29 f9 48 c7 c7 40 e0 d3 c1 51 48 c7 c1 c0 e3 d3 c1 52 4c 8b 45 c0 41 50 44 8b 4d c8 4d 89 e0 48 8b 55 a8 e8 25 27 95 ef <0f> 0b 48 83 c4 40 4> Sep 25 14:56:23 desky kernel: RSP: 0000:ffffae1ca88c7b10 EFLAGS: 00010286 Sep 25 14:56:23 desky kernel: RAX: 0000000000000000 RBX: ffff8d7cfd7e6800 RCX: 0000000000000027 Sep 25 14:56:23 desky kernel: RDX: ffff8d845019cec8 RSI: 0000000000000001 RDI: ffff8d845019cec0 Sep 25 14:56:23 desky kernel: RBP: ffffae1ca88c7bc8 R08: 0000000000000000 R09: 0000000000000000 Sep 25 14:56:23 desky kernel: R10: 0000000000000000 R11: 0000000000000004 R12: ffffffffc1db1faa Sep 25 14:56:23 desky kernel: R13: ffffffffc1db2ab4 R14: 0000000000000001 R15: ffffae1ca88c7bd8 Sep 25 14:56:23 desky kernel: FS: 00007fb1baf31940(0000) GS:ffff8d849c870000(0000) knlGS:0000000000000000 Sep 25 14:56:23 desky kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 25 14:56:23 desky kernel: CR2: 00007fb1b2860020 CR3: 00000001705a9004 CR4: 0000000000772ef0 Sep 25 14:56:23 desky kernel: PKRU: 55555558 Sep 25 14:56:23 desky kernel: Call Trace: Sep 25 14:56:23 desky kernel: <TASK> Sep 25 14:56:23 desky kernel: xe_bo_cpu_fault_fastpath+0x11e/0x220 [xe] Sep 25 14:56:23 desky kernel: xe_bo_cpu_fault+0x84/0x410 [xe] Sep 25 14:56:23 desky kernel: ? __x64_sys_mmap+0x33/0x50 Sep 25 14:56:23 desky kernel: ? x64_sys_call+0x1b2e/0x20d0 Sep 25 14:56:23 desky kernel: ? do_syscall_64+0x9d/0x1f0 Sep 25 14:56:23 desky kernel: ? __check_object_size+0x4a/0x2e0 Sep 25 14:56:23 desky kernel: __do_fault+0x36/0x190 Sep 25 14:56:23 desky kernel: do_fault+0xcf/0x570 Sep 25 14:56:23 desky kernel: __handle_mm_fault+0x92b/0xfe0 Sep 25 14:56:23 desky kernel: ? ktime_get_mono_fast_ns+0x39/0xd0 Sep 25 14:56:23 desky kernel: handle_mm_fault+0x164/0x2c0 Sep 25 14:56:23 desky kernel: do_user_addr_fault+0x2cb/0x840 Sep 25 14:56:23 desky kernel: exc_page_fault+0x75/0x180 Sep 25 14:56:23 desky kernel: asm_exc_page_fault+0x27/0x30 Sep 25 14:56:23 desky kernel: RIP: 0033:0x7fb1bc388bb7 Sep 25 14:56:23 desky kernel: Code: 48 ff c7 48 01 fe 48 8d 54 11 80 0f 1f 84 00 00 00 00 00 c5 fe 6f 0e c5 fe 6f 56 20 c5 fe 6f 5e 40 c5 fe 6f 66 60 48 83 ee 80 <c5> fd 7f 0f c5 fd 7> Sep 25 14:56:23 desky kernel: RSP: 002b:00007ffd7814fad8 EFLAGS: 00010207 Sep 25 14:56:23 desky kernel: RAX: 00007fb1b2860000 RBX: 0000000000000690 RCX: 00007fb1b2860000 Sep 25 14:56:23 desky kernel: RDX: 00007fb1b2860610 RSI: 0000556eda79f4c0 RDI: 00007fb1b2860020 Sep 25 14:56:23 desky kernel: RBP: 00007ffd7814fb60 R08: 0000000000000000 R09: 000000012be0e000 Sep 25 14:56:23 desky kernel: R10: 00007fb1b2860000 R11: 0000000000000246 R12: 0000556edd39a240 Sep 25 14:56:23 desky kernel: R13: 00007fb1b2dcb010 R14: 0000556eda79f420 R15: 0000000000000000 Sep 25 14:56:23 desky kernel: </TASK> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5250 Fixes: `c2ae94cf8c` ("drm/xe: Convert the CPU fault handler for exhaustive eviction") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250929112649.6131-1-thomas.hellstrom@linux.intel.com	2025-09-30 10:33:51 +02:00
Michal Wajdeczko	65774efef2	drm/xe/debugfs: Update xe_pat_dump signature Our debugfs helper xe_gt_debugfs_show_with_rpm() expects print() functions to return int. New signature allows us to drop wrapper. While around, move kernel-doc closer to the function definition, as suggested in the doc-guide. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250923211613.193347-6-michal.wajdeczko@intel.com	2025-09-30 10:21:28 +02:00
Michal Wajdeczko	ab6ccd4f7e	drm/xe/debugfs: Update xe_mocs_dump signature Our debugfs helper xe_gt_debugfs_show_with_rpm() expects print() functions to return int. New signature allows us to drop wrapper. While around, move kernel-doc closer to the function definition, as suggested in the doc-guide. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250923211613.193347-5-michal.wajdeczko@intel.com	2025-09-30 10:21:27 +02:00
Michal Wajdeczko	8980530abf	drm/xe/debugfs: Update xe_tuning_dump signature Our debugfs helper xe_gt_debugfs_show_with_rpm() expects print() functions to return int. New signature allows us to drop wrapper. While around, print additional separation lines using puts() to avoid output with leading \n which might confuse some printers. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250923211613.193347-4-michal.wajdeczko@intel.com	2025-09-30 10:21:26 +02:00
Michal Wajdeczko	d06e0c33f3	drm/xe/debugfs: Update xe_wa_dump signature Our debugfs helper xe_gt_debugfs_show_with_rpm() expects print() functions to return int. New signature allows us to drop wrapper. While around, print additional separation lines using puts() to avoid output with leading \n which might confuse some printers. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250923211613.193347-3-michal.wajdeczko@intel.com	2025-09-30 10:21:24 +02:00
Michal Wajdeczko	103094205d	drm/xe/debugfs: Update xe_gt_topology_dump signature Our debugfs helper xe_gt_debugfs_show_with_rpm() expects print() functions to return int. New signature allows us to drop wrapper. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250923211613.193347-2-michal.wajdeczko@intel.com	2025-09-30 10:21:23 +02:00
Michal Wajdeczko	486d7f1bd1	drm/xe/pf: Make GGTT/LMEM debugfs files per-tile Due to initial design of the Xe debugfs, the GGTT and LMEM files were defined on the primary GT, instead of being per-tile. While PF provisioning code is now still maintaining GGTT and LMEM also on the per primary-GT level, this will be refactored soon, but we can fix debugfs layout now, as part of the new SR-IOV tree. For backward compatibility we will provide some symlinks that can be removed once our tools will be fully converted. As we are making all those changes in the user facing interface, take this as opportunity to also start replacing the "LMEM" term, used by the SR-IOV code, with the "VRAM" term, used by Xe driver. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250928140029.198847-7-michal.wajdeczko@intel.com	2025-09-30 00:03:52 +02:00
Michal Wajdeczko	8cd71c40e9	drm/xe/debugfs: Promote xe_tile_debugfs_simple_show We will want to use this helper function in other files. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250928140029.198847-6-michal.wajdeczko@intel.com	2025-09-29 23:58:48 +02:00
Michal Wajdeczko	9a719bbf8d	drm/xe/pf: Move SR-IOV GT debugfs files to new tree Instead of expanding GT debugfs directories with large number of SR-IOV files, as those are replicated per each SR-IOV function, move them to our new debugfs tree, organized by the function. But to avoid breaking IGT tests that use current layout, provide symlinks which could be removed once transition period is over, or we can leave them for convenience. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250928140029.198847-5-michal.wajdeczko@intel.com	2025-09-29 23:58:47 +02:00
Michal Wajdeczko	5489e7d44a	drm/xe/pf: Populate SR-IOV debugfs tree with tiles Populate new per SR-IOV function debugfs directories with next level directories that represent tiles. There are no files yet, but we will continue updating that tree in upcoming patches. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250928140029.198847-4-michal.wajdeczko@intel.com	2025-09-29 23:58:44 +02:00
Michal Wajdeczko	4d4af0d6cb	drm/xe/pf: Create separate debugfs tree for SR-IOV files Currently we expose debugfs files related to SR-IOV functions together with other native files, but that approach will not scale well as we plan to add more attributes and also expose some of them on the per-tile basis. Start building separate tree for SR-IOV specific debugfs files where we can replicate similar files per every SR-IOV function: /sys/kernel/debug/dri/BDF/ ├── sriov │ ├── pf │ │ ├── tile0 │ │ │ ├── gt0 │ │ │ ├── gt1 │ │ │ : │ │ ├── tile1 │ │ : │ ├── vf1 │ │ ├── tile0 │ │ │ ├── gt0 │ │ │ ├── gt1 │ │ │ : │ │ : │ ├── vf2 │ ├── ... We will populate this new tree in upcoming patches. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250928140029.198847-3-michal.wajdeczko@intel.com	2025-09-29 23:58:43 +02:00
Michal Wajdeczko	1238b84ea3	drm/xe/pf: Promote PF debugfs function to its own file In upcoming patches, we will build on the PF separate debugfs tree for all SR-IOV related files and this new code will need dedicated file. To minimize large diffs later, move existing function now as-is, so any future modifications will be done directly in target file. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250928140029.198847-2-michal.wajdeczko@intel.com	2025-09-29 23:58:41 +02:00
Michal Wajdeczko	e35e288090	drm/xe/vf: Don't claim support for firmware late-bind if VF In general, the VFs can't load firmwares so attempt to initialize the firmware late-bind component leads to errors like: [] xe 0000:03:00.1: [drm] ERROR Late bind component not bound Fixes: `918bd789d6` ("drm/xe/xe_late_bind_fw: Introduce xe_late_bind_fw") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6190 Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250928174811.198933-3-michal.wajdeczko@intel.com	2025-09-29 21:42:29 +02:00
Michal Wajdeczko	b88bb1eefa	drm/xe/vf: Rename sriov_update_device_info This is a VF only function and its name should reflect that to avoid any confusion. Move the VF check to the caller side. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250928174811.198933-2-michal.wajdeczko@intel.com	2025-09-29 21:42:28 +02:00
Shuicheng Lin	662d98b8b3	drm/xe/hw_engine_group: Fix double write lock release in error path In xe_hw_engine_group_get_mode(), a write lock is acquired before calling switch_mode(), which in turn invokes xe_hw_engine_group_suspend_faulting_lr_jobs(). On failure inside xe_hw_engine_group_suspend_faulting_lr_jobs(), the write lock is released there, and then again in xe_hw_engine_group_get_mode(), leading to a double release. Fix this by keeping both acquire and release operation in xe_hw_engine_group_get_mode(). Fixes: `770bd1d341` ("drm/xe/hw_engine_group: Ensure safe transition between execution modes") Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250925023145.1203004-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-26 06:44:18 -07:00
Lucas De Marchi	a4916b4da4	drm/xe/guc: Refactor GuC load to use poll_timeout_us() Currently there are 2 wait loops for loading GuC: one in xe_mmio_wait32_not() and one guc_wait_ucode(). Now that there's a generic poll_timeout_us(), refactor the code to use that to be more readable. Main change in behavior is that there's no exponential wait anymore: that is now replaced by a 10msec retry. Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250922-xe-iopoll-v4-5-06438311a63f@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-24 21:23:19 -07:00
Lucas De Marchi	abde96d844	drm/xe/guc: Extract function to print load error Move the error parsing and print out of guc_wait_ucode() into a helper to clean up the wait function. Since now the `load_done != 1` condition has a return statement, also simplify the if/else chain. Reviewed-by: John Harrison <John.C.Harrison@Intel.com> # v2 Link: https://lore.kernel.org/r/20250922-xe-iopoll-v4-4-06438311a63f@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-24 21:23:19 -07:00
Lucas De Marchi	2a16f47dcc	drm/xe/guc: Drop helper to read freq As the forcewake is already held during GuC load, there's no need to use a helper function to call xe_guc_pc_get_cur_freq(). Just call xe_guc_pc_get_cur_freq_fw() directly. Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/20250922-xe-iopoll-v4-3-06438311a63f@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-24 21:23:19 -07:00
Lucas De Marchi	b0ac4ef074	drm/xe/guc_pc: Use poll_timeout_us() for waiting Convert wait_for_pc_state() and wait_for_act_freq_limit() to poll_timeout_us(). This brings 2 changes in behavior: Drop the exponential wait and fix a potential much longer sleep. usleep_range() will wait anywhere between `wait` and `wait << 1`, so it's not correct to assume `slept += wait`. This code is not really accurate. Pairing this with the exponential wait increase, it could be waiting much longer than intended. Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250922-xe-iopoll-v4-2-06438311a63f@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-24 21:23:19 -07:00
Lucas De Marchi	09ab20c41a	drm/xe/device: Use poll_timeout_us() to wait for lmem Now that there's a generic poll_timeout_us(), use it to wait for LMEM_INIT in GU_CNTL. Reviewed-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://lore.kernel.org/r/20250922-xe-iopoll-v4-1-06438311a63f@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-24 21:23:18 -07:00


1384730 Commits