linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-04 09:21:34 -04:00

Author	SHA1	Message	Date
Michal Wajdeczko	aec14e3370	drm/xe/vf: Don't try to capture engine data unavailable to VF Don't capture engine ring registers as thoe are not available for the VF driver. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-3-michal.wajdeczko@intel.com	2024-02-13 18:59:48 +01:00
Michal Wajdeczko	a43d506008	drm/xe/vf: Assume fixed GSM size if VF VFs can't use size mirrored from PCI config, but it should be safe to assume it covers full 4GiB GGTT. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-2-michal.wajdeczko@intel.com	2024-02-13 18:59:47 +01:00
Thomas Hellström	157261c58b	drm/xe/pt: Allow for stricter type- and range checking Distinguish between xe_pt and the xe_pt_dir subclass when allocating and freeing. Also use a fixed-size array for the xe_pt_dir page entries to make life easier for dynamic range- checkers. Finally rename the page-directory child pointer array to "children". While no functional change, this fixes ubsan splats similar to: [ 51.463021] ------------[ cut here ]------------ [ 51.463022] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/xe/xe_pt.c:47:9 [ 51.463023] index 0 is out of range for type 'xe_ptw []' [ 51.463024] CPU: 5 PID: 2778 Comm: xe_vm Tainted: G U 6.8.0-rc1+ #218 [ 51.463026] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [ 51.463027] Call Trace: [ 51.463028] <TASK> [ 51.463029] dump_stack_lvl+0x47/0x60 [ 51.463030] __ubsan_handle_out_of_bounds+0x95/0xd0 [ 51.463032] xe_pt_destroy+0xa5/0x150 [xe] [ 51.463088] __xe_pt_unbind_vma+0x36c/0x9b0 [xe] [ 51.463144] xe_vm_unbind+0xd8/0x580 [xe] [ 51.463204] ? drm_exec_prepare_obj+0x3f/0x60 [drm_exec] [ 51.463208] __xe_vma_op_execute+0x5da/0x910 [xe] [ 51.463268] ? __drm_gpuvm_sm_unmap+0x1cb/0x220 [drm_gpuvm] [ 51.463272] ? radix_tree_node_alloc.constprop.0+0x89/0xc0 [ 51.463275] ? drm_gpuva_it_remove+0x1f3/0x2a0 [drm_gpuvm] [ 51.463279] ? drm_gpuva_remove+0x2f/0xc0 [drm_gpuvm] [ 51.463283] xe_vm_bind_ioctl+0x1a55/0x20b0 [xe] [ 51.463344] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 51.463414] drm_ioctl_kernel+0xb6/0x120 [ 51.463416] drm_ioctl+0x287/0x4e0 [ 51.463418] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 51.463481] __x64_sys_ioctl+0x94/0xd0 [ 51.463484] do_syscall_64+0x86/0x170 [ 51.463486] ? syscall_exit_to_user_mode+0x7d/0x200 [ 51.463488] ? do_syscall_64+0x96/0x170 [ 51.463490] ? do_syscall_64+0x96/0x170 [ 51.463492] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 51.463494] RIP: 0033:0x7f246bfe817d [ 51.463498] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 [ 51.463501] RSP: 002b:00007ffc1bd19ad0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 51.463502] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f246bfe817d [ 51.463504] RDX: 00007ffc1bd19b60 RSI: 0000000040886445 RDI: 0000000000000003 [ 51.463505] RBP: 00007ffc1bd19b20 R08: 0000000000000000 R09: 0000000000000000 [ 51.463506] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc1bd19b60 [ 51.463508] R13: 0000000040886445 R14: 0000000000000003 R15: 0000000000010000 [ 51.463510] </TASK> [ 51.463517] ---[ end trace ]--- v2 - Fix kerneldoc warning (Matthew Brost) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240209112655.4872-1-thomas.hellstrom@linux.intel.com	2024-02-12 22:57:35 +01:00
Jani Nikula	f8237c8c6a	drm/xe: use drm based debugging instead of dev Prefer drm_dbg() over dev_dbg(). Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240212145757.645094-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-02-12 20:28:11 +02:00
Matthew Auld	63fb531fbf	drm/xe/display: fix i915_gem_object_is_shmem() wrapper shmem ensures the memory is cleared on allocation, however here we are using TTM, which doesn't natively support shmem (other than for swap), but instead just allocates normal system memory. And we only zero such memory for userspace allocations. In the case of intel_fbdev we are missing the memset_io() since display path incorrectly thinks object is shmem based. Fixes: `44e694958b` ("drm/xe/display: Implement display support") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240205153110.38340-2-matthew.auld@intel.com	2024-02-09 10:28:48 +00:00
Dani Liberman	82bd83a0cf	drm/xe/irq: allocate all possible msix interrupts If platform supports MSIX, driver needs to allocate all possible interrupts. v2: - drop msix_cap and use the api return code instead. - fix commit message. v3: - pass specific type in irq flags. Cc: Ohad Sharabi <osharabi@habana.ai> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124075058.2302235-1-dliberman@habana.ai	2024-02-08 13:26:44 -08:00
Thomas Hellström	eb538b5574	drm/xe/vm: Avoid reserving zero fences The function xe_vm_prepare_vma was blindly accepting zero as the number of fences and forwarded that to drm_exec_prepare_obj. However, that leads to an out-of-bounds shift in the dma_resv_reserve_fences() and while one could argue that the dma_resv code should be robust against that, avoid attempting to reserve zero fences. Relevant stack trace: [773.183188] ------------[ cut here ]------------ [773.183199] UBSAN: shift-out-of-bounds in ../include/linux/log2.h:57:13 [773.183241] shift exponent 64 is too large for 64-bit type 'long unsigned int' [773.183254] CPU: 2 PID: 1816 Comm: xe_evict Tainted: G U 6.8.0-rc3-xe #1 [773.183256] Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 2014 10/14/2022 [773.183257] Call Trace: [773.183258] <TASK> [773.183260] dump_stack_lvl+0xaf/0xd0 [773.183266] dump_stack+0x10/0x20 [773.183283] ubsan_epilogue+0x9/0x40 [773.183286] __ubsan_handle_shift_out_of_bounds+0x10f/0x170 [773.183293] dma_resv_reserve_fences.cold+0x2b/0x48 [773.183295] ? ww_mutex_lock+0x3c/0x110 [773.183301] drm_exec_prepare_obj+0x45/0x60 [drm_exec] [773.183313] xe_vm_prepare_vma+0x33/0x70 [xe] [773.183375] xe_vma_destroy_unlocked+0x55/0xa0 [xe] [773.183427] xe_vm_close_and_put+0x526/0x940 [xe] Fixes: `2714d50936` ("drm/xe: Convert pagefaulting code to use drm_exec") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240208132115.3132-1-thomas.hellstrom@linux.intel.com	2024-02-08 16:32:08 +01:00
Lucas De Marchi	6badfc463d	drm/xe: Avoid cryptic message when there's no GuC definition If there's no GuC firmware entry in the table and the user didn't pass an override path, the error message is very cryptic: xe will simply try to continue and then fail when submitting the default context: xe 0000:00:02.0: [drm:xe_pci_probe [xe]] XE_LUNARLAKE 64b0:0001 dgfx:0 gfx:Xe2_LPG (20.04) media:Xe2_LPM (20.00) display:no dma_m_s:46 tc:1 gscfi:0 ... xe: probe of 0000:00:02.0 failed with error -22 Add an explicit error message and bail out: xe 0000:00:02.0: [drm:xe_pci_probe [xe]] XE_LUNARLAKE 64b0:0001 dgfx:0 gfx:Xe2_LPG (20.04) media:Xe2_LPM (20.00) display:no dma_m_s:46 tc:1 gscfi:0 xe 0000:00:02.0: [drm] ERROR No GuC firmware defined for platform xe 0000:00:02.0: [drm] ERROR GuC init failed with -2 ... xe: probe of 0000:00:02.0 failed with error -2 Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201224724.551130-3-lucas.demarchi@intel.com	2024-02-07 15:08:21 -08:00
Lucas De Marchi	4588341896	drm/xe: Always allow to override firmware The current logic for firmware selection is reminiscent from i915 where there are 2 backends and several platforms support only 1: execlist or GuC. The xe driver has only the GuC backend and it simply doesn't work without it. Allow developers to override the firmware path even if there isn't a firmware entry in the table yet: this allows developers to more easily test the very first firmware before adding it there. The justification above is only true for GuC, however those override paths should really be viewed as developer aid param. Simply make the same logic for all firmwares and allow the override path to be used for all of them. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201224724.551130-2-lucas.demarchi@intel.com	2024-02-07 15:07:31 -08:00
Matthew Brost	d9890c028d	drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR TEST_VM_ASYNC_OPS_ERROR is broken and unused. Remove for now and will pull back in a later time when it is used, fixed, and properly hidden behind a Kconfig option. Also fixup the supported flags value. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240206045010.2981051-1-matthew.brost@intel.com	2024-02-06 08:51:49 -08:00
Riana Tauro	95ec8c1d6c	drm/xe/pm: add debug logs for D3cold add additional debug logs for PME# capability and presence of ACPI _PR3 resources. This is to identify the reason why the card is not capable of D3cold. No functional changes Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240206055917.2629027-1-riana.tauro@intel.com	2024-02-06 08:49:45 -05:00
Karthik Poosa	404669db60	drm/xe/hwmon: Refactor xe hwmon Check latest platform first in xe_hwmon_get_reg. Move PVC HWMON registers to regs/xe_pcode.h. Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201180600.434822-1-karthik.poosa@intel.com	2024-02-06 08:42:03 -05:00
Matthew Auld	8087199cd5	drm/xe/vm: don't ignore error when in_kthread If GUP fails and we are in_kthread, we can have pinned = 0 and ret = 0. If that happens we call sg_alloc_append_table_from_pages() with n_pages = 0, which is not well behaved and can trigger: kernel BUG at include/linux/scatterlist.h:115! depending on if the pages array happens to be zeroed or not. Even if we don't hit that it crashes later when trying to dma_map the returned table. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202171435.427630-2-matthew.auld@intel.com	2024-02-06 10:36:03 +00:00
Matthew Brost	5ad6af5c91	drm/xe: Assume large page size if VMA not yet bound The calculation to determine max page size of a VMA during a REMAP operations assumes the VMA has been bound. This assumption is not true if the VMA is from an eariler operation in an array of binds. If a VMA has not been bound use the maximum page size which will ensure the previous / next REMAP operations are not incorrectly skipped. Fixes: `8f33b4f054` ("drm/xe: Avoid doing rebinds") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240205231714.2956225-1-matthew.brost@intel.com	2024-02-05 17:42:19 -08:00
Nirmoy Das	17ffcdb041	drm/xe/query: Use kzalloc for drm_xe_query_engines Use kzalloc like other routines for better consistency. v2: Improve the subject(Matt) Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240131051838.24705-1-nirmoy.das@intel.com	2024-02-02 22:44:55 -08:00
John Harrison	db0adab049	drm/xe/guc: Add support for LNL firmware First release of GuC firmware for LNL is now available, so start using it. v2: Actually use xe directory. Doh! (review feedback from Lucas) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202200017.2133438-6-John.C.Harrison@Intel.com	2024-02-02 22:12:27 -08:00
John Harrison	32ca46bf29	drm/xe/guc: Update to GuC firmware 70.19.2 API compatibility version: 1.8.2 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202200017.2133438-5-John.C.Harrison@Intel.com	2024-02-02 22:12:01 -08:00
John Harrison	6650ad3e09	drm/xe/uc: Include patch version in expectations Patch level releases can be just as important as major level releases if they fix a critical bug. So include the patch version in the expectation check so the user is properly informed if they need to update. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202200017.2133438-4-John.C.Harrison@Intel.com	2024-02-02 22:11:25 -08:00
Xiaoming Wang	86c99abb5f	drm/xe/display: Fix memleak in display initialization intel_power_domains_init is called twice in xe_device_probe: 1) intel_power_domains_init() xe_display_init_nommio() xe_device_probe() 2) intel_power_domains_init() intel_display_driver_probe_noirq() xe_display_init_noirq() xe_device_probe() It needs remove one to avoid power_domains->power_wells double malloc. unreferenced object 0xffff88811150ee00 (size 512): comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s) hex dump (first 32 bytes): 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................ ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0 [<ffffffff812c98b2>] __kmalloc+0x52/0x150 [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe] [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe] [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe] [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe] [<ffffffff817f2187>] local_pci_probe+0x47/0xa0 [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0 [<ffffffff8192f2a2>] really_probe+0x1a2/0x410 [<ffffffff8192f598>] __driver_probe_device+0x78/0x160 [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90 [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0 [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0 [<ffffffff8192e159>] bus_add_driver+0x119/0x220 [<ffffffff81930d00>] driver_register+0x60/0x120 [<ffffffffa05e50a0>] 0xffffffffa05e50a0 The call to intel_power_domains_cleanup() needs to stay where it is for now. The main issue is that while the init is called by the display side, shared by i915 and xe, the cleanup is called by a non-shared code path. Fixing that will be done as a separate commit. Fixes: `44e694958b` ("drm/xe/display: Implement display support") Signed-off-by: Xiaoming Wang <xiaoming.wang@intel.com> [ reword commit message and explain why the fini needs to stay where it is ] Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202215658.561298-1-lucas.demarchi@intel.com	2024-02-02 22:05:07 -08:00
Matthew Brost	72f86ed3c8	drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool For integrated devices we need to map both mem.kernel_bb_pool and usm.bb_pool to be able to run batches from both pools. Fixes: `a682b6a42d` ("drm/xe: Support device page faults on integrated platforms") Tested-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202033440.2351862-1-matthew.brost@intel.com	2024-02-02 17:37:58 -08:00
Arnd Bergmann	774ef5dfc9	drm/xe: circumvent bogus stringop-overflow warning gcc-13 warns about an array overflow that it sees but that is prevented by the "asid % NUM_PF_QUEUE" calculation: drivers/gpu/drm/xe/xe_gt_pagefault.c: In function 'xe_guc_pagefault_handler': include/linux/fortify-string.h:57:33: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=] include/linux/fortify-string.h:689:26: note: in expansion of macro '__fortify_memcpy_chk' 689 \| #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ \| ^~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/xe/xe_gt_pagefault.c:341:17: note: in expansion of macro 'memcpy' 341 \| memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32)); \| ^~~~~~ drivers/gpu/drm/xe/xe_gt_types.h:102:25: note: at offset [1144, 265324] into destination object 'tile' of size 8 I found that rewriting the assignment using pointer addition rather than the equivalent array index calculation prevents the warning, so use that instead. I sent a bug report against gcc for the false positive warning. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113214 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240103114819.2913937-1-arnd@kernel.org	2024-02-02 14:01:12 -08:00
Matthew Brost	447f74d223	drm/xe: Pick correct userptr VMA to repin on REMAP op failure A REMAP op is composed of 3 VMA's - unmap, prev map, and next map. When op_execute fails with -EAGAIN we need to update the local VMA pointer to the current op state and then repin the VMA if it is a userptr. Fixes a failure seen in xe_vm.munmap-style-unbind-userptr-one-partial. Fixes: `b06d47be7c` ("drm/xe: Port Xe to GPUVA") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201004849.2219558-3-matthew.brost@intel.com	2024-02-02 06:55:09 -08:00
Matthew Brost	a856b67a84	drm/xe: Take a reference in xe_exec_queue_last_fence_get() Take a reference in xe_exec_queue_last_fence_get(). Also fix a reference counting underflow bug VM bind and unbind. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201004849.2219558-2-matthew.brost@intel.com	2024-02-02 06:45:57 -08:00
Matthew Brost	5fcbf83e39	drm/xe: Drop rebind argument from xe_pt_prepare_bind This is unused, drop it. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201184844.2317004-1-matthew.brost@intel.com	2024-02-01 18:43:11 -08:00
Matthew Brost	3acc1ff1a7	drm/xe: Fix loop in vm_bind_ioctl_ops_unwind The logic for the unwind loop is incorrect resulting in an infinite loop. Fix to unwind to go from the last operations list to he first. Fixes: `617eebb9c4` ("drm/xe: Fix array of binds") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201175532.2303168-1-matthew.brost@intel.com	2024-02-01 18:42:39 -08:00
Suraj Kandpal	d6beadc8d7	drm/xe/gsc: Add status check during gsc header readout Before checking if data is present in the message reply check the status in header and see if it indicates any error. --v2 - Use drm_err() instead of drm_dbg_kms() [Daniele] --v3 - Use &xe->drm in drm_err to make it more cleaner [Daniele] Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124045248.687023-1-suraj.kandpal@intel.com	2024-02-01 13:25:47 -08:00
Thomas Hellström	5bd24e7882	drm/xe/vm: Subclass userptr vmas The construct allocating only parts of the vma structure when the userptr part is not needed is very fragile. A developer could add additional fields below the userptr part, and the code could easily attempt to access the userptr part even if its not persent. So introduce xe_userptr_vma which subclasses struct xe_vma the proper way, and accordingly modify a couple of interfaces. This should also help if adding userptr helpers to drm_gpuvm. v2: - Fix documentation of to_userptr_vma() (Matthew Brost) - Fix allocation and freeing of vmas to clearer distinguish between the types. Closes: https://lore.kernel.org/intel-xe/0c4cc1a7-f409-4597-b110-81f9e45d1ffe@embeddedor.com/T/#u Fixes: `a4cc60a55f` ("drm/xe: Only alloc userptr part of xe_vma for userptrs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240131091628.12318-1-thomas.hellstrom@linux.intel.com	2024-02-01 09:49:26 +01:00
Matthew Brost	152ca51d8d	drm/xe: Use LRC prefix rather than CTX prefix in lrc desc defines The sparc build fails [1] due to CTX_VALID being redefined. Fix this by using a better naming convention of LRC_VALID as this define is used in setting bits in the lrc descriptor. To be uniform, change other define with LRC prefix too. [1] https://lore.kernel.org/all/20240123111235.3097079-1-geert@linux-m68k.org/ v2: - s/LEGACY_64B_CONTEXT/LRC_LEGACY_64B_CONTEXT (Lucas) Fixes: `0bc519d20f` ("drm/xe: Remove GEN[0-9]*_ prefixes") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240123212638.1605626-1-matthew.brost@intel.com	2024-01-31 20:00:47 -08:00
Matthew Brost	d83d8ae275	drm/xe: Make all GuC ABI shift values unsigned All GuC ABI definitions are unsigned and not defining as unsigned is causing build errors [1]. [1] https://lore.kernel.org/all/20240123111235.3097079-1-geert@linux-m68k.org/ Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240131025424.2087936-1-matthew.brost@intel.com	2024-01-31 09:40:10 -08:00
Matt Roper	996da37ffa	drm/xe: Convert job timeouts from assert to warning xe_assert() is intended to be used only for "impossible" situations that should never be hit (and if they are hit it means there's a driver bug somewhere); assertions are only compiled into debug builds. Although we expect jobs submitted by the kernel to be well-behaved and run without error, timeouts are a legitimate possibility for reasons beyond our control (bad firmware, flaky hardware, etc.). We should use a real WARN if we encounter these, even for non-debug builds, to ensure the issue is being properly highlighted in bug reports and such. Also give the WARNs more human-readable messages and move them below the general notice-level message that gets printed for any kind of timeout to make the errors a bit more understandable. v2: - Convert the VM / exec_queue_killed assertion as well. (MattB) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130200308.1429134-2-matthew.d.roper@intel.com	2024-01-31 08:09:30 -08:00
Thomas Hellström	78366eed68	drm/xe: Don't use __user error pointers The error pointer macros are not aware of __user pointers and as a consequence sparse warns. Have the copy_mask() function return an integer instead of a __user pointer. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240117134048.165425-5-thomas.hellstrom@linux.intel.com	2024-01-31 14:34:35 +01:00
Thomas Hellström	97fd7a7e4e	drm/xe: Annotate mcr_[un]lock() These functions acquire and release the gt::mcr_lock. Annotate accordingly. Fix the corresponding sparse warning. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Fixes: `fb1d55efdf` ("drm/xe: Cleanup OPEN_BRACE style issues") Cc: Francois Dugast <francois.dugast@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240117134048.165425-4-thomas.hellstrom@linux.intel.com	2024-01-31 14:34:35 +01:00
Jani Nikula	1e5a4dfe38	drm/xe: drop display/ subdir from include directories There are very few places that need to include anything from under display/. Require the display/ prefix in #include directives, and drop the subdirectory from the header search path. Sort the include lists while at it. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122101428.2683468-2-jani.nikula@intel.com	2024-01-31 14:59:07 +02:00
Jani Nikula	f01ece502a	drm/xe: move xe_display.[ch] under display/ All the other display related files are under display/ subdirectory, also move xe_display.[ch] there. Sort the build list while at it. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122101428.2683468-1-jani.nikula@intel.com	2024-01-31 14:58:40 +02:00
Matthew Brost	d1df9bfbf6	drm/xe: Only allow 1 ufence per exec / bind IOCTL The way exec ufences are coded only 1 ufence per IOCTL will be signaled. It is possible to fix this but for current use cases 1 ufence per IOCTL is sufficient. Enforce a limit of 1 ufence per IOCTL (both exec and bind to be uniform). v2: - Add fixes tag (Thomas) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Mika Kahola <mika.kahola@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124234413.1640825-1-matthew.brost@intel.com	2024-01-30 15:51:04 -08:00
José Roberto de Souza	be7d51c5b4	drm/xe: Add batch buffer addresses to devcoredump Those addresses are necessary to Mesa tools knows where in VM are the batch buffers to parse and print instructions that are human readable. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130135648.30211-2-jose.souza@intel.com	2024-01-30 11:53:47 -08:00
José Roberto de Souza	5746eaaa80	drm/xe: Add functions to convert regular address to canonical address and back Some instructions requires canonical address like MI_BATCH_BUFFER_START(UMDs must call xe_exec with a canonical address for Xe2+). So here adding functions to convert regular address to canonical address and back, the first user of this functions will be added in the next patch. v3: - inline removed - rename highest_address_bit_get() to ppgtt_msb_get() v4: - use xe->info.va_bits instead of xe->info.dma_mask_size BSpec: 47626 Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Maarten Lankhorst <dev@lankhorst.se> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130135648.30211-1-jose.souza@intel.com	2024-01-30 11:53:47 -08:00
José Roberto de Souza	8945a46a7c	drm/xe: Use function to emit PIPE_CONTROL This reduces code duplication in xe_ring_ops. v2: - fix flags of emit_pipe_imm_ggtt() - reduce to only one function v3: - fix emit_pipe_imm_ggtt() stall_only check Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130132249.8615-1-jose.souza@intel.com	2024-01-30 11:51:53 -08:00
Karthik Poosa	cd43106c9b	drm/xe/guc: Reduce a print from warn to debug Reduce debug print from warn to debug to avoid unnecessary warning message in dmesg: the firmware loading logic already has the right printk priority level when checking the firmware version. Fixes: `c5a06c9169` ("drm/xe/guc: Enable WA 14018913170") Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240125165652.3764711-1-karthik.poosa@intel.com [ slightly reword debug and commit messages ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-01-30 11:37:38 -08:00
Lucas De Marchi	aeacfd2dbe	drm/xe/xe2: Enable has_usm When xe2 support started to be added, USM was still not functional. This has changed, and now USM can be enabled for xe2. Remove FIXME leftover to allow VM to be created with DRM_XE_VM_CREATE_FLAG_FAULT_MODE. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240129214510.123829-1-lucas.demarchi@intel.com	2024-01-30 06:54:26 -08:00
Badal Nilawar	20485e3a81	drm/hwmon: Fix abi doc warnings This fixes warnings in xe, i915 hwmon docs: Warning: /sys/devices/.../hwmon/hwmon<i>/curr1_crit is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:35 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:52 Warning: /sys/devices/.../hwmon/hwmon<i>/energy1_input is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:54 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:65 Warning: /sys/devices/.../hwmon/hwmon<i>/in0_input is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:46 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:0 Warning: /sys/devices/.../hwmon/hwmon<i>/power1_crit is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:22 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:39 Warning: /sys/devices/.../hwmon/hwmon<i>/power1_max is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:0 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:8 Warning: /sys/devices/.../hwmon/hwmon<i>/power1_max_interval is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:62 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:30 Warning: /sys/devices/.../hwmon/hwmon<i>/power1_rated_max is defined 2 times: Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon:14 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon:22 Use a path containing the driver name to differentiate the documentation of each entry. Fixes: `fb1b70607f` ("drm/xe/hwmon: Expose power attributes") Fixes: `92d44a422d` ("drm/xe/hwmon: Expose card reactive critical power") Fixes: `fbcdc9d3bf` ("drm/xe/hwmon: Expose input voltage attribute") Fixes: `71d0a32524` ("drm/xe/hwmon: Expose hwmon energy attribute") Fixes: `4446fcf220` ("drm/xe/hwmon: Expose power1_max_interval") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/all/20240125113345.291118ff@canb.auug.org.au/ Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240127165040.2348009-1-badal.nilawar@intel.com	2024-01-29 20:41:10 -08:00
Matt Roper	9f5971bdf7	drm/xe: Grab mem_access when disabling C6 on skip_guc_pc platforms If skip_guc_pc is set for a platform, C6 is disabled directly without acquiring a mem_access reference, triggering an assertion inside xe_gt_idle_disable_c6. Fixes: `975e4a3795` ("drm/xe: Manually setup C6 when skip_guc_pc is set") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240126220613.865939-2-matthew.d.roper@intel.com	2024-01-29 08:55:05 -08:00
Fei Yang	348769d1cb	drm/xe: correct the assertion for number of PTEs While one MI_STORE_DATA_IMM can take no more than 0x1fe qwords, the size of the pgtable can be 512 entries. Fixes: `43d48379c9` ("drm/xe: correct the calculation of remaining size") Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240125065245.1204731-2-fei.yang@intel.com	2024-01-26 18:29:05 -08:00
Matthew Brost	d688b86a29	drm/xe/guc: Flush G2H handler when turning off CTs Make sure G2H handler is not running when changing the CT state to drop messages or disabled. This will help prevent races in the code ensuring that G2H are not being processed after changing the state. v2: - s/flush_g2h_handler/stop_g2h_handler (Michal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo remove the extra line while pushing] Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-4-matthew.brost@intel.com	2024-01-26 15:14:01 -05:00
Matthew Brost	83a7173bac	drm/xe: Move TLB invalidation reset before HW reset This is a software reset which can be done immediately after stopping the UC. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-3-matthew.brost@intel.com	2024-01-26 15:12:29 -05:00
Matthew Brost	dc75d03716	drm/xe/guc: Add more GuC CT states The Guc CT has more than enabled / disables states rather it has 4. The 4 states are not initialized, disabled, stopped, and enabled. Change the code to reflect this. These states will enable proper return codes from functions and therefore enable proper error messages. v2: - s/XE_GUC_CT_STATE_DROP_MESSAGES/XE_GUC_CT_STATE_STOPPED (Michal) - Add assert for CT being initialized (Michal) - Fix kernel for CT state enum (Michal) v3: - Kernel doc (Michal) - s/reiecved/received (Michal) - assert CT state not initialized in xe_guc_ct_init (Michal) - add argument xe_guc_ct_set_state to clear g2h (Michal) v4: - Drop clear_outstanding_g2h argument (Michal) v5: - Move xa_destroy outside of fast lock (CI) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-2-matthew.brost@intel.com	2024-01-26 15:12:28 -05:00
José Roberto de Souza	c6878e4743	drm/xe: Fix crash in trace_dma_fence_init() trace_dma_fence_init() uses dma_fence_ops functions like get_driver_name() and get_timeline_name() to generate trace information but the Xe KMD implementation of those functions makes use of xe_hw_fence_ctx that was being set after dma_fence_init(). So here just inverting the order to fix the crash. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124171830.95774-1-jose.souza@intel.com	2024-01-25 05:31:54 -08:00
Jani Nikula	439987f6f4	drm/xe: don't build debugfs files when CONFIG_DEBUG_FS=n If we unconditionally build the debugfs files, we'll get both the static inline stubs from the headers and the real functions for CONFIG_DEBUG_FS=n. Avoid building the debugfs files with that config. Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/r/152521f9-119f-4c61-b467-3e91f4aecb1a@infradead.org Signed-off-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124090515.3363901-1-jani.nikula@intel.com	2024-01-24 14:34:07 -08:00
José Roberto de Souza	28a98c39fa	drm/xe: Remove additional spaces in devcoredump HW Engines section I guess the indention was to keep it visually aligned but that would require a lot of spaces and was not followed by other registers so lets just drop it. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-7-jose.souza@intel.com	2024-01-24 11:08:26 -08:00
José Roberto de Souza	89e394f0db	drm/xe: Print registers spread in 2 u32 as u64 This makes easier to use those registers when copying its values to calculator also makes easier for tools to parse it. To avoids padding holes in xe_hw_engine_snapshot the u64 variables were moved to the top of xe_hw_engine_snapshot.reg. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-6-jose.souza@intel.com	2024-01-24 11:08:25 -08:00

1 2 3 4 5 ...

1248870 Commits