linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 20:42:29 -04:00

Author	SHA1	Message	Date
Marco Crivellari	fa171b805f	drm/xe: replace use of system_unbound_wq with system_dfl_wq This patch continues the effort to refactor workqueue APIs, which has begun with the changes introducing new workqueues and a new alloc_workqueue flag: commit `128ea9f6cc` ("workqueue: Add system_percpu_wq and system_dfl_wq") commit `930c2ea566` ("workqueue: Add new WQ_PERCPU flag") The point of the refactoring is to eventually alter the default behavior of workqueues to become unbound by default so that their workload placement is optimized by the scheduler. Before that to happen, workqueue users must be converted to the better named new workqueues with no intended behaviour changes: system_wq -> system_percpu_wq system_unbound_wq -> system_dfl_wq This way the old obsolete workqueues (system_wq, system_unbound_wq) can be removed in the future. Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/ Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260202103756.62138-2-marco.crivellari@suse.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-02-02 19:18:09 -05:00
Michal Wajdeczko	65b9886062	drm/xe/guc: Allow second H2G retry on FLR During VF FLR the scratch registers could be cleared both by the GuC and by the PF driver. Allow to retry more times once we find out that the HXG header was cleared and wait at least 256ms before resending the same message again to the GuC. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260127193727.601-7-michal.wajdeczko@intel.com	2026-02-02 22:35:46 +01:00
Michal Wajdeczko	e116fd5c60	drm/xe/guc: Wait before retrying sending H2G We shall resend H2G message after receiving NO_RESPONSE_RETRY reply, but since GuC dropped that H2G due to some interim state, we should give it a little time to stabilize. Wait before sending the same H2G again, start with 1ms delay, then increase exponentially to 256ms. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260127193727.601-6-michal.wajdeczko@intel.com	2026-02-02 22:35:45 +01:00
Michal Wajdeczko	09b45fd9d3	drm/xe/guc: Drop redundant register read The xe_mmio_wait32() already returns the last value of the register for which we were waiting, there is no need read it again. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260127193727.601-5-michal.wajdeczko@intel.com	2026-02-02 22:35:44 +01:00
Michal Wajdeczko	943c4d0637	drm/xe/guc: Limit sleep while waiting for H2G credits Instead of endlessly increasing the sleep timeout while waiting for the H2G credits, use exponential increase only up to the given limit, like it was initially done in the GuC submission code. While here, fix the actual timeout to the 1s as it was documented. Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260127193727.601-4-michal.wajdeczko@intel.com	2026-02-02 22:35:43 +01:00
Michal Wajdeczko	eec43f3684	drm/xe: Move exponential sleep logic to helper We want to reuse the same increased sleep logic in other places. To avoid code duplication, move it to the helper. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260127193727.601-3-michal.wajdeczko@intel.com	2026-02-02 22:35:42 +01:00
Michal Wajdeczko	94a2ceb190	drm/xe: Promote relaxed_ms_sleep We want to have single place with sleep related helpers for better code reuse. Create xe_sleep.h and move relaxed_ms_sleep() there. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260127193727.601-2-michal.wajdeczko@intel.com	2026-02-02 22:35:41 +01:00
Michal Wajdeczko	316b05ae7e	drm/xe/pf: Simplify IS_SRIOV_PF macro Instead of two having variants of the IS_SRIOV_PF macro, move the CONFIG_PCI_IOV check to the xe_device_is_sriov_pf() function and let the compiler optimize that. This will help us drop poor man's type check of the macro parameter that fails on const xe pointer. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260128222714.3056-1-michal.wajdeczko@intel.com	2026-02-02 22:22:57 +01:00
Shuicheng Lin	9f9c117ac5	drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep() instead" Fixes: `15366239e2` ("drm/xe: Decouple TLB invalidations from GT") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com	2026-02-02 22:05:00 +01:00
Shuicheng Lin	0651dbb9d6	drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early() instead" v2: add () for the function. (Michal) Fixes: `db16f9d90c` ("drm/xe: Split TLB invalidation code in frontend and backend") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com	2026-02-02 22:04:58 +01:00
Shuicheng Lin	9fd8da7179	drm/xe: Fix kerneldoc for xe_migrate_exec_queue Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue() instead" Fixes: `916ee4704a` ("drm/xe/vf: Register CCS read/write contexts with Guc") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com	2026-02-02 22:04:55 +01:00
Shuicheng Lin	c2a6859138	drm/xe/query: Fix topology query pointer advance The topology query helper advanced the user pointer by the size of the pointer, not the size of the structure. This can misalign the output blob and corrupt the following mask. Fix the increment to use sizeof(*topo). There is no issue currently, as sizeof(*topo) happens to be equal to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes). Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-02 10:09:48 -08:00
Xin Wang	568d9d0d83	drm/xe: use entry_dump callbacks for xe2+ PAT dumps Move xe2+ PAT entry printing into the entry_dump op so platform specific logic stays localized, simplifying future maintenance. v2: - Do not null xe->pat.ops for VFs. - Skip PAT init and dump on VFs (-EOPNOTSUPP), avoiding NULL ops use. v3: - fixed typo v4: (Matt) - Switch xe2_dump() to use the new ops->entry_dump() vfunc. - Remove xe3p_xpc_dump() and reuse the common xe2_dump() for Xe3p XPC. - This also fixes Xe3p_HPM media PAT dumping by using the proper non-MCR access for the PAT register range (bspec 76445). Cc: Matt Roper <matthew.d.roper@intel.com> Suggested-by: Brian Nguyen <brian3.nguyen@intel.com> Signed-off-by: Xin Wang <x.wang@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260130175349.2249033-1-x.wang@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-02 10:03:43 -08:00
Chaitanya Kumar Borah	f89dbe14a0	drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header The GuC scheduler ABI header contains a file-level comment that is not intended to document a kernel-doc symbol. Using kernel-doc comment syntax (/** */) triggers kernel-doc warnings. With "-Werror", this causes the build to fail. Convert the comment to a regular block comment. HDRTEST drivers/gpu/drm/xe/abi/guc_scheduler_abi.h Warning: drivers/gpu/drm/xe/abi/guc_scheduler_abi.h:11 This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst Generic defines required for registration with and submissions to the GuC 1 warnings as errors make[6]: *** [drivers/gpu/drm/xe/Makefile:377: drivers/gpu/drm/xe/abi/guc_scheduler_abi.hdrtest] Error 3 make[5]: *** [scripts/Makefile.build:544: drivers/gpu/drm/xe] Error 2 make[4]: *** [scripts/Makefile.build:544: drivers/gpu/drm] Error 2 make[3]: *** [scripts/Makefile.build:544: drivers/gpu] Error 2 make[2]: *** [scripts/Makefile.build:544: drivers] Error 2 make[1]: *** [/home/kbuild2/kernel/Makefile:2088: .] Error 2 make: *** [Makefile:248: __sub-make] Error 2 v2: - Add Fixes tag (Daniele) Fixes: `b0c5cf4f59` ("drm/gt/guc: extract scheduler-related defines from guc_fwif.h") Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patch.msgid.link/20260130135210.2659200-1-chaitanya.kumar.borah@intel.com	2026-02-02 09:13:31 -08:00
Daniele Ceraolo Spurio	dd8ea2f2ab	drm/xe/guc: Fix CFI violation in debugfs access. xe_guc_print_info is void-returning, but the function pointer it is assigned to expects an int-returning function, leading to the following CFI error: [ 206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe] (target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a) Fix this by updating xe_guc_print_info to return an integer. Fixes: `e15826bb3c` ("drm/xe/guc: Refactor GuC debugfs initialization") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: George D Sworo <george.d.sworo@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com	2026-02-02 09:12:56 -08:00
Balasubramani Vivekanandan	de96c43a69	drm/xe: Apply WA_16028005424 to Media Apply WA_16028005424 to following IPs: Xe2_LPM, Xe2_HPM, Xe3_LPM, Xe3p_LPM While doing this move the same WA defined for Xe3_LPG under the comment for Xe3_LPG. It was wrongly placed under Xe3_LPM. Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260128062911.1456539-2-balasubramani.vivekanandan@intel.com	2026-02-02 15:03:35 +05:30
Michal Wajdeczko	b47239bc30	drm/xe/pf: Fix typo in function kernel-doc The function name is missing an underscore, which results in: Warning: ../drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c:1261 This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst xe_gt_sriov_pf_control_trigger restore_vf() - Start ... Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20251217150702.2669-1-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-30 16:08:16 -05:00
Niranjana Vishwanathapura	e694179a2c	drm/xe/multi_queue: Protect priority against concurrent access Use a spinlock to protect multi-queue priority being concurrently updated by multiple set_priority ioctls and to protect against concurrent read and write to this field. v2: Update documentation, remove WRITE/READ_LOCK() (Thomas) Use scoped_guard, reduced lock scope (Matt Brost) v3: Fix author (checkpatch) Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20260126174241.3470390-2-niranjana.vishwanathapura@intel.com	2026-01-29 10:03:53 -08:00
Shuicheng Lin	7755ed58a4	drm/xe/nvm: Defer xe->nvm assignment until init succeeds Allocate and initialize the NVM structure using a local pointer and assign it to xe->nvm only after all initialization steps succeed. This avoids exposing a partially initialized xe->nvm and removes the need to explicitly clear xe->nvm on error paths, simplifying error handling and making the lifetime rules clearer. Cc: Alexander Usyskin <alexander.usyskin@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Brian Nguyen <brian3.nguyen@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Brian Nguyen <brian3.nguyen@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260120183239.2966782-8-shuicheng.lin@intel.com	2026-01-28 18:40:08 -08:00
Shuicheng Lin	a3187c0c2b	drm/xe/nvm: Fix double-free on aux add failure After a successful auxiliary_device_init(), aux_dev->dev.release (xe_nvm_release_dev()) is responsible for the kfree(nvm). When there is failure with auxiliary_device_add(), driver will call auxiliary_device_uninit(), which call put_device(). So that the .release callback will be triggered to free the memory associated with the auxiliary_device. Move the kfree(nvm) into the auxiliary_device_init() failure path and remove the err goto path to fix below error. " [ 13.232905] ================================================================== [ 13.232911] BUG: KASAN: double-free in xe_nvm_init+0x751/0xf10 [xe] [ 13.233112] Free of addr ffff888120635000 by task systemd-udevd/273 [ 13.233120] CPU: 8 UID: 0 PID: 273 Comm: systemd-udevd Not tainted 6.19.0-rc2-lgci-xe-kernel+ #225 PREEMPT(voluntary) ... [ 13.233125] Call Trace: [ 13.233126] <TASK> [ 13.233127] dump_stack_lvl+0x7f/0xc0 [ 13.233132] print_report+0xce/0x610 [ 13.233136] ? kasan_complete_mode_report_info+0x5d/0x1e0 [ 13.233139] ? xe_nvm_init+0x751/0xf10 [xe] ... " v2: drop err goto path. (Alexander) Fixes: `d4c3ed963e` ("drm/xe: defer free of NVM auxiliary container to device release callback") Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Brian Nguyen <brian3.nguyen@intel.com> Cc: Alexander Usyskin <alexander.usyskin@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Suggested-by: Brian Nguyen <brian3.nguyen@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260120183239.2966782-7-shuicheng.lin@intel.com	2026-01-28 18:40:05 -08:00
Shuicheng Lin	11035eab1b	drm/xe/nvm: Manage nvm aux cleanup with devres Move nvm teardown to a devm-managed action registered from xe_nvm_init(). This ensures the auxiliary NVM device is deleted on probe failure and device detach without requiring explicit calls from remove paths. As part of this, drop xe_nvm_fini() from xe_device_remove() and from the survivability sysfs teardown, and remove the public xe_nvm_fini() API from the header. This is to fix below warn message when there is probe failure after xe_nvm_init(), then xe_device_probe() is called again: " [ 207.318152] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/xe.nvm.768' [ 207.318157] CPU: 5 UID: 0 PID: 10261 Comm: modprobe Tainted: G B W 6.19.0-rc2-lgci-xe-kernel+ #223 PREEMPT(voluntary) [ 207.318160] Tainted: [B]=BAD_PAGE, [W]=WARN [ 207.318161] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023 [ 207.318163] Call Trace: [ 207.318163] <TASK> [ 207.318165] dump_stack_lvl+0xa0/0xc0 [ 207.318170] dump_stack+0x10/0x20 [ 207.318171] sysfs_warn_dup+0xd5/0x110 [ 207.318175] sysfs_create_dir_ns+0x1f6/0x280 [ 207.318177] ? __pfx_sysfs_create_dir_ns+0x10/0x10 [ 207.318179] ? lock_acquire+0x1a4/0x2e0 [ 207.318182] ? __kasan_check_read+0x11/0x20 [ 207.318185] ? do_raw_spin_unlock+0x5c/0x240 [ 207.318187] kobject_add_internal+0x28d/0x8e0 [ 207.318189] kobject_add+0x11f/0x1f0 [ 207.318191] ? __pfx_kobject_add+0x10/0x10 [ 207.318193] ? lockdep_init_map_type+0x4b/0x230 [ 207.318195] ? get_device_parent.isra.0+0x43/0x4c0 [ 207.318197] ? kobject_get+0x55/0xf0 [ 207.318199] device_add+0x2d7/0x1500 [ 207.318201] ? __pfx_device_add+0x10/0x10 [ 207.318203] ? lockdep_init_map_type+0x4b/0x230 [ 207.318205] __auxiliary_device_add+0x99/0x140 [ 207.318208] xe_nvm_init+0x7a2/0xef0 [xe] [ 207.318333] ? xe_devcoredump_init+0x80/0x110 [xe] [ 207.318452] ? __devm_add_action+0x82/0xc0 [ 207.318454] ? 
fs_reclaim_release+0xc0/0x110 [ 207.318457] xe_device_probe+0x17dd/0x2c40 [xe] [ 207.318574] ? __pfx___drm_dev_dbg+0x10/0x10 [ 207.318576] ? add_dr+0x180/0x220 [ 207.318579] ? __pfx___drmm_mutex_release+0x10/0x10 [ 207.318582] ? __pfx_xe_device_probe+0x10/0x10 [xe] [ 207.318697] ? xe_pm_init_early+0x33a/0x410 [xe] [ 207.318850] xe_pci_probe+0x936/0x1250 [xe] [ 207.318999] ? lock_acquire+0x1a4/0x2e0 [ 207.319003] ? __pfx_xe_pci_probe+0x10/0x10 [xe] [ 207.319151] local_pci_probe+0xe6/0x1a0 [ 207.319154] pci_device_probe+0x523/0x840 [ 207.319157] ? __pfx_pci_device_probe+0x10/0x10 [ 207.319159] ? sysfs_do_create_link_sd.isra.0+0x8c/0x110 [ 207.319162] ? sysfs_create_link+0x48/0xc0 ... " Fixes: `c28bfb107d` ("drm/xe/nvm: add on-die non-volatile memory device") Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: Brian Nguyen <brian3.nguyen@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260120183239.2966782-6-shuicheng.lin@intel.com	2026-01-28 18:40:02 -08:00
Shuicheng Lin	63b3360436	drm/xe/configfs: Fix is_bound() pci_dev lifetime Move pci_dev_put() after pci_dbg() to avoid using pdev after dropping its reference. Fixes: `2674f1ef29` ("drm/xe/configfs: Block runtime attribute changes") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260121173750.3090907-2-shuicheng.lin@intel.com	2026-01-28 18:39:30 -08:00
Shuicheng Lin	6edeabacb7	drm/xe/gt: Use CLASS() for forcewake in xe_gt_enable_comp_1wcoh Adopt the scoped forcewake management using CLASS(xe_force_wake, ...) to simplify the code and ensure proper resource release. Cc: Xin Wang <x.wang@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Xin Wang <x.wang@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260123180425.3262944-2-shuicheng.lin@intel.com	2026-01-28 18:38:58 -08:00
Michal Wajdeczko	d043b95983	drm/xe/vf: Reset VF GuC state on fini Unlike native/PF driver, which was explicitly triggering full GuC reset during driver unwind, the VF driver was not notifying GuC that it is about to unwind, and this could lead GuC to access stale data, which in turn could be interpreted as VF's malicious activity. Add managed action to send to GuC VF_RESET message during GT unwind. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patch.msgid.link/20260122151924.3726-1-michal.wajdeczko@intel.com	2026-01-27 21:26:38 +01:00
Nathan Chancellor	91e0c2fec1	drm/xe: Move _THIS_IP_ usage from xe_vm_create() to dedicated function After commit `a3866ce7b1` ("drm/xe: Add vm to exec queues association"), building for an architecture other than x86 (which defines its own _THIS_IP_) with clang fails with: drivers/gpu/drm/xe/xe_vm.c:1586:3: error: cannot jump from this indirect goto statement to one of its possible targets 1586 \| drm_exec_retry_on_contention(&exec); \| ^ include/drm/drm_exec.h:123:4: note: expanded from macro 'drm_exec_retry_on_contention' 123 \| goto *__drm_exec_retry_ptr; \ \| ^ drivers/gpu/drm/xe/xe_vm.c:1542:3: note: possible target of indirect goto statement 1542 \| might_lock(&vm->exec_queues.lock); \| ^ include/linux/lockdep.h:553:33: note: expanded from macro 'might_lock' 553 \| lock_release(&(lock)->dep_map, _THIS_IP_); \ \| ^ include/linux/instruction_pointer.h:10:41: note: expanded from macro '_THIS_IP_' 10 \| #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; }) \| ^ drivers/gpu/drm/xe/xe_vm.c:1583:2: note: jump exits scope of variable with __attribute__((cleanup)) 1583 \| xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {.interruptible = true}, \| ^ drivers/gpu/drm/xe/xe_validation.h:189:2: note: expanded from macro 'xe_validation_guard' 189 \| scoped_guard(xe_validation, _ctx, _val, _exec, _flags, &_ret) \ \| ^ include/linux/cleanup.h:442:2: note: expanded from macro 'scoped_guard' 442 \| __scoped_guard(_name, __UNIQUE_ID(label), args) \| ^ include/linux/cleanup.h:433:20: note: expanded from macro '__scoped_guard' 433 \| for (CLASS(_name, scope)(args); \ \| ^ drivers/gpu/drm/xe/xe_vm.c:1542:3: note: jump enters a statement expression 1542 \| might_lock(&vm->exec_queues.lock); \| ^ include/linux/lockdep.h:553:33: note: expanded from macro 'might_lock' 553 \| lock_release(&(lock)->dep_map, _THIS_IP_); \ \| ^ include/linux/instruction_pointer.h:10:20: note: expanded from macro '_THIS_IP_' 10 \| #define _THIS_IP_ ({ __label__ __here; 
__here: (unsigned long)&&__here; }) \| ^ While this is a false positive error because __drm_exec_retry_ptr is only ever assigned the label in drm_exec_until_all_locked() (thus it can never jump over the cleanup variable), this error is not unreasonable in general because the only supported use case for taking the address of a label is computed gotos [1]. The kernel's use of the address of a label in _THIS_IP_ is considered problematic by both GCC [2][3] and clang [4] but they need to provide something equivalent before they can break this use case. Hide the usage of _THIS_IP_ by moving the CONFIG_PROVE_LOCKING if statement to its own function, avoiding the error. This is similar to commit `187e16f69d` ("drm/xe: Work around clang multiple goto-label error") but with the sources of _THIS_IP_. Fixes: `a3866ce7b1` ("drm/xe: Add vm to exec queues association") Link: https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html [1] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44298 [2] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071 [3] Link: https://github.com/llvm/llvm-project/issues/138272 [4] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260121-xe-vm-fix-clang-goto-error-v1-1-7e121d81512e@kernel.org	2026-01-27 12:19:05 -08:00
Shuicheng Lin	60bfb8baf8	drm/xe: Unregister drm device on probe error Call drm_dev_unregister() when xe_device_probe() fails after successful drm_dev_register(). This ensures the DRM device is promptly unregistered before returning an error, avoiding leaving it registered on the failure path. Otherwise, there is warn message if xe_device_probe() is called again: " [ 207.322365] [drm:drm_minor_register] [ 207.322381] debugfs: '128' already exists in 'dri' [ 207.322432] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/renderD128' [ 207.322435] CPU: 5 UID: 0 PID: 10261 Comm: modprobe Tainted: G B W 6.19.0-rc2-lgci-xe-kernel+ #223 PREEMPT(voluntary) [ 207.322439] Tainted: [B]=BAD_PAGE, [W]=WARN [ 207.322440] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023 [ 207.322441] Call Trace: [ 207.322442] <TASK> [ 207.322443] dump_stack_lvl+0xa0/0xc0 [ 207.322446] dump_stack+0x10/0x20 [ 207.322448] sysfs_warn_dup+0xd5/0x110 [ 207.322451] sysfs_create_dir_ns+0x1f6/0x280 [ 207.322453] ? __pfx_sysfs_create_dir_ns+0x10/0x10 [ 207.322455] ? lock_acquire+0x1a4/0x2e0 [ 207.322458] ? __kasan_check_read+0x11/0x20 [ 207.322461] kobject_add_internal+0x28d/0x8e0 [ 207.322464] kobject_add+0x11f/0x1f0 [ 207.322465] ? lock_acquire+0x1a4/0x2e0 [ 207.322467] ? __pfx_kobject_add+0x10/0x10 [ 207.322469] ? __kasan_check_write+0x14/0x20 [ 207.322471] ? kobject_put+0x62/0x4a0 [ 207.322473] ? get_device_parent.isra.0+0x1bb/0x4c0 [ 207.322475] ? kobject_put+0x62/0x4a0 [ 207.322477] device_add+0x2d7/0x1500 [ 207.322479] ? __pfx_device_add+0x10/0x10 [ 207.322481] ? drm_debugfs_add_file+0xfa/0x170 [ 207.322483] ? drm_debugfs_add_files+0x82/0xd0 [ 207.322485] ? drm_debugfs_add_files+0x82/0xd0 [ 207.322487] drm_minor_register+0x10a/0x2d0 [ 207.322489] drm_dev_register+0x143/0x860 [ 207.322491] ? xe_configfs_get_psmi_enabled+0x12/0x90 [xe] [ 207.322667] xe_device_probe+0x185b/0x2c40 [xe] [ 207.322812] ? 
__pfx___drm_dev_dbg+0x10/0x10 [ 207.322815] ? add_dr+0x180/0x220 [ 207.322818] ? __pfx___drmm_mutex_release+0x10/0x10 [ 207.322821] ? __pfx_xe_device_probe+0x10/0x10 [xe] [ 207.322966] ? xe_pm_init_early+0x33a/0x410 [xe] [ 207.323136] xe_pci_probe+0x936/0x1250 [xe] [ 207.323298] ? lock_acquire+0x1a4/0x2e0 [ 207.323302] ? __pfx_xe_pci_probe+0x10/0x10 [xe] [ 207.323464] local_pci_probe+0xe6/0x1a0 [ 207.323468] pci_device_probe+0x523/0x840 [ 207.323470] ? __pfx_pci_device_probe+0x10/0x10 [ 207.323473] ? sysfs_do_create_link_sd.isra.0+0x8c/0x110 [ 207.323476] ? sysfs_create_link+0x48/0xc0 [ 207.323479] really_probe+0x1fd/0x8a0 ... " Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patch.msgid.link/20260109211041.2446012-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-01-26 16:10:06 -08:00
Vinay Belgaumkar	40ee63f5df	drm/xe/ptl: Disable DCC on PTL On PTL, the recommendation is to disable DCC(Duty Cycle Control) as it may cause some regressions due to added latencies. Upcoming GuC releases will disable DCC on PTL as well, but we need to force it in KMD so that this behavior is propagated to older kernels. v2: Update commit message (Rodrigo) v3: Rebase v4: Fix typo: s/propagted/propagated Fixes: `5cdb71d3b0` ("drm/xe/ptl: Add GuC FW definition for PTL") Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patch.msgid.link/20260124005917.398522-1-vinay.belgaumkar@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-26 16:25:54 -05:00
Tvrtko Ursulin	7fe6cae2f7	drm/xe/xelp: Fix Wa_18022495364 It looks I mistyped CS_DEBUG_MODE2 as CS_DEBUG_MODE1 when adding the workaround. Fix it. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `ca33cd271e` ("drm/xe/xelp: Add Wa_18022495364") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.18+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20260116095040.49335-1-tvrtko.ursulin@igalia.com	2026-01-26 14:34:53 +01:00
Shuicheng Lin	4761791c1e	drm/xe: Skip address copy for sync-only execs For parallel exec queues, xe_exec_ioctl() copied the batch buffer address array from userspace without checking num_batch_buffer. If user creates a sync-only exec that doesn't use the address field, the exec will fail with -EFAULT. Add num_batch_buffer check to skip the copy, and the exec could be executed successfully. Here is the sync-only exec: struct drm_xe_exec exec = { .extensions = 0, .exec_queue_id = qid, .num_syncs = 1, .syncs = (uintptr_t)&sync, .address = 0, /* ignored for sync-only */ .num_batch_buffer = 0, /* sync-only */ }; Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260122214053.3189366-2-shuicheng.lin@intel.com	2026-01-23 16:34:36 -08:00
Nitin Gote	6ef02656c3	drm/xe: derive mem copy capability from graphics version Drop .has_mem_copy_instr from the platform descriptors and set it in xe_info_init() after handle_gmdid() populates graphics_verx100. Centralizing the GRAPHICS_VER(xe) >= 20 check keeps MEM_COPY enabled on Xe2+ and removes redundant per-platform plumbing. Bspec: 57561 Fixes: `1e12dbae9d` ("drm/xe/migrate: support MEM_COPY instruction") Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Suggested-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Link: https://patch.msgid.link/20260120054724.1982608-2-nitin.r.gote@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>	2026-01-23 11:55:06 +05:30
Sanjay Yadav	dc2fc00ba9	drm/xe: Use DRM_BUDDY_CONTIGUOUS_ALLOCATION for contiguous allocations The VRAM/stolen memory managers do not currently set DRM_BUDDY_CONTIGUOUS_ALLOCATION for contiguous allocations. Enabling this flag activates the buddy allocator's try_harder path, which helps handle fragmented memory scenarios. This enables the __alloc_contig_try_harder fallback in the buddy allocator, allowing contiguous allocation requests to succeed even when memory is fragmented by combining allocations from both(RHS and LHS) sides of a large free block. v2: (Matt B) - Remove redundant logic for rounding allocation size and trimming when TTM_PL_FLAG_CONTIGUOUS is set, since drm_buddy now handles this when DRM_BUDDY_CONTIGUOUS_ALLOCATION is enabled Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6713 Suggested-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260121111416.3104399-2-sanjay.kumar.yadav@intel.com	2026-01-22 09:52:27 +00:00
Thomas Hellström	9386f49316	drm/xe: Select CONFIG_DEVICE_PRIVATE when DRM_XE_GPUSVM is selected CONFIG_DEVICE_PRIVATE is a prerequisite for DRM_XE_GPUSVM. Explicitly select it so that DRM_XE_GPUSVM is not unintentionally left out from distro configs not explicitly enabling CONFIG_DEVICE_PRIVATE. v2: - Select also CONFIG_ZONE_DEVICE since it's needed by CONFIG_DEVICE_PRIVATE. v3: - Depend on CONFIG_ZONE_DEVICE rather than selecting it. Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <dri-devel@lists.freedesktop.org> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260121091048.41371-3-thomas.hellstrom@linux.intel.com	2026-01-22 09:56:28 +01:00
Thomas Hellström	1e372b2461	drm, drm/xe: Fix xe userptr in the absence of CONFIG_DEVICE_PRIVATE CONFIG_DEVICE_PRIVATE is not selected by default by some distros, for example Fedora, and that leads to a regression in the xe driver since userptr support gets compiled out. It turns out that DRM_GPUSVM, which is needed for xe userptr support compiles also without CONFIG_DEVICE_PRIVATE, but doesn't compile without CONFIG_ZONE_DEVICE. Exclude the drm_pagemap files from compilation with !CONFIG_ZONE_DEVICE, and remove the CONFIG_DEVICE_PRIVATE dependency from CONFIG_DRM_GPUSVM and the xe driver's selection of it, re-enabling xe userptr for those configs. v2: - Don't compile the drm_pagemap files unless CONFIG_ZONE_DEVICE is set. - Adjust the drm_pagemap.h header accordingly. Fixes: `9e97874148` ("drm/xe/userptr: replace xe_hmm with gpusvm") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.18+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/20260121091048.41371-2-thomas.hellstrom@linux.intel.com	2026-01-22 09:56:27 +01:00
Matthew Auld	9dd1048bca	drm/xe/migrate: fix job lock assert We are meant to be checking the user vm for the bind queue, but actually we are checking the migrate vm. For various reasons this is not currently firing but this will likely change in the future. Now that we have the user_vm attached to the bind queue, we can fix this by directly checking that here. Fixes: `dba89840a9` ("drm/xe: Add GT TLB invalidation jobs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Arvind Yadav <arvind.yadav@intel.com> Link: https://patch.msgid.link/20260120110609.77958-4-matthew.auld@intel.com	2026-01-21 10:07:21 +00:00
Matthew Auld	9dd08fdecc	drm/xe/uapi: disallow bind queue sharing Currently this is very broken if someone attempts to create a bind queue and share it across multiple VMs. For example currently we assume it is safe to acquire the user VM lock to protect some of the bind queue state, but if we allow sharing the bind queue with multiple VMs then this quickly breaks down. To fix this reject using a bind queue with any VM that is not the same VM that was originally passed when creating the bind queue. This is a uAPI change, however this was more of an oversight on kernel side that we didn't reject this, and expectation is that userspace shouldn't be using bind queues in this way, so in theory this change should go unnoticed. Based on a patch from Matt Brost. v2 (Matt B): - Hold the vm lock over queue create, to ensure it can't be closed as we attach the user_vm to the queue. - Make sure we actually check for NULL user_vm in destruction path. v3: - Fix error path handling. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: Carl Zhang <carl.zhang@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Arvind Yadav <arvind.yadav@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Link: https://patch.msgid.link/20260120110609.77958-3-matthew.auld@intel.com	2026-01-21 10:07:18 +00:00
Matthew Brost	6cdaa5346d	drm/xe: Add context-based invalidation to GuC TLB invalidation backend Introduce context-based invalidation support to the GuC TLB invalidation backend. This is implemented by iterating over each exec queue per GT within a VM, skipping inactive queues, and issuing a context-based (GuC ID) H2G TLB invalidation. All H2G messages, except the final one, are sent with an invalid seqno, which the G2H handler drops to ensure the TLB invalidation fence is only signaled once all H2G messages are completed. A watermark mechanism is also added to switch between context-based TLB invalidations and full device-wide invalidations, as the return on investment for context-based invalidation diminishes when many exec queues are mapped. v2: - Fix checkpatch warnings v3: - Rebase on PRL - Use ref counting to avoid racing with deregisters v4: - Extra braces (Stuart) - Use per GT list (CI) - Reorder put Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-12-matthew.brost@intel.com	2026-01-16 18:24:57 -08:00
Matthew Brost	628d59392c	drm/xe: Add exec queue active vfunc If an exec queue is inactive (e.g., not registered or scheduling is disabled), TLB invalidations are not issued for that queue. Add a virtual function to determine the active state, which TLB invalidation logic can hook into. v5: - Operate on primary in active function Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-11-matthew.brost@intel.com	2026-01-16 18:24:56 -08:00
Matthew Brost	6b42b635d6	drm/xe: Add xe_tlb_inval_idle helper Introduce the xe_tlb_inval_idle helper to detect whether any TLB invalidations are currently in flight. This is used in context-based TLB invalidations to determine whether dummy TLB invalidations need to be sent to maintain proper TLB invalidation fence ordering. v2: - Implement xe_tlb_inval_idle based on pending list Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-10-matthew.brost@intel.com	2026-01-16 18:24:54 -08:00
Matthew Brost	2d93d5d530	drm/xe: Add send_tlb_inval_ppgtt helper Extract the common code that issues a TLB invalidation H2G for PPGTTs into a helper function. This helper can be reused for both ASID-based and context-based TLB invalidations. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-9-matthew.brost@intel.com	2026-01-16 18:24:53 -08:00
Matthew Brost	edcc15f489	drm/xe: Rename send_tlb_inval_ppgtt to send_tlb_inval_asid_ppgtt Context-based TLB invalidations have their own set of GuC TLB invalidation operations. Rename the current PPGTT invalidation function, which operates on ASIDs, to a more descriptive name that reflects its purpose. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-8-matthew.brost@intel.com	2026-01-16 18:24:52 -08:00
Matthew Brost	8d7a9f801e	drm/xe: Taint TLB invalidation seqno lock with GFP_KERNEL Taint TLB invalidation seqno lock with GFP_KERNEL as TLB invalidations can be in the path of reclaim (e.g., MMU notifiers). Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-7-matthew.brost@intel.com	2026-01-16 18:24:51 -08:00
Matthew Brost	a3866ce7b1	drm/xe: Add vm to exec queues association Maintain a list of exec queues per vm which will be used by TLB invalidation code to do context-ID based TLB invalidations. v4: - More asserts (Stuart) - Per GT list (CI) - Skip adding / removal if context TLB invalidations not supported (Stuart) Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-6-matthew.brost@intel.com	2026-01-16 18:24:49 -08:00
Matthew Brost	43c3e6eacb	drm/xe: Add xe_device_asid_to_vm helper Introduce the xe_device_asid_to_vm helper, which can be used throughout the driver to resolve the VM from a given ASID. v4: - Move forward declare after includes (Stuart) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-5-matthew.brost@intel.com	2026-01-16 18:24:48 -08:00
Matthew Brost	dea333b244	drm/xe: Add has_ctx_tlb_inval to device info Add has_ctx_tlb_inval to device info indicating a device has context-based TLB invalidation. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-4-matthew.brost@intel.com	2026-01-16 18:24:46 -08:00
Matthew Brost	444d78578e	drm/xe: Make usm.asid_to_vm allocation use GFP_NOWAIT Ensure the asid_to_vm lookup is reclaim-safe so it can be performed during TLB invalidations, which is necessary for context-based TLB invalidation support. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-3-matthew.brost@intel.com	2026-01-16 18:24:45 -08:00
Matthew Brost	888c7f991f	drm/xe: Add normalize_invalidation_range Extract the code that determines the alignment of TLB invalidation into a helper function — normalize_invalidation_range. This will be useful when adding context-based invalidations to the GuC TLB invalidation backend. Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260116221731.868657-2-matthew.brost@intel.com	2026-01-16 18:24:44 -08:00
Niranjana Vishwanathapura	769d7774a1	drm/xe/multi_queue: Enable multi_queue on xe3p_xpc xe3p_xpc supports multi_queue, enable it. v2: Rename multi_queue_enable_mask to multi_queue_engine_class_mask (Matt Brost) Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260116220333.861850-3-matthew.brost@intel.com	2026-01-16 18:19:09 -08:00
Matthew Brost	bbd3678730	drm/xe: Ban entire multi-queue group on any job timeout In multi-queue mode, we only have control over the entire group, so we cannot ban individual queues or signal fences until the whole group is removed from hardware. Implement banning of the entire group if any job within it times out. v2: - Fix lock inversion (Niranjana) - Initialize new queues in group to stopped v3: - Blindly call xe_exec_queue_multi_queue_primary (Niranjana) - More comments around temporary list when stopping (Niranjana) - Restart group on false timeout (Niranjana) Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20260116220333.861850-2-matthew.brost@intel.com	2026-01-16 18:18:15 -08:00
Nakshtra Goyal	c51595b3d2	drm/xe/xe_query: Remove check for gt There's no need to check a userspace-provided GT ID (which may come from any tile) against the number of GTs that can be present on a single tile. The xe_device_get_gt() lookup already checks that the GT ID passed is valid for the current device. (Matt Roper) Signed-off-by: Nakshtra Goyal <nakshtra.goyal@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260113091928.67446-1-nakshtra.goyal@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-01-15 12:57:25 -08:00
Matthew Brost	e89aacd1ec	drm/xe: Reduce LRC timestamp stuck message on VFs to notice An LRC timestamp getting stuck is a somewhat normal occurrence. If a single VF submits a job that does not get timesliced, the LRC timestamp will not increment. Reduce the LRC timestamp stuck message on VFs to notice (same log level as job timeout) to avoid false CI bugs in tests where a VF submits a job that does not get timesliced. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7032 Fixes: `bb63e7257e` ("drm/xe: Avoid toggling schedule state to check LRC timestamp in TDR") Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260114184905.4189026-1-matthew.brost@intel.com	2026-01-15 09:36:08 -08:00

121061 Commits