linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-02-16 18:55:32 -05:00

Author	SHA1	Message	Date
Matthew Brost	efffd56e4b	drm/xe: Disable timestamp WA on VFs The timestamp WA does not work on a VF because it requires reading MMIO registers, which are inaccessible on a VF. This timestamp WA confuses LRC sampling on a VF during TDR, as the LRC timestamp would always read as 1 for any active context. Disable the timestamp WA on VFs to avoid this confusion. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: `617d824c53` ("drm/xe: Add WA BB to capture active context utilization") Link: https://patch.msgid.link/20260110012739.2888434-7-matthew.brost@intel.com	2026-01-10 13:39:52 -08:00
Matthew Brost	ddb5cf9b90	drm/xe: Remove special casing for LR queues in submission Now that LR jobs are tracked by the DRM scheduler, there's no longer a need to special-case LR queues. This change removes all LR queue-specific handling, including dedicated TDR logic, reference counting schemes, and other related mechanisms. v4: - Remove xe_exec_queue_lr_cleanup tracepoint (Niranjana) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20260110012739.2888434-6-matthew.brost@intel.com	2026-01-10 13:39:52 -08:00
Matthew Brost	58624c195b	drm/xe: Do not deregister queues in TDR Deregistering queues in the TDR introduces unnecessary complexity, requiring reference-counting techniques to function correctly, particularly to prevent use-after-free (UAF) issues while a deregistration initiated from the TDR is in progress. All that's needed in the TDR is to kick the queue off the hardware, which is achieved by disabling scheduling. Queue deregistration should be handled in a single, well-defined point in the cleanup path, tied to the queue's reference count. v4: - Explain why extra ref were needed prior to this patch (Niranjana) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20260110012739.2888434-5-matthew.brost@intel.com	2026-01-10 13:39:52 -08:00
Matthew Brost	dd1ef5e245	drm/xe: Only toggle scheduling in TDR if GuC is running If the firmware is not running during TDR (e.g., when the driver is unloading), there's no need to toggle scheduling in the GuC. In such cases, skip this step. v4: - Bail on wait UC not running (Niranjana) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20260110012739.2888434-4-matthew.brost@intel.com	2026-01-10 13:39:52 -08:00
Matthew Brost	95f27831ee	drm/xe: Stop abusing DRM scheduler internals Use new pending job list iterator and new helper functions in Xe to avoid reaching into DRM scheduler internals. Part of this change involves removing pending jobs debug information from debugfs and devcoredump. As agreed, the pending job list should only be accessed when the scheduler is stopped. However, it's not straightforward to determine whether the scheduler is stopped from the shared debugfs/devcoredump code path. Additionally, the pending job list provides little useful information, as pending jobs can be inferred from seqnos and ring head/tail positions. Therefore, this debug information is being removed. v4: - Add comment around DRM_GPU_SCHED_STAT_NO_HANG (Niranjana) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patch.msgid.link/20260110012739.2888434-3-matthew.brost@intel.com	2026-01-10 13:39:50 -08:00
Matthew Brost	e70f43c21d	drm/xe: Add dedicated message lock Stop abusing DRM scheduler job list lock for messages, add dedicated message lock. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Acked-by: Philipp Stanner <phasta@kernel.org> Link: https://patch.msgid.link/20260110012739.2888434-2-matthew.brost@intel.com	2026-01-10 13:38:47 -08:00
Xin Wang	98466abe4e	drm/xe: Allow compressible surfaces to be 1-way coherent Previously, compressible surfaces were required to be non-coherent (allocated as WC) because compression and coherency were mutually exclusive. Starting with Xe3, hardware supports combining compression with 1-way coherency, allowing compressible surfaces to be allocated as WB memory. This provides applications with more efficient memory allocation by avoiding WC allocation overhead that can cause system stuttering and memory management challenges. The implementation adds support for compressed+coherent PAT entry for the xe3_lpg devices and updates the driver logic to handle the new compression capabilities. v2: (Matthew Auld) - Improved error handling with XE_IOCTL_DBG() - Enhanced documentation and comments - Fixed xe_bo_needs_ccs_pages() outdated compression assumptions v3: - Improve WB compression support detection by checking PAT table instead of version check v4: - Add XE_CACHE_WB_COMPRESSION, which simplifies the logic. v5: - Use U16_MAX for the invalid PAT index. (Matthew Auld) Bspec: 71582, 59361, 59399 Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Xin Wang <x.wang@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260109093007.546784-1-x.wang@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-01-09 14:55:58 -08:00
Jani Nikula	72f654f424	drm/xe: improve header check Improve header check: Remove unused -DHDRTEST. Include the header twice to check for include guards. Run kernel-doc on the header. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260107155401.2379127-5-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2026-01-09 16:42:39 +02:00
Jani Nikula	b3a7767989	drm/xe/vm: fix xe_vm_validation_exec() kernel-doc Fix kernel-doc warnings on xe_vm_validation_exec(): Warning: ../drivers/gpu/drm/xe/xe_vm.h:392 expecting prototype for xe_vm_set_validation_exec(). Prototype was for xe_vm_validation_exec() instead Fixes: `0131514f97` ("drm/xe: Pass down drm_exec context to validation") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260107155401.2379127-4-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2026-01-09 16:42:39 +02:00
Jani Nikula	a857e61029	drm/xe/xe_late_bind_fw: fix enum xe_late_bind_fw_id kernel-doc Fix kernel-doc warnings on enum xe_late_bind_fw_id: Warning: ../drivers/gpu/drm/xe/xe_late_bind_fw_types.h:19 cannot understand function prototype: 'enum xe_late_bind_fw_id' Fixes: `45832bf9c1` ("drm/xe/xe_late_bind_fw: Initialize late binding firmware") Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patch.msgid.link/20260107155401.2379127-3-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2026-01-09 16:42:38 +02:00
Jani Nikula	44393331c7	drm/xe/vf: fix struct xe_gt_sriov_vf_migration kernel-doc Fix kernel-doc warnings on struct xe_gt_sriov_vf_migration: Warning: ../drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h:47 cannot understand function prototype: 'struct xe_gt_sriov_vf_migration' Fixes: `e1d2e2d878` ("drm/xe/vf: Add xe_gt_recovery_pending helper") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260107155401.2379127-2-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2026-01-09 16:42:38 +02:00
Jani Nikula	4cdcfa64b6	drm/xe/guc: fix struct guc_lfd_file_header kernel-doc Fix kernel-doc warnings on struct guc_lfd_file_header: Warning: ../drivers/gpu/drm/xe/abi/guc_lfd_abi.h:168 expecting prototype for struct guc_logfile_header. Prototype was for struct guc_lfd_file_header instead Fixes: `7eeb0e5408` ("drm/xe/guc: Add LFD related abi definitions") Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260107155401.2379127-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2026-01-09 16:42:38 +02:00
Brian Nguyen	2e08feebe0	drm/xe: Add page reclamation related stats Add page reclaim list (PRL) related stats to GT stats to assist in debugging and tuning of page reclaim related actions. Include counters of page sizes added to PRL and if PRL action is issued. v2: - Add PRL_ABORTED_COUNT stats and corresponding changes. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260107010447.4125005-10-brian3.nguyen@intel.com	2026-01-08 14:33:34 -08:00
Brian Nguyen	83b914f972	drm/xe: Fix page reclaim entry handling for large pages For 64KB pages, XE_PTE_PS64 is defined for all consecutive 4KB pages and are all considered leaf nodes, so existing check was falsely adding multiple 64KB pages to PRL. For larger entries such as 2MB PDE, the check for pte->base.children is insufficient since this array is always defined for page directory, level 1 and above, so perform a check on the entry itself pointing to the correct page. For unmaps, if the range is properly covered by the page full directory, page walker may finish without walking to the leaf nodes. For example, a 1G range can be fully covered by 512 2MB pages if alignment allows. In this case, the page walker will walk until it reaches this corresponding directory which can correlate to the 1GB range. Page walker will simply complete its walk and the individual 2MB PDE leaves won't get accessed. In this case, PRL invalidation is also required, so add a check to see if pt entry cover the entire range since the walker will complete the walk. There are possible race conditions that will cause driver to read a pte that hasn't been written to yet. The 2 scenarios are: - Another issued TLB invalidation such as from userptr or MMU notifier. - Dependencies on original bind that has yet to be executed with an unbind on that job. The expectation is these race conditions are likely rare cases so simply perform a fallback to full PPC flush invalidation instead. v2: - Reword commit and updated zero-pte handling. (Matthew B) v3: - Rework if statement for abort case with additional comments. (Matthew B) Fixes: `b912138df2` ("drm/xe: Create page reclaim list on unbind") Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260107010447.4125005-9-brian3.nguyen@intel.com	2026-01-08 14:33:32 -08:00
Brian Nguyen	7a0e86e3c9	drm/xe: Add explicit abort page reclaim list PRLs could be invalidated to indicate its getting dropped from current scope but are still valid. So standardize calls and add abort to clearly define when an invalidation is a real abort and PRL should fallback. v3: - Update abort function to macro. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260107010447.4125005-8-brian3.nguyen@intel.com	2026-01-08 14:33:31 -08:00
Brian Nguyen	52cb4a595f	drm/xe: Remove debug comment in page reclaim Drop debug comment erronenously added in patch commit. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260107010447.4125005-7-brian3.nguyen@intel.com	2026-01-08 14:33:30 -08:00
Marco Crivellari	aa39abc08e	drm/xe: fix WQ_MEM_RECLAIM passed as max_active to alloc_workqueue() Workqueue xe-ggtt-wq has been allocated using WQ_MEM_RECLAIM, but the flag has been passed as 3rd parameter (max_active) instead of 2nd (flags) creating the workqueue as per-cpu with max_active = 8 (the WQ_MEM_RECLAIM value). So change this by set WQ_MEM_RECLAIM as the 2nd parameter with a default max_active. Fixes: `60df57e496` ("drm/xe: Mark GGTT work queue with WQ_MEM_RECLAIM") Cc: stable@vger.kernel.org Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260108180148.423062-1-marco.crivellari@suse.com	2026-01-08 14:33:29 -08:00
Osama Abdelkader	351fa2ff09	drm/xe: Add missing newlines to drm_warn messages The drm_warn() calls in the default cases of various switch statements in xe_vm.c were missing trailing newlines, which can cause log messages to be concatenated with subsequent output. Add '\n' to all affected messages. Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Link: https://patch.msgid.link/20251224212116.59021-1-osama.abdelkader@gmail.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-08 16:16:02 -05:00
Lukasz Laguna	96d45e34f8	drm/xe/pf: Allow upon-any-hang wedged mode only in debug config The GuC reset policy is global, so disabling it on PF can affect all running VFs. To avoid unintended side effects, restrict setting upon-any-hang (2) wedged mode on the PF to debug builds only. Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260107174741.29163-5-lukasz.laguna@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-08 16:07:59 -05:00
Lukasz Laguna	43d78aca8e	drm/xe/vf: Disallow setting wedged mode to upon-any-hang In upon-any-hang (2) wedged mode, engine resets need to be disabled, which requires changing the GuC reset policy. VFs are not permitted to do that. Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260107174741.29163-4-lukasz.laguna@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-08 16:07:53 -05:00
Lukasz Laguna	0f13dead4e	drm/xe: Update wedged.mode only after successful reset policy change Previously, the driver's internal wedged.mode state was updated without verifying whether the corresponding engine reset policy update in GuC succeeded. This could leave the driver reporting a wedged.mode state that doesn't match the actual reset behavior programmed in GuC. With this change, the reset policy is updated first, and the driver's wedged.mode state is modified only if the policy update succeeds on all available GTs. This patch also introduces two functional improvements: - The policy is sent to GuC only when a change is required. An update is needed only when entering or leaving XE_WEDGED_MODE_UPON_ANY_HANG, because only in that case the reset policy changes. For example, switching between XE_WEDGED_MODE_UPON_CRITICAL_ERROR and XE_WEDGED_MODE_NEVER doesn't affect the reset policy, so there is no need to send the same value to GuC. - An inconsistent_reset flag is added to track cases where reset policy update succeeds only on a subset of GTs. If such inconsistency is detected, future wedged mode configuration will force a retry of the reset policy update to restore a consistent state across all GTs. Fixes: `6b8ef44cc0` ("drm/xe: Introduce the wedged_mode debugfs") Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Link: https://patch.msgid.link/20260107174741.29163-3-lukasz.laguna@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-08 16:07:42 -05:00
Lukasz Laguna	17d3c3365b	drm/xe: Validate wedged_mode parameter and define enum for modes Check correctness of the wedged_mode parameter input to ensure only supported values are accepted. Additionally, replace magic numbers with a clearly defined enum. Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260107174741.29163-2-lukasz.laguna@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-08 16:07:07 -05:00
Raag Jadav	644673a69f	drm/xe/pm: Handle GT resume failure We've been historically ignoring GT resume failure. Since the function can return error, handle it properly. v2: Bring up display before bailing (Matt Roper, Rodrigo) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20251220073657.166810-1-raag.jadav@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-08 15:35:36 -05:00
Matt Roper	4e88de313f	drm/xe/nvls: Define GuC firmware for NVL-S Although NVL-S has a similar Xe3 to PTL/WCL, it requires a unique GuC firmware. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20251016-xe3p-v3-12-3dd173a3097a@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patch.msgid.link/20260108181956.1254908-9-julia.filipchuk@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-08 15:15:55 -05:00
Matthew Brost	10dd1eaa80	drm/pagemap: Disable device-to-device migration Device-to-device migration is causing xe_exec_system_allocator --r raceno* to intermittently fail with engine resets and a kernel hang on a page lock. This should work but is clearly buggy somewhere. Disable device-to-device migration in the interim until the issue can be root-caused. The only downside of disabling device-to-device migration is that memory will bounce through system memory during migration. However, this path should be rare, as it only occurs when madvise attributes are changed or atomics are used. Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: `ec265e1f1c` ("drm/pagemap: Support source migration over interconnect") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20260107182716.2236607-3-matthew.brost@intel.com	2026-01-07 21:29:40 -08:00
Matthew Brost	3902846af3	drm/pagemap Fix error paths in drm_pagemap_migrate_to_devmem Avoid unlocking and putting device pages unless they were successfully locked, and do not calculate migrated_pages on error paths. Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: `75af93b3f5` ("drm/pagemap, drm/xe: Support destination migration over interconnect") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20260107182716.2236607-2-matthew.brost@intel.com	2026-01-07 21:29:39 -08:00
Matthew Brost	cc54eabdfb	drm/xe: Adjust page count tracepoints in shrinker Page accounting can change via the shrinker without calling xe_ttm_tt_unpopulate(), which normally updates page count tracepoints through update_global_total_pages. Add a call to update_global_total_pages when the shrinker successfully shrinks a BO. v2: - Don't adjust global accounting when pinning (Stuart) Cc: stable@vger.kernel.org Fixes: `ce3d39fae3` ("drm/xe/bo: add GPU memory trace points") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260107205732.2267541-1-matthew.brost@intel.com	2026-01-07 21:29:38 -08:00
Rodrigo Vivi	3f0e3af468	Merge drm/drm-next into drm-xe-next Bring some drm-scheduler patches to Xe. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-01-07 16:49:20 -05:00
Matthew Brost	7c0c19c076	drm/xe: Validate preferred system memory placement in xe_svm_range_validate Ensure preferred system memory placement is checked in xe_svm_range_validate when dpagemap is NULL. Without this check, a prefetch to system memory may become a no-op because device memory is considered a valid placement. Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: `238dbc9d9f` ("drm/xe: Use the vma attibute drm_pagemap to select where to migrate") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patch.msgid.link/20260106213443.1866797-1-matthew.brost@intel.com	2026-01-07 09:27:58 -08:00
Niranjana Vishwanathapura	051114652b	drm/xe/doc: Remove KEEP_ACTIVE feature The KEEP_ACTIVE feature is being reverted, update documentation. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260106191051.2866538-6-niranjana.vishwanathapura@intel.com	2026-01-06 11:13:56 -08:00
Niranjana Vishwanathapura	caaed1dda7	Revert "drm/xe/multi_queue: Support active group after primary is destroyed" This reverts commit `3131a43ecb`. There is no must have requirement for this feature from Compute UMD. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260106191051.2866538-5-niranjana.vishwanathapura@intel.com	2026-01-06 11:13:54 -08:00
Raag Jadav	e70711be0d	drm/xe/i2c: Force polling mode in survivability SGUnit interrupts are not initialized in survivability. Force I2C controller to polling mode while in survivability. v2: Use helper function instead of manual check (Riana) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://patch.msgid.link/20260105080750.16605-1-raag.jadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-01-05 07:43:22 -08:00
Dave Airlie	59260fe582	Merge tag 'drm-xe-next-2025-12-30' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Core Changes: - Dynamic pagemaps and multi-device SVM (Thomas) Driver Changes: - Introduce SRIOV scheduler Groups (Daniele) - Configure migration queue as low latency (Francois) - Don't use absolute path in generated header comment (Calvin Owens) - Add SoC remapper support for system controller (Umesh) - Insert compiler barriers in GuC code (Jonathan) - Rebar updates (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/aVOiULyYdnFbq-JB@fedora	2026-01-01 17:00:59 +10:00
Dave Airlie	9ec3c8ee16	Merge tag 'drm-xe-next-2025-12-19' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next [airlied: fix guc submit double definition] UAPI Changes: - Multi-Queue support (Niranjana) - Add DRM_XE_EXEC_QUEUE_SET_HANG_REPLAY_STATE (Brost) - Add NO_COMPRESSION BO flag and query capability (Sanjay) - Add gt_id to struct drm_xe_oa_unit (Ashutosh) - Expose MERT OA unit (Ashutosh) - Sysfs Survivability refactor (Riana) Cross-subsystem Changes: - VFIO: Add device specific vfio_pci driver variant for Intel graphics (Winiarski) Driver Changes: - MAINTAINERS update (Lucas -> Matt) - Add helper to query compression enable status (Xin) - Xe_VM fixes and updates (Shuicheng, Himal) - Documentation fixes (Winiarski, Swaraj, Niranjana) - Kunit fix (Roper) - Fix potential leaks, uaf, null derref, and oversized allocations (Shuicheng, Sanjay, Mika, Tapani) - Other minor fixes like kbuild duplication and sysfs_emit (Shuicheng, Madhur) - Handle msix vector0 interrupt (Venkata) - Scope-based forcewake and runtime PM (Roper, Raag) - GuC/HuC related fixes and refactors (Lucas, Zhanjun, Brost, Julia, Wajdeczko) - Fix conversion from clock ticks to milliseconds (Harish) - SRIOV PF PF: Add support for MERT (Lukasz) - Enable SR-IOV VF migration and other SRIOV updates (Winiarski, Satya, Brost, Wajdeczko, Piotr, Tomasz, Daniele) - Optimize runtime suspend/resume and other PM improvements (Raag) - Some W/a additions and updates (Bala, Harish, Roper) - Use for_each_tlb_inval() to calculate invalidation fences (Roper) - Fix VFIO link error (Arnd) - Fix ix drm_gpusvm_init() arguments (Arnd) - Other OA refactor (Ashutosh) - Refactor PAT and expose debugfs (Xin) - Enable Indirect Ring State for xe3p_xpc (Niranjana) - MEI interrupt fix (Junxiao) - Add stats for mode switching on hw_engine_group (Francois) - DMA-Buf related changes (Thomas) - Multi Queue feature support (Niranjana) - Enable I2C controller for Crescent Island (Raag) - Enable NVM for Crescent Island (Sasha) - Increase TDF timeout (Jagmeet) - Restore engine registers before restarting schedulers after GT reset (Jan) - Page Reclamation Support for Xe3p Platforms (Brian, Brost, Oak) - Fix performance when pagefaults and 3d/display share resources (Brost) - More OA MERT work (Ashutosh) - Fix return values (Dan) - Some log level and messages improvements (Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aUXUhEgzs6hDLQuu@intel.com	2025-12-27 17:22:23 +10:00
Dave Airlie	c5fb82d113	Merge tag 'drm-intel-next-2025-12-19' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Beyond Display related: - Switch to use kernel standard fault injection in i915 (Juha-Pekka) Display uAPI related: - Display uapi vs. hw state fixes (Ville) - Expose sharpness only if num_scalers is >= 2 (Nemesa) Display related: - More display driver refactor and clean-ups, specially towards separation (Jani) - Add initial support Xe3p_LPD for NVL (Gustavo, Sai, ) - BMG FBC W/a (Vinod) - RPM fix (Dibin) - Add MTL+ platforms to support dpll framework (Mika, Imre) - Other PLL related fixes (Imre) - Fix DIMM_S DRAM decoding on ICL (Ville) - Async flip refactor (Ville, Jouni) - Go back to using AUX interrupts (Ville) - Reduce severity of failed DII FEC enabling (Grzelak) - Enable system cache support for FBC (Vinod) - Move PSR/Panel Replay sink data into intel_connector and other PSR changes (Jouni) - Detect AuxCCS support via display parent interface (Tvrtko) - Clean up link BW/DSC slice config computation(Imre) - Toggle powerdown states for C10 on HDMI (Gustavo) - Add parent interface for PC8 forcewake tricks (Ville) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aUW3bVDdE63aSFOJ@intel.com	2025-12-27 16:27:04 +10:00
Dave Airlie	7bc0f871f9	Merge tag 'drm-misc-next-2025-12-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.20: Core Changes: - dma-buf: Add tracepoints - sched: Introduce new helpers Driver Changes: - amdxdna: Enable hardware context priority, Remove (obsolete and never public) NPU2 Support, Race condition fix - rockchip: Add RK3368 HDMI Support - rz-du: Add RZ/V2H(P) MIPI-DSI Support - panels: - st7571: Introduce SPI support - New panels: Sitronix ST7920, Samsung LTL106HL02, LG LH546WF1-ED01, HannStar HSD156JUW2 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20251219-arcane-quaint-skunk-e383b0@houat	2025-12-26 19:00:41 +10:00
Dave Airlie	6c8e404891	Merge tag 'drm-misc-next-2025-12-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.19: UAPI Changes: - panfrost: Add PANFROST_BO_SYNC ioctl - panthor: Add PANTHOR_BO_SYNC ioctl Core Changes: - atomic: Add drm_device pointer to drm_private_obj - bridge: Introduce drm_bridge_unplug, drm_bridge_enter, and drm_bridge_exit - dma-buf: Improve sg_table debugging - dma-fence: Add new helpers, and use them when needed - dp_mst: Avoid out-of-bounds access with VCPI==0 - gem: Reduce page table overhead with transparent huge pages - panic: Report invalid panic modes - sched: Add TODO entries - ttm: Various cleanups - vblank: Various refactoring and cleanups - Kconfig cleanups - Removed support for kdb Driver Changes: - amdxdna: Fix race conditions at suspend, Improve handling of zero tail pointers, Fix cu_idx being overwritten during command setup - ast: Support imported cursor buffers - - panthor: Enable timestamp propagation, Multiple improvements and fixes to improve the overall robustness, notably of the scheduler. - panels: - panel-edp: Support for CSW MNE007QB3-1, AUO B140HAN06.4, AUO B140QAX01.H Signed-off-by: Dave Airlie <airlied@redhat.com> [airlied: fix mm conflict] From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20251212-spectacular-agama-of-abracadabra-aaef32@penduick	2025-12-26 18:15:33 +10:00
Lucas De Marchi	0b075f8293	drm/xe: Improve rebar log messages Some minor improvements to the log messages in the rebar logic: use xe-oriented printk, switch unit from M to MiB in a few places for consistency and use ilog2(SZ_1M) for clarity. Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251219211650.1908961-6-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2025-12-24 07:59:35 -08:00
Lucas De Marchi	382876afa7	drm/xe: Move rebar to its own file Now that xe_pci.c calls the rebar directly, it doesn't make sense to keep it in xe_vram.c since it's closer to the PCI initialization than to the VRAM. Move it to its own file. While at it, add a better comment to document the possible values for the vram_bar_size module parameter. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/20251219211650.1908961-5-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2025-12-24 07:59:35 -08:00
Jonathan Cavitt	ac1317df03	drm/xe/guc: READ/WRITE_ONCE ct->state Use READ_ONCE and WRITE_ONCE when operating on ct->state to prevent the compiler form ignoring important modifications to its value. Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251222201957.63245-6-jonathan.cavitt@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-12-23 16:43:49 -05:00
Jonathan Cavitt	b5179dbd1c	drm/xe/guc: READ/WRITE_ONCE g2h_fence->done Use READ_ONCE and WRITE_ONCE when operating on g2h_fence->done to prevent the compiler from ignoring important modifications to its value. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251222201957.63245-5-jonathan.cavitt@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-12-23 16:43:49 -05:00
Umesh Nerlige Ramappa	c3a613a039	drm/xe/soc_remapper: Add system controller config for SoC remapper Define system controller config bits and helpers for SoC remapper. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patch.msgid.link/20251223183943.3175941-8-umesh.nerlige.ramappa@intel.com	2025-12-23 11:43:51 -08:00
Umesh Nerlige Ramappa	32eab46a61	drm/xe/soc_remapper: Use SoC remapper helper from VSEC code Since different drivers can use SoC remapper, modify VSEC code to access SoC remapper via a helper that would synchronize such accesses. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patch.msgid.link/20251223183943.3175941-7-umesh.nerlige.ramappa@intel.com	2025-12-23 11:43:49 -08:00
Umesh Nerlige Ramappa	a9f88c68f8	drm/xe/soc_remapper: Initialize SoC remapper during Xe probe SoC remapper is used to map different HW functions in the SoC to their respective drivers. Initialize SoC remapper during driver load. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patch.msgid.link/20251223183943.3175941-6-umesh.nerlige.ramappa@intel.com	2025-12-23 11:43:46 -08:00
Calvin Owens	e67870321a	drm/xe: Don't use absolute path in generated header comment Building the XE driver through Yocto throws this QA warning: WARNING: mc🏠linux-stable-6.17-r0 do_package_qa: QA Issue: File /usr/src/debug/linux-stable/6.17/drivers/gpu/drm/xe/generated/xe_device_wa_oob.h in package linux-stable-src contains reference to TMPDIR [buildpaths] WARNING: mc🏠linux-stable-6.17-r0 do_package_qa: QA Issue: File /usr/src/debug/linux-stable/6.17/drivers/gpu/drm/xe/generated/xe_wa_oob.h in package linux-stable-src contains reference to TMPDIR [buildpaths] ...because the comment at the top of the generated header contains the absolute path to the rules file at build time: * This file was generated from rules: /home/calvinow/git/meta-house/build/tmp-house/work-shared/nuc14rvhu7/kernel-source/drivers/gpu/drm/xe/xe_device_wa_oob.rules Fix this minor annoyance by putting the basename of the rules file in the generated comment instead of the absolute path, so the generated header contents no longer depend on the location of the kernel source. Signed-off-by: Calvin Owens <calvin@wbinvd.org> Link: https://patch.msgid.link/20251222165441.516102-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-12-23 10:05:06 -05:00
Francois Dugast	15e096960a	drm/xe/migrate: Configure migration queue as low latency Commit `5488bec96b` ("drm/xe/uapi: Use hint for guc to set GT frequency") introduced low latency hint for use by user space when creating an exec queue. This instructs SLPC to ramp the GT frequency aggressively. SVM relies on an internal exec queue to migrate memory upon page faults. This change creates this exec queue with the low latency hint to speed up migration. This should not impact systems where GT frequency is set over sysfs, or with long running workloads which give enough time for the frequency to ramp up. An example of memory access pattern that shows an improvement of SVM performance is running hundreds of times IGT eu-fault-2m-once-device in xe_exec_system_allocator. The copy duration provided by GT stats in svm_2M_device_copy_us shows per GPU page fault: ~ 165 μs without low latency hint ~ 130 μs with low latency hint Suggested-by: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20251223115327.49555-1-francois.dugast@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-12-23 10:03:48 -05:00
Thomas Hellström	0620837490	drm/xe/svm: Serialize migration to device if racing Introduce an rw-semaphore to serialize migration to device if it's likely that migration races with another device migration of the same CPU address space range. This is a temporary fix to attempt to mitigate a livelock that might happen if many devices try to migrate a range at the same time, and it affects only devices using the xe driver. A longer term fix is probably improvements in the core mm migration layer. Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251219113320.183860-25-thomas.hellstrom@linux.intel.com	2025-12-23 10:00:49 +01:00
Thomas Hellström	ec265e1f1c	drm/pagemap: Support source migration over interconnect Support source interconnect migration by using the copy_to_ram() op of the source device private pages. Source interconnect migration is required to flush the L2 cache of the source device, which among other things is a requirement for correct global atomic operation. It also enables the source GPU to potentially decompress any compressed content which is not understood by peers, and finally for the PCIe case, it's expected that writes over PCIe will be faster than reads. The implementation can probably be improved by coalescing subregions with the same source. v5: - Update waiting for the pre_migrate_fence and comments around that, previously in another patch. (Himal). - Actually select device private pages to migrate when source_peer_migrates is true. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe. Link: https://patch.msgid.link/20251219113320.183860-24-thomas.hellstrom@linux.intel.com	2025-12-23 10:00:49 +01:00
Thomas Hellström	75af93b3f5	drm/pagemap, drm/xe: Support destination migration over interconnect Support destination migration over interconnect when migrating from device-private pages with the same dev_pagemap owner. Since we now also collect device-private pages to migrate, also abort migration if the range to migrate is already fully populated with pages from the desired pagemap. Finally return -EBUSY from drm_pagemap_populate_mm() if the migration can't be completed without first migrating all pages in the range to system. It is expected that the caller will perform that before retrying the call to drm_pagemap_populate_mm(). v3: - Fix a bug where the p2p dma-address was never used. - Postpone enabling destination interconnect migration, since xe devices require source interconnect migration to ensure the source L2 cache is flushed at migration time. - Update the drm_pagemap_migrate_to_devmem() interface to pass migration details. v4: - Define XE_INTERCONNECT_P2P unconditionally (CI) - Include a missing header (CI) v5: - Use page order increments where possible (Matt Brost). - Fix a negated value of can_migrate_same_pagemap. - Move removal of some dead code to a separate patch (Matt Brost). - Remove an unnecessary zdd get() and put() (Matt Brost). Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe. Link: https://patch.msgid.link/20251219113320.183860-23-thomas.hellstrom@linux.intel.com	2025-12-23 10:00:49 +01:00
Thomas Hellström	0471ed20df	drm/xe: Use drm_gpusvm_scan_mm() Use drm_gpusvm_scan_mm() to avoid unnecessarily calling into drm_pagemap_populate_mm(); v3: - New patch. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patch.msgid.link/20251219113320.183860-22-thomas.hellstrom@linux.intel.com	2025-12-23 10:00:48 +01:00

1 2 3 4 5 ...

1412912 Commits