linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-17 02:32:57 -04:00

Author	SHA1	Message	Date
Satyanarayana K V P	bcd768d787	drm/xe/vf: Fix fs_reclaim warning with CCS save/restore BB allocation CCS save/restore batch buffers are attached during BO allocation and detached during BO teardown. The shrinker triggers xe_bo_move(), which is used for both allocation and deletion paths. When BO allocation and shrinking occur concurrently, a circular locking dependency involving fs_reclaim and swap_guard can occur, leading to a deadlock such as: WARNING: possible circular locking dependency detected. CPU0: lock(fs_reclaim); CPU1: lock(&sa_manager->swap_guard); CPU1: lock(fs_reclaim); CPU0: lock(&sa_manager->swap_guard); *** DEADLOCK *** To avoid this, the BB pointer and SA are allocated using xe_bb_alloc() before taking lock and SA is initialized using xe_bb_init() preventing reclaim from being invoked in this context. Fixes: `864690cf4d` ("drm/xe/vf: Attach and detach CCS copy commands with BO") Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260220055519.2485681-7-satyanarayana.k.v.p@intel.com	2026-02-20 10:54:03 -08:00
Satyanarayana K V P	16843e6638	drm/sa: Split drm_suballoc_new() into SA alloc and init helpers drm_suballoc_new() currently both allocates the SA object using kmalloc() and searches for a suitable hole in the sub-allocator for the requested size. If SA allocation is done by holding sub-allocator mutex, this design can lead to reclaim safety issues. By splitting the kmalloc() step outside of the critical section, we allow the memory allocation to use GFP_KERNEL (reclaim-safe) while ensuring that the initialization step that holds reclaim-tainted locks (sub-allocator mutex) operates in a reclaim-unsafe context with pre-allocated memory. This separation prevents potential deadlocks where memory reclaim could attempt to acquire locks that are already held during the sub-allocator operations. Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Christian König <christian.koenig@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: Maarten Lankhorst <dev@lankhorst.se> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Maarten Lankhorst <dev@lankhorst.se> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260220055519.2485681-6-satyanarayana.k.v.p@intel.com	2026-02-20 10:54:02 -08:00
Shuicheng Lin	a5d5634cde	drm/xe/sync: Fix user fence leak on alloc failure When dma_fence_chain_alloc() fails, properly release the user fence reference to prevent a memory leak. Fixes: `adda4e855a` ("drm/xe: Enforce correct user fence signaling order using") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260219233516.2938172-6-shuicheng.lin@intel.com	2026-02-20 10:49:08 -08:00
Shuicheng Lin	f939bdd920	drm/xe/sync: Cleanup partially initialized sync on parse failure xe_sync_entry_parse() can allocate references (syncobj, fence, chain fence, or user fence) before hitting a later failure path. Several of those paths returned directly, leaving partially initialized state and leaking refs. Route these error paths through a common free_sync label and call xe_sync_entry_cleanup(sync) before returning the error. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260219233516.2938172-5-shuicheng.lin@intel.com	2026-02-20 10:49:07 -08:00
Michal Wajdeczko	9ca192cbcd	drm/xe/pf: Add documentation for vram_quota Add initial documentation for recently added VRAM provisioning Xe driver specific SR-IOV sysfs files under device/sriov_admin. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260218205553.3561-11-michal.wajdeczko@intel.com	2026-02-20 15:50:08 +01:00
Michal Wajdeczko	d039fa856e	drm/xe/pf: Skip VRAM auto-provisioning if already provisioned In case VF's VRAM provisioning using sysfs is done by the admin prior to VFs enabling, this provisioning will be lost as PF will run VRAM auto-provisioning anyway. To avoid that, skip this auto-provisioning if any VF has been already provisioned with VRAM. To help admin find any mistakes, add diagnostics messages about which VFs were provisioned with VRAM and which were missed. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-10-michal.wajdeczko@intel.com	2026-02-20 15:50:07 +01:00
Michal Wajdeczko	67a716b693	drm/xe/pf: Prefer guard(mutex) when doing fair LMEM provisioning We will add more code there and with guard() it will be easier to avoid mistakes in unlocking. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-9-michal.wajdeczko@intel.com	2026-02-20 15:50:06 +01:00
Michal Wajdeczko	62acbb1dd5	drm/xe/pf: Don't check for empty config We already turn off VFs auto-provisioning once we detect manual VFs provisioning over the debugfs, so we can skip additional check for all VFs configs being still empty. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-8-michal.wajdeczko@intel.com	2026-02-20 15:50:05 +01:00
Michal Wajdeczko	cbe29da6f7	drm/xe/tests: Add KUnit tests for new VRAM fair provisioning Add basic test cases to check outcome of the fair VRAM provisioning for regular and admin-only PF mode. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-7-michal.wajdeczko@intel.com	2026-02-20 15:50:04 +01:00
Michal Wajdeczko	81d417d56a	drm/xe/pf: Use migration-friendly VRAM auto-provisioning Instead of trying very hard to find the largest fair VRAM (aka LMEM) size that could be allocated for VFs on the current tile, pick some smaller rounded down to power-of-two value that is more likely to be provisioned in the same manner by the other PF instances. In some cases, the outcome of above calculation might not be optimal, but it's expected that admin will do fine-tuning using sysfs files. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-6-michal.wajdeczko@intel.com	2026-02-20 15:50:03 +01:00
Michal Wajdeczko	b1d2746aa5	drm/xe/pf: Allow to change VFs VRAM quota using sysfs On current discrete platforms, PF will provision all VFs with a fair amount of the VRAM (LMEM) during VFs enabling. However, in some cases this automatic VRAM provisioning might be either non-reproducible or sub-optimal. This could break VF's migration or impact performance. Expose per-VF VRAM quota read-write sysfs attributes to allow admin change default VRAM provisioning performed by the PF. /sys/bus/pci/drivers/xe/BDF/ ├── sriov_admin/ ├── .bulk_profile │ └── vram_quota [RW] unsigned integer ├── vf1/ │ └── profile │ └── vram_quota [RW] unsigned integer ├── vf2/ │ └── profile │ └── vram_quota [RW] unsigned integer Above values represent total provisioned VRAM from all tiles where VFs were assigned, and currently it's from all tiles always. Note that changing VRAM provisioning is only possible when VF is not running, otherwise GuC will complain. To make sure that given VF is idle, triggering VF FLR might be needed. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260218205553.3561-5-michal.wajdeczko@intel.com	2026-02-20 15:50:02 +01:00
Michal Wajdeczko	5ae3c886a1	drm/xe/pf: Add functions for VRAM provisioning We already have functions to configure VF LMEM (aka VRAM) on the tile/GT level, used by the auto-provisioning and debugfs, but we also need functions that will work on the device level that will configure VRAM on all tiles at once. We will use these new functions in upcoming patch. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-4-michal.wajdeczko@intel.com	2026-02-20 15:50:01 +01:00
Michal Wajdeczko	146f25b40c	drm/xe/pf: Add locked variants of VRAM configuration functions We already have a few functions to configure LMEM (aka VRAM) but they all take the master mutex. Split them and expose locked variants to allow use by a caller who already holds this mutex. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-3-michal.wajdeczko@intel.com	2026-02-20 15:49:59 +01:00
Michal Wajdeczko	2d892455f3	drm/xe/pf: Expose LMTT page size The underlying LMTT implementation already provides the info about the page size it is using. There is no need to have a separate helper function that is making assumption about the required size. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patch.msgid.link/20260218205553.3561-2-michal.wajdeczko@intel.com	2026-02-20 15:49:58 +01:00
Tomasz Lis	c2366539d3	drm/xe/guc: Increase GuC log sizes in debug builds Increase event log size for GuC debug to 16MB, and for general debug to 8MB. This allows for useful debug even if performance-affecting DRM_XE_DEBUG_GUC is not enabled. Without this change, GuC logs gathered by CI are useless for debug due to limited size, which translates to time frame not even able to cover cleanup after test. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260213140008.1473400-1-tomasz.lis@intel.com	2026-02-20 13:11:09 +01:00
Harish Chegondi	7c9b2de8a9	drm/xe/xe2lpg: Extend Wa_18041344222 to graphics IP 20.04 Apply WA 18041344222 to Xe2 LPG graphics IP version 20.04 too. Bspec: 56024 Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/6e66746246439249a278f3d157f06071d83504b6.1770760591.git.harish.chegondi@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 16:04:24 -08:00
Harish Chegondi	0ffe9dcf26	drm/xe/xe3: Remove SRIOV VF check for Wa_18041344222 Engine WAs are not applied for SRIOV VF, even though they are processed. Remove the SRIOV VF check. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/5879396bf202b64d9b5c4cb8c720f3e65d358fc1.1770760591.git.harish.chegondi@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 16:04:24 -08:00
Harish Chegondi	a800b95c24	drm/xe/xe2hpg: Remove SRIOV VF check for Wa_18041344222 Engine WAs are not applied for SRIOV VF, even though they are processed. Remove the SRIOV VF check. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/4043a30d6a971cda3c13145e081e4eed7cc4e440.1770760591.git.harish.chegondi@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 16:04:24 -08:00
Nitin Gote	9812865cc6	drm/xe/xe3p_lpg: Add Wa_14026781792 Wa_14026781792 applies Xe3p_LPG graphics version 35.10. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260219082931.2199618-2-nitin.r.gote@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 15:56:27 -08:00
Matt Roper	764af38af2	drm/xe/reg_sr: Allow register_save_restore_check debugfs to verify LRC values reg_sr programming that applies to an engine's LRC cannot be verified by a simple CPU-based register readout because the reg_sr's values may not be in effect if no context is executing on the hardware at the time we check. Instead, we should verify correct reg_sr application by searching for the register in the default_lrc. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260218-sr_verify-v4-4-35d6deeb3421@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 07:31:05 -08:00
Matt Roper	e950b06014	drm/xe: Add facility to lookup the value of a register in a default LRC An LRC is stored in memory as a special batchbuffer that hardware will execute to re-load state when switching to the context; it's a collection of register values (encoded as MI_LOAD_REGISTER_IMM commands) and other state instructions (e.g., 3DSTATE_*). The value that will be loaded for a given register can be determined by parsing the batchbuffer to find MI_LRI commands and extracting the value from the offset/value pairs it contains. Add functions to do this, which will be used in a future patch to help verify that our expected reg_sr programming is in place. The implementation here returns the value as soon as it finds a match in the LRC. Technically a register could appear multiple times (either due to memory corruption or a hardware defect) and the last value encountered would be the one in effect when the context resumes execution. We can adjust the logic to keep looking and return the last match instead of first in the future if we encounter real-world cases where this would assist with debugging. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260218-sr_verify-v4-3-35d6deeb3421@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 07:31:05 -08:00
Matt Roper	d389489225	drm/xe/reg_sr: Add debugfs to verify status of reg_sr programming When applying save-restore register programming for workarounds, tuning settings, and general device configuration we assume the programming was successful. However there are a number of cases where the desired reg_sr programming can become lost: - workarounds implemented on the wrong RTP table might not get saved/restored at the right time leading to, for example, failure to re-apply the programming after engine resets - some hardware registers become "locked" and can no longer be updated after firmware or the driver finishes initializing them - sometimes the hardware teams just made a mistake when documenting the register and/or bits that needed to be programmed Add a debugfs entry that will read back the registers referenced on a GT's save-restore lists and print any cases where the desired programming is no longer in effect. Such cases might indicate the presence of a driver/firmware bug, might indicate that the documentation we were following has a mistake, or might be benign (occasionally registers have broken read-back capability preventing verification, but previous writes were still successful and effective). For now we only verify the GT and engine reg_sr lists. Verifying the LRC list will require checking the expected programming against the default_lrc contents, not the live registers (which may not reflect the reg_sr programming if no context is actively running). Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260218-sr_verify-v4-2-35d6deeb3421@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 07:31:05 -08:00
Matt Roper	a41ee215b5	drm/xe/reg_sr: Don't process gt/hwe lists in VF There are a few different reg_sr lists managed by the driver for workarounds/tuning: - gt->reg_sr - hwe->reg_sr - hwe->reg_lrc The first two are not relevant to SRIOV VFs; a VF KMD does not have access to the registers that appear on this list and it is the PF KMD's responsibility to apply such programming on behalf of the entire system. However the third list contains per-client values that the VF KMD needs to ensure are incorporated whenever a new LRC is created. Handling of reg_sr lists comes in two steps: processing an RTP table to build a reg_sr from the relevant entries, and then applying the contents of the reg_sr. Skipping the RTP processing (resulting in an empty reg_sr) or skipping the application of a reg_sr are both valid ways to avoid having a VF accidentally try to write registers it doesn't have access to. In commit `c19e705ec9` ("drm/xe/vf: Stop applying save-restore MMIOs if VF") and commit `92a5bd3024` ("drm/xe/vf: Unblock xe_rtp_process_to_sr for VFs") we adjusted the driver's behavior to always process the RTP table into a reg_sr and just skipped the application step. This works fine functionally, but can lead to confusion during debugging since facilities like the debugfs 'register-save-restore' will still report a bunch of registers that the VF KMD isn't actually trying to handle. It will also mislead other upcoming debug changes. Let's go back to skipping the RTP => reg_sr processing step, but only for GT / hwe tables this time. This will allow LRC reg_sr handling to continue to work, but will ensure that gt->reg_sr and hwe->reg_sr remain empty and that debugfs reporting more accurately reflects the KMD's behavior. v2: - Also skip the hwe processing in hw_engine_setup_default_state() and xe_reg_whitelist_process_engine(). v3: - Handle skipping via an additional parameter passed to xe_rtp_process_to_sr() rather than adding conditions at each callsite. (Ashutosh) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260218-sr_verify-v4-1-35d6deeb3421@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 07:31:05 -08:00
Matt Roper	6c2e331c91	drm/xe/wa: Steer RMW of MCR registers while building default LRC When generating the default LRC, if a register is not masked, we apply any save-restore programming necessary via a read-modify-write sequence that will ensure we only update the relevant bits/fields without clobbering the rest of the register. However some of the registers that need to be updated might be MCR registers which require steering to a non-terminated instance to ensure we can read back a valid, non-zero value. The steering of reads originating from a command streamer is controlled by register CS_MMIO_GROUP_INSTANCE_SELECT. Emit additional MI_LRI commands to update the steering before any RMW of an MCR register to ensure the reads are performed properly. Note that needing to perform a RMW of an MCR register while building the default LRC is pretty rare. Most of the MCR registers that are part of an engine's LRCs are also masked registers, so no MCR is necessary. Fixes: `f2f90989cc` ("drm/xe: Avoid reading RMW registers in emit_wa_job") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patch.msgid.link/20260206223058.387014-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-19 07:09:56 -08:00
Matthew Brost	9ff885ef8b	drm/xe: Convert GT stats to per-cpu counters Current GT statistics use atomic64_t counters. Atomic operations incur a global coherency penalty. Transition to dynamic per-cpu counters using alloc_percpu(). This allows stats to be incremented via this_cpu_add(), which compiles to a single non-locking instruction. This approach keeps the hot-path updates local to the CPU, avoiding expensive cross-core cache invalidation traffic. Use for_each_possible_cpu() during aggregation and clear operations to ensure data consistency across CPU hotplug events. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260217200552.596718-1-matthew.brost@intel.com	2026-02-17 18:12:10 -08:00
Karthik Poosa	48eb073c7d	drm/xe/hwmon: Prevent unintended VRAM channel creation Remove the unnecessary VRAM channel entry introduced in xe_hwmon_channel. Without this, adding any new hwmon channel causes extra VRAM channel to appear. This remained unnoticed earlier because VRAM was the final xe hwmon channel. v2: Use MAX_VRAM_CHANNELS with in_range() instead of CHANNEL_VRAM_N_MAX. (Raag) Fixes: `49a4983384` ("drm/xe/hwmon: Expose individual VRAM channel temperature") Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20260206081655.2115439-1-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-02-17 19:21:30 -05:00
Arnd Bergmann	95162db020	drm/pagemap: pass pagemap_addr by reference Passing a structure by value into a function is sometimes problematic, for a number of reasons. One of these is a warning from the 32-bit arm compiler: drivers/gpu/drm/drm_gpusvm.c: In function '__drm_gpusvm_unmap_pages': drivers/gpu/drm/drm_gpusvm.c:1152:33: note: parameter passing for argument of type 'struct drm_pagemap_addr' changed in GCC 9.1 1152 | dpagemap->ops->device_unmap(dpagemap, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1153 | dev, *addr); | ~~~~~~~~~~~ This particular problem is harmless since we are not mixing compiler versions inside of the compiler. However, passing this by reference avoids the warning along with providing slightly better calling conventions as it avoids an extra copy on the stack. Fixes: `75af93b3f5` ("drm/pagemap, drm/xe: Support destination migration over interconnect") Fixes: `2df55d9e66` ("drm/xe: Support pcie p2p dma as a fast interconnect") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20260216134644.1025365-1-arnd@kernel.org Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2026-02-17 13:10:52 +01:00
Maarten Lankhorst	08d05c7366	drm/xe: Remove xe_ggtt_node_allocated With the intermediate state gone, no longer useful. Just check against NULL where needed. After looking carefully, the check for allocated in xe_fb_pin.c is unneeded. vma->node is never NULL. The check is specifically only to check if vma->node == the bo's root tile ggtt_obj. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260206112108.1453809-12-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2026-02-16 14:20:24 +01:00
Maarten Lankhorst	a4eac88e31	drm/xe: Make xe_ggtt_node_insert return a node This extra step is easier to handle inside xe_ggtt.c and makes xe_ggtt_node_allocated a simple null check instead, as the intermediate state 'allocated but not inserted' is no longer used. Privatize xe_ggtt_node_fini() and init() as they're no longer used outside of xe_ggtt.c Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1 Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260206112108.1453809-11-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2026-02-16 14:20:24 +01:00
Maarten Lankhorst	95f5f9a96d	drm/xe: Move struct xe_ggtt to xe_ggtt.c No users left outside of xe_ggtt.c, so we can make the struct private. This prevents us from accidentally touching it before init. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260206112108.1453809-10-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2026-02-16 14:20:24 +01:00
Maarten Lankhorst	e904c56ba6	drm/xe: Rewrite GGTT VF initialization The previous code was using a complicated system with 2 balloons to set GGTT size and adjust GGTT offset. While it works, it's overly complicated. A better approach is to set the offset and size when initializing GGTT, this removes the need for adding balloons. The resize function only needs readjust ggtt->start to have GGTT at the new offset. This removes the need to manipulate the internals of xe_ggtt outside of xe_ggtt, and cleans up a lot of now unneeded code. Co-developed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260206112108.1453809-9-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2026-02-16 14:20:24 +01:00
Maarten Lankhorst	7feebdb041	drm/xe: Make xe_ggtt_node offset relative to starting offset Fix all functions that use node->start to use xe_ggtt_node_addr, and add ggtt->start to node->start. This will make node shifting for SR-IOV VF a one-liner, instead of manually changing each GGTT node's base address. Also convert some uses of mutex_lock/unlock to mutex guards. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260206112108.1453809-8-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2026-02-16 14:20:09 +01:00
Jani Nikula	4a175759e3	drm/xe: remove unnecessary struct dram_info forward declaration There's no longer any need for the struct dram_info forward declaration. Remove it. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260212131206.1804113-1-jani.nikula@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-13 15:01:43 -08:00
Matthew Brost	2405ba53ff	drm/xe: Avoid touching consumer fields in GuC pagefault ack The GuC pagefault acknowledgment code is designed to extract the fields needed for the acknowledgment from the producer-stored message so that the consumer fields can be overloaded to return additional information. The ASID is stored in the producer message; extract it from there to future‑proof this logic. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20260212204227.2764054-3-matthew.brost@intel.com	2026-02-13 12:03:47 -08:00
Matthew Brost	68be2bfe4b	drm/xe: Pack fault type and level into a u8 Pack the fault type and level fields into a single u8 to save space in struct xe_pagefault. This also makes future extensions easier. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patch.msgid.link/20260212204227.2764054-2-matthew.brost@intel.com	2026-02-13 12:03:39 -08:00
Arvind Yadav	2882094e0d	drm/xe/xe2: Apply Wa_14024997852 Applied Wa_14024997852 to Graphics version 20.01 to 20.04 Whitelist registers needed for userspace to control autostrip on xe2. v2: - set Bit 31 of FF_MODE, for TE autostrip disable (Nitin) v3: - Need to whitelist these for Xe2 IPs (MATT R) v4: - Combine these into a single range for simplicity:(2001, 3005) (MATT R) Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Arvind Yadav <arvind.yadav@intel.com> Link: https://patch.msgid.link/20260212065920.1815979-1-arvind.yadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-13 09:26:14 -08:00
Raag Jadav	c020fff70d	drm/xe/bo: Redirect faults to dummy page for wedged device As per uapi documentation[1], the prerequisite for wedged device is to redirect page faults to a dummy page. Follow it. [1] Documentation/gpu/drm-uapi.rst v2: Add uapi reference and fixes tag (Matthew Brost) Fixes: `7bc00751f8` ("drm/xe: Use device wedged event") Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260212055622.2054991-1-raag.jadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-13 09:21:22 -08:00
Piotr Piórkowski	aafbb42be5	drm/xe: Force EXEC_QUEUE_FLAG_KERNEL for kernel internal VMs VMs created without an associated xe_file originate from kernel contexts and should use kernel exec queues. Ensure such VMs create bind exec queues with EXEC_QUEUE_FLAG_KERNEL set. Let's ensure bind exec queues created for kernel VMs are always marked with EXEC_QUEUE_FLAG_KERNEL. Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260211171441.3246686-1-piotr.piorkowski@intel.com	2026-02-13 13:16:36 +01:00
Matt Roper	1ff4b1730c	drm/xe: Stop applying Wa_16018737384 from Xe3 onward Wa_16018737384 is one of the rare cases where the hardware teams mark a workaround as "driver change required" rather than "permanent/temporary workaround" in the internal workaround database, signifying that the implementation details of the workaround should just be considered standard programming instructions on all platforms going forward. Cases like this are the only time that using XE_RTP_END_VERSION_UNDEFINED as an upper bound for a workaround's IP range is warranted and correct. However in this specific case, the register bit in question (0xE4F0[1]) simply no longer exists in hardware from Xe3 onward. Trying to write to that bit on Xe3 or Xe3p platforms is harmless and just doesn't have any effect, but it's possible that the register bit could get repurposed to control something else down the road on future platforms. To avoid any surprises in the future we should replace the unbounded upper bound in our RTP table with a value that accurately reflects that Wa_16018737384 can only apply to Xe2 platforms. Bspec: 56849 Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patch.msgid.link/20260211234735.620087-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-12 16:19:18 -08:00
Matt Roper	b5b55d0932	drm/xe/xe3p_xpc: Add new XeCore fuse registers to VF runtime regs SRIOV VFs do not automatically have access to the XeCore fuse registers. Add the two new registers that show up on Xe3p_XPC to the runtime register list to grant VFs access. Since there's a single runtime register list for all Xe3p, this will technically also grant access on Xe3p_LPG platforms where the registers don't exist, but that should be harmless since even if a VF tries to read a non-existent register on those platforms it will just get back a sensible value of 0x0. Fixes: `e8100643ff` ("drm/xe/xe3p_xpc: XeCore mask spans four registers") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Ngai-Mint Kwan <ngai-mint.kwan@linux.intel.com> Link: https://patch.msgid.link/20260210182519.206952-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-12 11:01:41 -08:00
Raag Jadav	6d83ef1ada	drm/xe: Update xe_device_declare_wedged() error log Since the introduction of the DRM wedged event, there are now a few different procedures to recover the device depending on the selected recovery method. Update the error log to reflect this and point the user to the correct documentation for it. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260205113424.1629204-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-02-12 08:55:57 -05:00
Thomas Hellström	1a3c0049b3	Revert "drm/pagemap: Disable device-to-device migration" With commit `a69d1ab971` ("mm: Fix a hmm_range_fault() livelock / starvation problem") device-to-device migration is now functional again and the disabling can be reverted. Add the above commit as a Fixes: tag in order for the revert to not take place unless that commit is present. This reverts commit `10dd1eaa80`. Cc: Matthew Brost <matthew.brost@intel.com> Fixes: `a69d1ab971` ("mm: Fix a hmm_range_fault() livelock / starvation problem") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260211104159.114947-1-thomas.hellstrom@linux.intel.com	2026-02-12 11:12:40 +01:00
Shuicheng Lin	25c9aa4dcb	drm/xe: Make xe_modparam.force_vram_bar_size signed vram_bar_size is registered as an int module parameter and is documented to accept negative values to disable BAR resizing. Store it as an int in xe_modparam as well, so negative values work as intended and the module_param type matches. Fixes: `80742a1aa2` ("drm/xe: Allow to drop vram resizing") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260202181853.1095736-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-11 11:00:09 -08:00
Piotr Piórkowski	0bcacf56dc	drm/xe/vf: Avoid reading media version when media GT is disabled When the media GT is not allowed, a VF must not attempt to read the media version from the GuC. The GuC may not be loaded, and any attempt to communicate with it would result in a timeout and a VF probe failure: (...) [ 1912.406046] xe 0000:01:00.1: [drm] ERROR Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507 [ 1912.407277] xe 0000:01:00.1: [drm] ERROR Tile0: GT1: [GUC COMMUNICATION] MMIO send failed (-ETIMEDOUT) [ 1912.408689] xe 0000:01:00.1: [drm] ERROR VF: Tile0: GT1: Failed to reset GuC state (-ETIMEDOUT) [ 1912.413986] xe 0000:01:00.1: probe with driver xe failed with error -110 Let's skip reading the media version for VFs when the media GT is not allowed. v2: move the condition directly to the VF path Fixes: `7abd69278b` ("drm/xe/configfs: Add attribute to disable GT types") Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260202115041.2863357-1-piotr.piorkowski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>	2026-02-11 11:26:59 +01:00
Thomas Hellström	a69d1ab971	mm: Fix a hmm_range_fault() livelock / starvation problem If hmm_range_fault() fails a folio_trylock() in do_swap_page(), trying to acquire the lock of a device-private folio for migration to RAM, the function will spin until it succeeds grabbing the lock. However, if the process holding the lock is depending on a work item to be completed, which is scheduled on the same CPU as the spinning hmm_range_fault(), that work item might be starved and we end up in a livelock / starvation situation which is never resolved. This can happen, for example, if the process holding the device-private folio lock is stuck in migrate_device_unmap()->lru_add_drain_all(), since lru_add_drain_all() requires a short work-item to be run on all online cpus to complete. A prerequisite for this to happen is: a) Both zone device and system memory folios are considered in migrate_device_unmap(), so that there is a reason to call lru_add_drain_all() for a system memory folio while a folio lock is held on a zone device folio. b) The zone device folio has an initial mapcount > 1 which causes at least one migration PTE entry insertion to be deferred to try_to_migrate(), which can happen after the call to lru_add_drain_all(). c) No or voluntary only preemption. This all seems pretty unlikely to happen, but indeed is hit by the "xe_exec_system_allocator" igt test. Resolve this by waiting for the folio to be unlocked if the folio_trylock() fails in do_swap_page(). Rename migration_entry_wait_on_locked() to softleaf_entry_wait_unlock() and update its documentation to indicate the new use-case. Future code improvements might consider moving the lru_add_drain_all() call in migrate_device_unmap() to be called after all pages have migration entries inserted. That would eliminate also b) above. 
v2: - Instead of a cond_resched() in hmm_range_fault(), eliminate the problem by waiting for the folio to be unlocked in do_swap_page() (Alistair Popple, Andrew Morton) v3: - Add a stub migration_entry_wait_on_locked() for the !CONFIG_MIGRATION case. (Kernel Test Robot) v4: - Rename migration_entry_wait_on_locked() to softleaf_entry_wait_on_locked() and update docs (Alistair Popple) v5: - Add a WARN_ON_ONCE() for the !CONFIG_MIGRATION version of softleaf_entry_wait_on_locked(). - Modify wording around function names in the commit message (Andrew Morton) Suggested-by: Alistair Popple <apopple@nvidia.com> Fixes: `1afaeb8293` ("mm/migrate: Trylock device page in do_swap_page") Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Leon Romanovsky <leon@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: linux-mm@kvack.org Cc: <dri-devel@lists.freedesktop.org> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.15+ Reviewed-by: John Hubbard <jhubbard@nvidia.com> #v3 Reviewed-by: Alistair Popple <apopple@nvidia.com> Link: https://patch.msgid.link/20260210115653.92413-1-thomas.hellstrom@linux.intel.com	2026-02-11 11:03:01 +01:00
Maciej Patelczyk	d287dee565	drm/gpusvm: Fix unbalanced unlock in drm_gpusvm_scan_mm() There is an unbalanced lock/unlock on the gpusvm notifier lock: [ 931.045868] ===================================== [ 931.046509] WARNING: bad unlock balance detected! [ 931.047149] 6.19.0-rc6+xe-**************** #9 Tainted: G U [ 931.048150] ------------------------------------- [ 931.048790] kworker/u5:0/51 is trying to release lock (&gpusvm->notifier_lock) at: [ 931.049801] [<ffffffffa090c0d8>] drm_gpusvm_scan_mm+0x188/0x460 [drm_gpusvm_helper] [ 931.050802] but there are no more locks to release! [ 931.051463] The drm_gpusvm_notifier_unlock() call sits under the err_free label, and the first jump to err_free happens just before drm_gpusvm_notifier_lock() is called, causing the unbalanced unlock. Fixes: `f1d08a5864` ("drm/gpusvm: Introduce a function to scan the current migration state") Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260209123433.1271053-1-maciej.patelczyk@intel.com	2026-02-10 10:52:49 -08:00
Matt Roper	e04c609eed	drm/xe/xe2_hpg: Fix handling of Wa_14019988906 & Wa_14019877138 The PSS_CHICKEN register has been part of the RCS engine's LRC since it was first introduced in Xe_LP. That means that any workarounds that adjust its value (such as Wa_14019988906 and Wa_14019877138) need to be implemented in the lrc_was[] table so that they become part of the default LRC from which all subsequent LRCs are copied. Although these workarounds were implemented correctly on most platforms, they were incorrectly placed on the engine_was[] table for Xe2_HPG. Move the workarounds to the proper lrc_was[] table and switch the 'xe_rtp_match_first_render_or_compute' rule to specifically match the RCS since that's the engine whose LRC manages the register. Bspec: 65182 Fixes: `7f3ee7d880` ("drm/xe/xe2hpg: Add initial GT workarounds") Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Link: https://patch.msgid.link/20260205220508.51905-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2026-02-10 07:41:48 -08:00
Gustavo Sousa	d2e0540a62	drm/xe/nvlp: Bump maximum WOPCM size On NVL-P, the primary GT's WOPCM gained an extra 8MiB for the Memory URB. As such, we need to bump the maximum size in the driver so that the driver is able to load without erroring out thinking that the WOPCM is too small. FIXME: The wopcm code in xe driver is a bit confusing. For the case where the offsets for GUC WOPCM are already locked, it appears we are using the maximum overall WOPCM size instead of the sizes relative to each type of GT. The function __check_layout() should be checking against the latter. Bspec: 67090 Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-15-636e1ad32688@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>	2026-02-10 10:09:25 -03:00
Matt Roper	d59d94f91f	drm/i915/nvlp: Hook up display support Although NVL-S and NVL-P are quite different on the GT side, they use identical Xe3p_LPD display IP and should take all the same codepaths. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-14-636e1ad32688@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>	2026-02-10 10:09:23 -03:00
Dnyaneshwar Bhadane	b9006dacb8	drm/xe/nvlp: Attach MOCS table for nvlp The MOCS table for NVL-P is the same as for Xe2/Xe3 platforms. Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-13-636e1ad32688@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>	2026-02-10 10:09:21 -03:00

1413081 Commits