Don't allow new CPU mmaps to BOs marked DONTNEED or PURGED.
DONTNEED BOs can have their contents discarded at any time, making
CPU access undefined behavior. PURGED BOs have no backing store and
are permanently invalid.
Return -EBUSY for DONTNEED BOs (temporary purgeable state) and
-EINVAL for purged BOs (permanent, no backing store).
The mmap offset ioctl now checks the BO's purgeable state before
allowing userspace to establish a new CPU mapping. This prevents
the race where userspace gets a valid offset but the BO is purged
before actual faulting begins.
Existing mmaps (established before DONTNEED) may still work until
pages are purged, at which point CPU faults fail with SIGBUS.
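As a rough standalone sketch of the decision described above (the enum
and helper name here are illustrative, not the driver's actual code):

  #include <errno.h>

  enum purgeable_state { WILLNEED, DONTNEED, PURGED };

  /* Hypothetical check mirroring the described mmap offset behaviour. */
  int mmap_offset_allowed(enum purgeable_state state)
  {
      if (state == DONTNEED)
          return -EBUSY;   /* temporary: contents may be discarded */
      if (state == PURGED)
          return -EINVAL;  /* permanent: backing store is gone */
      return 0;            /* WILLNEED: offset may be handed out */
  }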
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-9-arvind.yadav@intel.com
Track purgeable state per-VMA instead of using a coarse shared
BO check. This prevents shared BOs from being purged until all
VMAs across all VMs are marked DONTNEED.
Add xe_bo_all_vmas_dontneed() to check all VMAs before marking
a BO purgeable. Add xe_bo_recheck_purgeable_on_vma_unbind() to
handle state transitions when VMAs are destroyed - if all
remaining VMAs are DONTNEED the BO can become purgeable, and if
no VMAs remain the BO keeps its current state.
The per-VMA purgeable_state field stores the madvise hint for
each mapping. Shared BOs can only be purged when all VMAs
unanimously indicate DONTNEED.
This prevents the bug where unmapping the last VMA would incorrectly
flip a DONTNEED BO back to WILLNEED. The enum-based state check
preserves BO state when no VMAs remain, only updating when VMAs provide
explicit hints.
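A simplified standalone model of the unanimity rule above; the list
structure and names are illustrative rather than the driver's:

  #include <stdbool.h>
  #include <stddef.h>

  enum purgeable_state { WILLNEED, DONTNEED, PURGED };

  struct vma_model {
      enum purgeable_state purgeable_state;  /* per-VMA madvise hint */
      struct vma_model *next;
  };

  /* A shared BO may only become purgeable if every mapping agrees;
   * callers handle the no-VMA case separately (state is preserved). */
  bool all_vmas_dontneed(const struct vma_model *vmas)
  {
      for (const struct vma_model *v = vmas; v; v = v->next)
          if (v->purgeable_state != DONTNEED)
              return false;
      return true;
  }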
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-7-arvind.yadav@intel.com
Add purge checking to vma_lock_and_validate() to block new mapping
operations on purged BOs while allowing cleanup operations to proceed.
Purged BOs have their backing pages freed by the kernel. New
mapping operations (MAP, PREFETCH, REMAP) must be rejected with
-EINVAL to prevent GPU access to invalid memory. Cleanup
operations (UNMAP) must be allowed so applications can release
resources after detecting purge via the retained field.
REMAP operations require mixed handling - reject new prev/next
VMAs if the BO is purged, but allow the unmap portion to proceed
for cleanup.
The check_purged flag in struct xe_vma_lock_and_validate_flags
distinguishes between these cases: true for new mappings (must reject),
false for cleanup (allow).
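Condensed into a small standalone sketch (flag and function names are
illustrative):

  #include <errno.h>
  #include <stdbool.h>

  /* New mappings (MAP, PREFETCH, new REMAP prev/next vmas) pass
   * check_purged = true; cleanup (UNMAP, the unmap half of REMAP)
   * passes false and is always allowed. */
  int validate_against_purge(bool bo_purged, bool check_purged)
  {
      if (check_purged && bo_purged)
          return -EINVAL;
      return 0;
  }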
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-6-arvind.yadav@intel.com
Block CPU page faults to buffer objects marked as purgeable (DONTNEED)
or already purged. Once a BO is marked DONTNEED, its contents can be
discarded by the kernel at any time, making access undefined behavior.
Return VM_FAULT_SIGBUS immediately to fail consistently instead of
allowing erratic behavior where access sometimes works (if not yet
purged) and sometimes fails (if purged).
For DONTNEED BOs:
- Block new CPU faults with SIGBUS to prevent undefined behavior.
- Existing CPU PTEs may still work until TLB flush, but new faults
fail immediately.
For PURGED BOs:
- Backing store has been reclaimed, making CPU access invalid.
- Without this check, accessing existing mmap mappings would trigger
xe_bo_fault_migrate() on freed backing store, causing kernel hangs
or crashes.
The purgeable check is added to both CPU fault paths:
- Fastpath (xe_bo_cpu_fault_fastpath): Returns VM_FAULT_SIGBUS
immediately under dma-resv lock, preventing attempts to
migrate/validate DONTNEED/purged pages.
- Slowpath (xe_bo_cpu_fault): Returns -EFAULT under drm_exec lock,
converted to VM_FAULT_SIGBUS.
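The predicate both paths rely on can be modelled in a few lines of
standalone C (names illustrative):

  #include <errno.h>

  enum purgeable_state { WILLNEED, DONTNEED, PURGED };

  /* Fastpath returns VM_FAULT_SIGBUS directly; slowpath returns
   * -EFAULT, which its caller turns into VM_FAULT_SIGBUS. */
  int cpu_fault_purgeable_check(enum purgeable_state state)
  {
      return state == WILLNEED ? 0 : -EFAULT;
  }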
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-5-arvind.yadav@intel.com
Add the core implementation for purgeable buffer objects, enabling
memory reclamation of user-designated DONTNEED buffers during eviction.
This allows userspace applications to provide memory usage hints to
the kernel for better memory management under pressure.
This patch implements the purge operation and state machine transitions:
Purgeable States (from xe_madv_purgeable_state):
- WILLNEED (0): BO should be retained, actively used
- DONTNEED (1): BO eligible for purging, not currently needed
- PURGED (2): BO backing store reclaimed, permanently invalid
Design Rationale:
- Async TLB invalidation via trigger_rebind (no blocking
xe_vm_invalidate_vma)
- i915 compatibility: retained field, "once purged always purged"
semantics
- Shared BO protection prevents multi-process memory corruption
- Scratch PTE reuse avoids new infrastructure, safe for fault mode
Note: The madvise_purgeable() function is implemented but not hooked
into the IOCTL handler (madvise_funcs[] entry is NULL) to maintain
bisectability. The feature will be enabled in the final patch when all
supporting infrastructure (shrinker, per-VMA tracking) is complete.
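The state machine above, condensed into a standalone model (the enum
values follow the text; struct and function names are illustrative):

  #include <stdbool.h>

  enum purgeable_state { WILLNEED = 0, DONTNEED = 1, PURGED = 2 };

  struct bo_model {
      enum purgeable_state state;
      bool retained;  /* i915-style: false once backing store is lost */
  };

  /* Apply a WILLNEED/DONTNEED hint; PURGED is terminal
   * ("once purged, always purged"). */
  void apply_hint(struct bo_model *bo, enum purgeable_state hint)
  {
      if (bo->state == PURGED) {
          bo->retained = false;
          return;
      }
      bo->state = hint;
      bo->retained = true;
  }

  /* Eviction under memory pressure may reclaim only DONTNEED BOs. */
  void purge_on_eviction(struct bo_model *bo)
  {
      if (bo->state == DONTNEED) {
          bo->state = PURGED;
          bo->retained = false;
      }
  }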
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-4-arvind.yadav@intel.com
Add infrastructure for tracking purgeable state of buffer objects.
This includes:
Introduce enum xe_madv_purgeable_state with three states:
- XE_MADV_PURGEABLE_WILLNEED (0): BO is needed and should not be
purged. This is the default state for all BOs.
- XE_MADV_PURGEABLE_DONTNEED (1): BO is not currently needed and
can be purged by the kernel under memory pressure to reclaim
resources. Only non-shared BOs can be marked as DONTNEED.
- XE_MADV_PURGEABLE_PURGED (2): BO has been purged by the kernel.
Accessing a purged BO results in error. Follows i915 semantics
where once purged, the BO remains permanently invalid ("once
purged, always purged").
Add a madv_purgeable field to struct xe_bo to track the purgeable
state across concurrent access paths.
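Sketched declarations matching the description; the real header may
differ in layout and documentation:

  enum xe_madv_purgeable_state {
      XE_MADV_PURGEABLE_WILLNEED = 0,
      XE_MADV_PURGEABLE_DONTNEED = 1,
      XE_MADV_PURGEABLE_PURGED = 2,
  };

  struct xe_bo {
      /* ... existing members ... */
      enum xe_madv_purgeable_state madv_purgeable;
  };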
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-3-arvind.yadav@intel.com
Extend the DRM_XE_MADVISE ioctl to support purgeable buffer object
management by adding DRM_XE_VMA_ATTR_PURGEABLE_STATE attribute type.
This allows userspace applications to provide memory usage hints to
the kernel for better memory management under pressure:
- WILLNEED: Buffer is needed and should not be purged. If the BO was
previously purged, the retained field returns 0, indicating the
backing store was lost (once purged, always purged semantics matching
i915).
- DONTNEED: Buffer is not currently needed and may be purged by the
kernel under memory pressure to free resources. Only applies to
non-shared BOs.
To prevent undefined behavior, the following operations are blocked
while a BO is in DONTNEED state:
- New mmap() operations return -EBUSY
- VM_BIND operations return -EBUSY
- New dma-buf exports return -EBUSY
- CPU page faults return SIGBUS
- GPU page faults fail with -EACCES
This ensures applications cannot use a BO while marked as DONTNEED,
preventing erratic behavior when the kernel purges the backing store.
The implementation includes a 'retained' output field (matching i915's
drm_i915_gem_madvise.retained) that indicates whether the BO's backing
store still exists (1) or has been purged (0).
Add a DRM_XE_QUERY_CONFIG_FLAG_HAS_PURGING_SUPPORT flag to allow
userspace to detect kernel support for purgeable buffer objects
before attempting to use the feature.
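The blocked-operation matrix for DONTNEED BOs listed above, condensed
into a standalone table (the enum of operations is illustrative; the
error values come from the text):

  #include <errno.h>

  enum blocked_op { OP_MMAP, OP_VM_BIND, OP_DMABUF_EXPORT, OP_GPU_FAULT };

  /* CPU faults are not errno-based; they raise SIGBUS instead. */
  int dontneed_errno(enum blocked_op op)
  {
      switch (op) {
      case OP_MMAP:
      case OP_VM_BIND:
      case OP_DMABUF_EXPORT:
          return -EBUSY;
      case OP_GPU_FAULT:
          return -EACCES;
      }
      return 0;
  }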
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-2-arvind.yadav@intel.com
Fix include guard macros that don't match their respective file names:
- xe_gt_idle_types.h: _XE_GT_IDLE_SYSFS_TYPES_H_ -> _XE_GT_IDLE_TYPES_H_
- xe_guc_exec_queue_types.h: _XE_GUC_ENGINE_TYPES_H_ -> _XE_GUC_EXEC_QUEUE_TYPES_H_
- xe_heci_gsc.h: __XE_HECI_GSC_DEV_H__ -> _XE_HECI_GSC_H_
- xe_hw_engine_class_sysfs.h: _XE_ENGINE_CLASS_SYSFS_H_ -> _XE_HW_ENGINE_CLASS_SYSFS_H_
- xe_late_bind_fw_types.h: _XE_LATE_BIND_TYPES_H_ -> _XE_LATE_BIND_FW_TYPES_H_
- xe_platform_types.h: _XE_PLATFORM_INFO_TYPES_H_ -> _XE_PLATFORM_TYPES_H_
- xe_tile_printk.h: _xe_tile_printk_H_ -> _XE_TILE_PRINTK_H_
These guards appear to be leftovers from file renames or copy-paste
errors. Correcting them to follow the standard convention of matching
the file name prevents potential include guard collisions.
No functional change expected.
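For illustration, the corrected guard in xe_gt_idle_types.h now mirrors
the file name:

  /* xe_gt_idle_types.h */
  #ifndef _XE_GT_IDLE_TYPES_H_
  #define _XE_GT_IDLE_TYPES_H_
  /* ... type definitions ... */
  #endif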
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Nitin Gote <nitin.r.gote@intel.com>
Link: https://patch.msgid.link/20260316160451.1688247-2-shuicheng.lin@intel.com
Async work (e.g., GuC queue teardowns) can call ggtt_node_remove, so the
operation must be performed under the GGTT lock to ensure the GGTT
online check remains stable. GGTT insertion and removal are heavyweight
operations (e.g., queue create/destroy), so the additional serialization
cost is negligible compared to ensuring correctness.
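A simplified standalone model of the serialization described above; a
pthread mutex stands in for the GGTT lock and the names are
illustrative:

  #include <pthread.h>
  #include <stdbool.h>

  struct ggtt_model {
      pthread_mutex_t lock;  /* stands in for the GGTT lock */
      bool online;           /* only stable while 'lock' is held */
  };

  /* Async teardown can land here; the online check and the removal
   * must be observed atomically, so both happen under the lock. */
  void ggtt_node_remove_model(struct ggtt_model *ggtt)
  {
      pthread_mutex_lock(&ggtt->lock);
      if (ggtt->online) {
          /* ... clear the node's PTEs / invalidate ... */
      }
      pthread_mutex_unlock(&ggtt->lock);
  }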
Fixes: 4f3a998a17 ("drm/xe: Open-code GGTT MMIO access protection")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20260326011207.62373-1-matthew.brost@intel.com
Add support for userspace to request a list of observed faults
from a specified VM.
v2:
- Only allow querying of failed pagefaults (Matt Brost)
v3:
- Remove unnecessary size parameter from helper function, as it
is a property of the arguments. (jcavitt)
- Remove unnecessary copy_from_user (Jianxun)
- Set address_precision to 1 (Jianxun)
- Report max size instead of dynamic size for memory allocation
purposes. Total memory usage is reported separately.
v4:
- Return int from xe_vm_get_property_size (Shuicheng)
- Fix memory leak (Shuicheng)
- Remove unnecessary size variable (jcavitt)
v5:
- Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)
- Update fill_property_pfs to eliminate need for kzalloc (Jianxun)
v6:
- Repair and move fill_faults break condition (Dan Carpenter)
- Free vm after use (jcavitt)
- Combine assertions (jcavitt)
- Expand size check in xe_vm_get_faults_ioctl (jcavitt)
- Remove return mask from fill_faults, as return is already -EFAULT or 0
(jcavitt)
v7:
- Revert back to using xe_vm_get_property_ioctl
- Apply better copy_to_user logic (jcavitt)
v8:
- Fix and clean up error value handling in ioctl (jcavitt)
- Reapply return mask for fill_faults (jcavitt)
v9:
- Future-proof size logic for zero-size properties (jcavitt)
- Add access and fault types (Jianxun)
- Remove address type (Jianxun)
v10:
- Remove unnecessary switch case logic (Raag)
- Compress size get, size validation, and property fill functions into a
single helper function (jcavitt)
- Assert valid size (jcavitt)
v11:
- Remove unnecessary else condition
- Correct backwards helper function size logic (jcavitt)
v12:
- Use size_t instead of int (Raag)
v13:
- Remove engine class and instance (Ivan)
v14:
- Map access type, fault type, and fault level to user macros (Matt
Brost, Ivan)
v15:
- Remove unnecessary size assertion (jcavitt)
v16:
- Nit fixes (Matt Brost)
v17:
- Rebase and refactor (jcavitt)
v18:
- Do not copy_to_user in critical section (Matt Brost)
- Assert args->size is multiple of sizeof(struct xe_vm_fault) (Matt
Brost)
v19:
- Remove unnecessary memset (Matt Brost)
v20:
- Report canonicalized address (Jose)
- Mask out prefetch data from access type (Jose, jcavitt)
v21:
- s/uAPI/Link in the commit log links
- Align debug parameters
Link: https://github.com/intel/compute-runtime/pull/878
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Cc: Jianxun Zhang <jianxun.zhang@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-10-jonathan.cavitt@intel.com
Add additional information to each VM so it can report up to the first
50 faults seen. Only pagefaults are saved this way currently, though in
the future all faults should be tracked by the VM for reporting.
Additionally, of the pagefaults reported, only failed pagefaults are
saved this way, as successful pagefaults should recover silently and not
need to be reported to userspace.
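A standalone sketch of the bounded, first-N bookkeeping described above
(field names are illustrative; the 50-entry cap and the
MAX_FAULTS_SAVED_PER_VM name come from the notes below):

  #include <stdbool.h>
  #include <stddef.h>

  #define MAX_FAULTS_SAVED_PER_VM 50

  struct fault_entry {
      unsigned long long address;
      unsigned int access_type;
      unsigned int fault_type;
  };

  struct vm_model {
      struct fault_entry faults[MAX_FAULTS_SAVED_PER_VM];
      size_t nfaults;
  };

  /* Only failed faults are recorded, and only until the buffer is
   * full, so the first faults observed are the ones preserved. */
  bool save_failed_fault(struct vm_model *vm, struct fault_entry fe,
                         bool fault_failed)
  {
      if (!fault_failed || vm->nfaults >= MAX_FAULTS_SAVED_PER_VM)
          return false;
      vm->faults[vm->nfaults++] = fe;
      return true;
  }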
v2:
- Free vm after use (Shuicheng)
- Compress pf copy logic (Shuicheng)
- Update fault_unsuccessful before storing (Shuicheng)
- Fix old struct name in comments (Shuicheng)
- Keep first 50 pagefaults instead of last 50 (Jianxun)
v3:
- Avoid unnecessary execution by checking MAX_PFS earlier (jcavitt)
- Fix double-locking error (jcavitt)
- Assert kmemdup is successful (Shuicheng)
v4:
- Rename xe_vm.pfs to xe_vm.faults (jcavitt)
- Store fault data and not pagefault in xe_vm faults list (jcavitt)
- Store address, address type, and address precision per fault (jcavitt)
- Store engine class and instance data per fault (Jianxun)
- Add and fix kernel docs (Michal W)
- Properly handle kzalloc error (Michal W)
- s/MAX_PFS/MAX_FAULTS_SAVED_PER_VM (Michal W)
- Store fault level per fault (Michal M)
v5:
- Store fault and access type instead of address type (Jianxun)
v6:
- Store pagefaults in non-fault-mode VMs as well (Jianxun)
v7:
- Fix kernel docs and comments (Michal W)
v8:
- Fix double-locking issue (Jianxun)
v9:
- Do not report faults from reserved engines (Jianxun)
v10:
- Remove engine class and instance (Ivan)
v11:
- Perform kzalloc outside of lock (Auld)
v12:
- Fix xe_vm_fault_entry kernel docs (Shuicheng)
v13:
- Rebase and refactor (jcavitt)
v14:
- Correctly ignore fault mode in save_pagefault_to_vm (jcavitt)
v15:
- s/save_pagefault_to_vm/xe_pagefault_save_to_vm (Matt Brost)
- Use guard instead of spin_lock/unlock (Matt Brost)
- GT was added to xe_pagefault struct. Use xe_gt_hw_engine
instead of creating a new helper function (Matt Brost)
v16:
- Set address precision programmatically (Matt Brost)
v17:
- Set address precision to fixed value (Matt Brost)
v18:
- s/uAPI/Link in commit log links
- Use kzalloc_obj
Link: https://github.com/intel/compute-runtime/pull/878
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Jianxun Zhang <jianxun.zhang@intel.com>
Cc: Michal Wajdeczko <Michal.Wajdeczko@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-9-jonathan.cavitt@intel.com
During a 3D workload, a user reported hitting:
[ 413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925
[ 413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy)
[ 413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe]
[ 413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282
[ 413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000
[ 413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000
[ 413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380
[ 413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380
[ 413.362083] FS: 00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000
[ 413.362085] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0
[ 413.362088] PKRU: 55555554
[ 413.362089] Call Trace:
[ 413.362092] <TASK>
[ 413.362096] xe_vm_bind_ioctl+0xa9a/0xc60 [xe]
This seems to hint that the vma we are re-inserting for the ops unwind
is either invalid or overlapping with something already inserted in the
vm. It shouldn't be invalid since this is a re-insertion, so it must
have worked before. That leaves the likely culprit as something already
placed where we want to insert the vma.
Following from that, for the case where we do something like a rebind
in the middle of a vma, where one or both mapped ends are already
compatible, we skip the rebind of those vmas and set next/prev to NULL.
We also then adjust the original unmap va range, to avoid unmapping the
ends.
However, if we trigger the unwind path, we end up with three va, with
the two ends never being removed and the original va range in the middle
still being the shrunken size.
If this occurs, one failure mode is when another unwind op needs to
interact with that range, which can happen with a vector of binds. For
example, if we need to re-insert something in place of the original va.
In this case the va is still the shrunken version, so when removing it
and then doing a re-insert it can overlap with the ends, which were
never removed, triggering a warning like above, plus leaving the vm in a
bad state.
With that, we need two things here:
1) Stop nuking the prev/next tracking for the skip cases. Instead
relying on checking for skip prev/next, where needed. That way on the
unwind path, we now correctly remove both ends.
2) Undo the unmap va shrinkage, on the unwind path. With the two ends
now removed the unmap va should expand back to the original size again,
before re-insertion.
v2:
- Update the explanation in the commit message, based on an actual IGT of
triggering this issue, rather than conjecture.
- Also undo the unmap shrinkage, for the skip case. With the two ends
now removed, the original unmap va range should expand back to the
original range.
v3:
- Track the old start/range separately. vma_size/start() uses the va
info directly.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602
Fixes: 8f33b4f054 ("drm/xe: Avoid doing rebinds")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com
Now that we have implemented all the related missing bits we can enable
the AuxCCS compressed modifiers which were disabled in
cf48bddd31 ("drm/i915/display: Disable AuxCCS framebuffers if built for Xe").
Tested with KDE Wayland, on Lenovo Carbon X1 ADL-P:
[PLANE:32:plane 1A]: type=PRI
uapi: [FB:242] AR30 little-endian (0x30335241),0x100000000000008,2880x1800, visible=visible, src=28
hw: [FB:242] AR30 little-endian (0x30335241),0x100000000000008,2880x1800, visible=yes, src=2880.000
Display is working fine - no artefacts, no DMAR/PIPE faults.
v2:
* Adjust patch title. (Rodrigo)
v3:
* Complete rewrite based on the display parent interface.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
References: cf48bddd31 ("drm/i915/display: Disable AuxCCS framebuffers if built for Xe")
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-13-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The fallback case for the DPT backing store is a buffer object in
system memory, which by default uses a write-back CPU caching policy.
If this fallback gets triggered, and since there is currently no flushing,
the DPT writes made when pinning a buffer to display are not guaranteed to
be seen by the display engine.
To fix this, since both the local memory and the stolen memory DPT
placements already use write-combine, let us make the system memory option
follow suit by passing down the appropriate flag.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324084018.20353-3-tvrtko.ursulin@igalia.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Our xe-vfio-pci component relies on confirmation from the PF that
VF FLR processing has finished, but due to the notification latency
on the HW/FW side, the PF might not yet be aware of an already
triggered VF FLR.
Update the VF state machine with a new FLR_PREPARE state that indicates
an imminent VF FLR notification and treat it as the beginning of the
FLR sequence. Also introduce a function that xe-vfio-pci should call to
guarantee correct synchronization.
v2: move PREPARE into WIP, update commit msg (Michal)
Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Co-developed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260309152449.910636-2-piotr.piorkowski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
When set, starting with xe3p_lpg, the L2 flush optimization
feature controls whether L2 is in Persistent or Transient mode
by monitoring media activity.
To enable the L2 flush optimization, include the new feature flag
GUC_CTL_ENABLE_L2FLUSH_OPT for Novalake platforms when a media
type is detected.
Tighten UAPI validation to restrict userptr, svm and
dmabuf mappings to be either 2WAY or XA+1WAY
V5(Thomas): logic correction
V4(MattA): Modify uapi doc and commit
V3(MattA): check valid op and pat_index value
V2(MattA): validate dma-buf bos and madvise pat-index
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Carl Zhang <carl.zhang@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-9-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Defining 2way (two-way coherency) is critical for
Xe3p_LPG (Nova Lake P) platforms to support the L2 flush
optimization safely.
This mode allows the driver to skip certain manual cache
flushes (the L2 flush optimization) without risking memory
corruption, because the hardware ensures the most recent
data is visible to both the CPU and the GPU.
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-8-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
XA, a new pat_index introduced post xe3p_lpg, marks memory shared
between the CPU and GPU that is treated differently from other GPU
memory when the Media engine is power-gated.
XA is *always* flushed, for example at end-of-submission (and maybe
other places). Internally, as an optimisation, hw doesn't need to make
that a full flush (which would also include XA) when Media is
off/powergated: it then only needs to worry about CPU vs GPU coherency
rather than GT caches vs Media coherency, so it can make that flush a
targeted XA flush, since anything tagged with XA is by definition
shared with the CPU. The main implication is that we now need to
somehow flush non-XA before freeing system memory pages, otherwise
dirty cachelines could be flushed after the free (like if Media
suddenly turns on and does a full flush).
V4: Add comments for L2 flush path
V3(Thomas/MattA/MattR): Restrict userptr with non-xa, then no need to
flush manually
V2(MattA): Expand commit description
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260305121902.1892593-7-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
There is a small risk that when fetching a NULL context image the
VF may get a tweaked context image prepared by another VF that was
previously running on the engine before the GuC scheduler switched
the VFs.
To avoid that risk, without forcing the GuC scheduler to trigger a
costly engine reset on every VF switch, use a watchdog mechanism that,
when configured with an impossible condition, triggers an interrupt
which the GuC handles by doing an engine reset. Also adjust the job
size to account for the additional dwords of the watchdog setup.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-4-michal.wajdeczko@intel.com
GCC and clang warn (or error with CONFIG_WERROR=y / W=e) several times
when targeting 32-bit platforms along the lines of
drivers/gpu/drm/xe/xe_lrc.c: In function 'dump_mi_command':
drivers/gpu/drm/xe/xe_lrc.c:1921:40: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'int' [-Werror=format=]
1921 | drm_printf(p, "LRC[%#5lx] = [%#010x] MI_NOOP (%d dwords)\n",
| ~~~~^
| |
| long unsigned int
| %#5x
1922 | dw - num_noop - start, inst_header, num_noop);
| ~~~~~~~~~~~~~~~~~~~~~
| |
| int
drivers/gpu/drm/xe/xe_lrc.c:1922:7: error: format specifies type 'unsigned long' but the argument has type '__ptrdiff_t' (aka 'int') [-Werror,-Wformat]
1921 | drm_printf(p, "LRC[%#5lx] = [%#010x] MI_NOOP (%d dwords)\n",
| ~~~~~
| %#5tx
1922 | dw - num_noop - start, inst_header, num_noop);
| ^~~~~~~~~~~~~~~~~~~~~
Use the '%tx' specifier for printing pointer differences, which clears
up the warnings for 32-bit platforms while introducing no regressions
for 64-bit platforms.
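For reference, 't' is the standard C length modifier for ptrdiff_t, so
the same format string works on 32-bit and 64-bit targets; a minimal
standalone example:

  #include <stdio.h>
  #include <stddef.h>

  int main(void)
  {
      int buf[8];
      ptrdiff_t diff = &buf[5] - &buf[0];

      /* '%#tx' prints a ptrdiff_t in hex regardless of target width */
      printf("offset = %#tx\n", diff);
      return 0;
  }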
Fixes: 65fcf19cb3 ("drm/xe: Include running dword offset in default_lrc dumps")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260316-drm-xe-fix-32-bit-wformat-ptrdiff-v1-1-0108b10b2b6b@kernel.org
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Wa_14026781792 applies to all graphics versions from 30.00
through 35.10 (inclusive). Since there are no IPs between
30.05 and 35.10, consolidate the RTP rules into a single
GRAPHICS_VERSION_RANGE(3000, 3510).
v2: (Matt)
- There are no IPs between 30.05 and 35.10 either,
So, consolidate this into a single GRAPHICS_VERSION_RANGE(3000, 3510)
- Also move it up to the top part of the table
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260317080059.1275116-2-nitin.r.gote@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Getting the engine-specific CTX TIMESTAMP register can fail. In that
case, if the context is active, new_ts is uninitialized. Fix that by
initializing new_ts to the last value that was sampled in SW -
lrc->ctx_timestamp.
Flagged by static analysis.
v2: Fix new_ts initialization (Ashutosh)
Fixes: bb63e7257e ("drm/xe: Avoid toggling schedule state to check LRC timestamp in TDR")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20260312125308.3126607-2-umesh.nerlige.ramappa@intel.com