Fix this warning while building the documentation:
Documentation/gpu/xe/xe_configfs:9: drivers/gpu/drm/xe/xe_configfs.c:138:
WARNING: Definition list ends without a blank line; unexpected unindent.
That also makes it better formatted in the output.
While at it, also fix the underline length in "Overview".
Fixes: e2b33fce5e ("drm/xe/configfs: Improve documentation steps")
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://lore.kernel.org/r/20250911-wa-bb-cmds-v4-2-c8f7e48f7eae@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Refactor: eliminate type casts by using proper u32
declarations.
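As a hedged illustration of the pattern (the helper and parameter
names below are hypothetical, not the actual hwmon code), the idea is
to declare register values as u32 and use mul_u32_u32() from
<linux/math64.h>, which performs a full 32x32->64 bit multiply, so no
(u64) casts of the operands are needed:

  #include <linux/math64.h>
  #include <linux/types.h>

  /* Hypothetical helper: scale a raw u32 register value. */
  static u64 scale_reg_value(u32 val, u32 factor, unsigned int shift)
  {
          /* 32x32->64 multiply without casting the operands to u64. */
          return mul_u32_u32(val, factor) >> shift;
  }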
v2:
- Address review comments. (Karthik)
v3:
- Use the proper u32 type and drop cast. (Lucas De Marchi)
- Modify the variable only when actually using the u64 value.
- Change r value to reg_value with u32 type.
v4:
- Remove newline between trailer and Signed-off-by. (Lucas De Marchi)
- Change reg_val to val for more user-friendly logging.
- Use mul_u32_u32 function since both values are u32.
v5:
- Use the mul_u32_u32 function with a shift. (Lucas De Marchi)
Fixes: 7596d839f6 ("drm/xe/hwmon: Add support to manage power limits though mailbox")
Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250912113458.2815172-1-mallesh.koujalagi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
We already have dedicated helper macros for printing GT-oriented
messages, but we don't have any to print tile-oriented messages,
and we wrongly try to use plain drm or GT-oriented ones instead.
Add tile-oriented printk macros to provide coverage similar to
what we have with the xe_assert() macros. Also add a set of simple
macros for the top-level xe_device, which we could easily tweak to
include extra device-specific info if needed.
Typical output of our printk macros will look like:
[drm] this is xe_WARN()
[drm] *ERROR* this is xe_err()
[drm] *ERROR* this is xe_err_printer()
[drm] this is xe_info()
[drm] this is xe_info_printer()
[drm:printk_demo.cold] this is xe_dbg()
[drm:printk_demo.cold] this is xe_dbg_printer()
[drm] Tile0: this is xe_tile_WARN()
[drm] *ERROR* Tile0: this is xe_tile_err()
[drm] *ERROR* Tile0: this is xe_tile_err_printer()
[drm] Tile0: this is xe_tile_info()
[drm] Tile0: this is xe_tile_info_printer()
[drm:printk_demo.cold] Tile0: this is xe_tile_dbg()
[drm:printk_demo.cold] Tile0: this is xe_tile_dbg_printer()
[drm] Tile0: GT0: this is xe_gt_WARN()
[drm] *ERROR* Tile0: GT0: this is xe_gt_err()
[drm] *ERROR* Tile0: GT0: this is xe_gt_err_printer()
[drm] Tile0: GT0: this is xe_gt_info()
[drm] Tile0: GT0: this is xe_gt_info_printer()
[drm:printk_demo.cold] Tile0: GT0: this is xe_gt_dbg()
[drm:printk_demo.cold] Tile0: GT0: this is xe_gt_dbg_printer()
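A minimal sketch of how such a tile-oriented wrapper can be layered
on top of the drm helpers (assuming the tile struct exposes an id and
a backpointer to the xe device; the actual macro bodies may differ):

  #define xe_tile_printk(_tile, _level, _fmt, ...) \
          drm_##_level(&(_tile)->xe->drm, "Tile%u: " _fmt, \
                       (_tile)->id, ##__VA_ARGS__)

  #define xe_tile_err(_tile, _fmt, ...) \
          xe_tile_printk((_tile), err, _fmt, ##__VA_ARGS__)

  #define xe_tile_dbg(_tile, _fmt, ...) \
          xe_tile_printk((_tile), dbg, _fmt, ##__VA_ARGS__)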
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250909165941.31730-5-michal.wajdeczko@intel.com
All recent platforms (including all the ones officially supported by the
Xe driver) do not allow concurrent execution of RCS and CCS workloads
from different address spaces, with the HW blocking the context switch
when it detects such a scenario.
The DUAL_QUEUE flag helps with this, by causing the GuC to not submit a
context it knows will not be able to execute. This, however, causes a new
problem: if RCS and CCS queues have pending workloads from different
address spaces, the GuC needs to choose which of the two queues to
pick the next workload from. By default, the GuC prioritizes RCS
submissions over CCS ones, which can lead to CCS workloads being
significantly (or completely) starved of execution time.
The driver can tune this by setting a dedicated scheduling policy KLV;
this KLV allows the driver to specify a quantum (in ms) and a ratio
(percentage value between 0 and 100), and the GuC will prioritize the CCS
for that percentage of each quantum.
Given that we want to guarantee enough RCS throughput to avoid missing
frames, we set the yield policy to 20% of each 80ms interval.
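A hedged sketch of how such a KLV could be composed (the key value
and layout below are illustrative; the real definitions come from the
GuC ABI headers):

  #include <linux/bitfield.h>
  #include <linux/bits.h>
  #include <linux/types.h>

  /* Illustrative only; the real key is defined by the GuC ABI. */
  #define RCS_CCS_YIELD_KEY       0x1001
  #define RCS_CCS_YIELD_LEN       2u

  static void emit_ccs_yield_klv(u32 *klv)
  {
          /* 16-bit key and 16-bit data length, then the data dwords. */
          klv[0] = FIELD_PREP(GENMASK(31, 16), RCS_CCS_YIELD_KEY) |
                   FIELD_PREP(GENMASK(15, 0), RCS_CCS_YIELD_LEN);
          klv[1] = 80;    /* quantum in ms */
          klv[2] = 20;    /* CCS ratio, percent of each quantum */
  }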
v2: updated quantum and ratio, improved comment, use xe_guc_submit_disable
in gt_sanitize
Fixes: d9a1ae0d17 ("drm/xe/guc: Enable WA_DUAL_QUEUE for newer platforms")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Tested-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20250905235632.3333247-2-daniele.ceraolospurio@intel.com
This effectively reverts commit 4c3fe5eae4 ("drm/xe/pf: Limit
fair VF LMEM provisioning") since we don't need it any more after
non-contig VRAM allocations were fixed. This allows larger LMEM
auto-provisioning for VFs, so instead of:
[ ] GT0: PF: LMEM available(14096M) fair(1 x 8192M)
[ ] GT0: PF: VF1 provisioned with 8589934592 (8.00 GiB) LMEM
or
[ ] GT0: PF: LMEM available(14096M) fair(2 x 4096M)
[ ] GT0: PF: VF1..VF2 provisioned with 4294967296 (4.00 GiB) LMEM
we may get:
[ ] GT0: PF: LMEM available(14096M) fair(1 x 14096M)
[ ] GT0: PF: VF1 provisioned with 14780727296 (13.8 GiB) LMEM
and
[ ] GT0: PF: LMEM available(14096M) fair(2 x 7048M)
[ ] GT0: PF: VF1..VF2 provisioned with 7390363648 (6.88 GiB) LMEM
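The underlying fair-share math can be sketched as follows (a
simplified model, assuming the reverted limit rounded the share down
to a power of two, which matches the 8192M/4096M values above):

  #include <linux/log2.h>
  #include <linux/math64.h>

  static u64 fair_lmem(u64 available, unsigned int num_vfs, bool limited)
  {
          u64 fair = div_u64(available, num_vfs);

          /* The reverted limit effectively capped the share: */
          return limited ? rounddown_pow_of_two(fair) : fair;
  }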
Fixes: 1e32ffbc9d ("drm/xe/sriov: support non-contig VRAM provisioning")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250910222439.32869-1-michal.wajdeczko@intel.com
GuC has an interface to set a power profile for the SLPC algorithm.
Base mode is the default and ensures balanced performance; power_saving
mode has conservative up/down thresholds and is suitable for use with
apps that typically need to be power efficient. This will result in
lower GT frequencies, thus consuming less power.
The selected power profile will be displayed in this format:
$ cat power_profile
[base] power_saving
$ echo power_saving > power_profile
$ cat power_profile
base [power_saving]
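A minimal sketch of a show() callback that renders the active profile
in brackets (the attribute wiring and the get_active_profile() helper
are assumptions, not the actual SLPC code):

  #include <linux/device.h>
  #include <linux/sysfs.h>

  static const char * const profiles[] = { "base", "power_saving" };

  int get_active_profile(struct device *dev);     /* hypothetical */

  static ssize_t power_profile_show(struct device *dev,
                                    struct device_attribute *attr,
                                    char *buf)
  {
          int active = get_active_profile(dev);
          int i, len = 0;

          for (i = 0; i < ARRAY_SIZE(profiles); i++)
                  len += sysfs_emit_at(buf, len,
                                       i == active ? "%s[%s]" : "%s%s",
                                       i ? " " : "", profiles[i]);
          len += sysfs_emit_at(buf, len, "\n");
          return len;
  }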
v2: Address review comments (Rodrigo)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250903232120.390190-1-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
A common pattern is to create a locked bo, pin it without mapping
and then unlock it. Add a function to do that, which internally
uses xe_validation_guard().
With that we can remove xe_bo_create_locked_range() and add
exhaustive eviction to stolen, pf_provision_vf_lmem and
psmi_alloc_object.
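For reference, a hedged sketch of the open-coded pattern such a helper
replaces (the function name is a placeholder and error handling is
simplified; the real helper additionally wraps this in
xe_validation_guard()):

  static struct xe_bo *create_pin_unlock(struct xe_device *xe,
                                         struct xe_tile *tile,
                                         struct xe_vm *vm, size_t size,
                                         enum ttm_bo_type type, u32 flags)
  {
          struct xe_bo *bo;
          int err;

          bo = xe_bo_create_locked(xe, tile, vm, size, type, flags);
          if (IS_ERR(bo))
                  return bo;

          err = xe_bo_pin(bo);    /* pin without mapping the bo */
          xe_bo_unlock(bo);
          if (err) {
                  xe_bo_put(bo);
                  return ERR_PTR(err);
          }
          return bo;
  }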
v4:
- New patch after reorganization.
v5:
- Replace DRM_XE_GEM_CPU_CACHING_WB with 0. (CI)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-13-thomas.hellstrom@linux.intel.com
Introduce an xe_bo_create_pin_map_novm() function that does not
take the drm_exec parameter, to simplify the conversion of many
callsites.
For the rest, ensure that the same drm_exec context that was used
for locking the vm is passed down to validation.
Use xe_validation_guard() where appropriate.
v2:
- Avoid gotos from within xe_validation_guard(). (Matt Brost)
- Break out the change to pf_provision_vf_lmem to a separate
patch.
- Adapt to signature change of xe_validation_guard().
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-12-thomas.hellstrom@linux.intel.com
Most users of xe_bo_create_pin_map_at() and
xe_bo_create_pin_map_at_aligned() are not using the vm parameter,
and that simplifies conversion. Introduce an
xe_bo_create_pin_map_at_novm() function and make the _aligned()
version static. Use xe_validation_guard() for conversion.
v2:
- Adapt to signature change of xe_validation_guard(). (Matt Brost)
- Fix up documentation.
v4:
- Postpone the change to i915_gem_stolen_insert_node_in_range() to
a later patch.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-11-thomas.hellstrom@linux.intel.com
Convert dma-buf migration to XE_PL_TT and dma-buf import to
support exhaustive eviction, using xe_validation_guard().
It seems unlikely that the import would result in an -ENOMEM,
but convert import anyway for completeness.
The dma-buf map_attachment() functionality unfortunately doesn't
support passing a drm_exec, which means that foreign devices
validating a dma-buf that we exported will not participate in the
exhaustive eviction scheme unless they are xeKMD devices.
v2:
- Avoid gotos from within xe_validation_guard(). (Matt Brost)
- Adapt to signature change of xe_validation_guard(). (Matt Brost)
- Remove an unneeded (void)ret. (Matt Brost)
- Fix up an error path.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-9-thomas.hellstrom@linux.intel.com
The CPU fault handler may populate bos and migrate them, and in
doing so might interfere with other tasks validating.
Rework the CPU fault handler completely into a fastpath
and a slowpath. The fastpath trylocks only the validation lock
in read-mode. If that fails, there's a fallback to the
slowpath, where we do a full validation transaction.
This mandates open-coding of bo locking, bo idling and
bo populating, but we still call into TTM for fault
finalizing.
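A rough sketch of the fastpath/slowpath split (the lock field and the
helper functions are placeholders; the real handler also open-codes bo
locking, idling and populating as described above):

  /* Placeholders for the real driver plumbing: */
  struct xe_device *vmf_to_xe(struct vm_fault *vmf);
  bool fault_fastpath(struct vm_fault *vmf, vm_fault_t *ret);
  vm_fault_t fault_slowpath(struct vm_fault *vmf);

  static vm_fault_t xe_bo_cpu_fault(struct vm_fault *vmf)
  {
          struct xe_device *xe = vmf_to_xe(vmf);
          vm_fault_t ret;
          bool handled = false;

          /* Fastpath: only trylock the validation lock in read mode. */
          if (down_read_trylock(&xe->val_lock)) { /* assumed field */
                  handled = fault_fastpath(vmf, &ret);
                  up_read(&xe->val_lock);
          }

          /* Fallback: a full validation transaction. */
          if (!handled)
                  ret = fault_slowpath(vmf);

          return ret;
  }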
v2:
- Rework the CPU fault handler to actually take part in
the exhaustive eviction scheme (Matthew Brost).
v3:
- Don't return anything but VM_FAULT_RETRY if we've dropped the
mmap_lock. Not even if a signal is pending.
- Rebase on gpu_madvise() and split out fault migration.
- Wait for idle after migration.
- Check whether the resource manager uses TTs to determine
whether to map the tt or iomem.
- Add a number of asserts.
- Allow passing a ttm_operation_ctx to xe_bo_migrate() so that
it's possible to try non-blocking migration.
- Don't fall through to TTM on migration / population error.
Instead remove the gfp_retry_mayfail in mode 2 where we
must succeed. (Matthew Brost)
v5:
- Don't allow faulting in the imported bo case (Matthew Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-7-thomas.hellstrom@linux.intel.com
Convert existing drm_exec transactions, like GT pagefault validation,
non-LR exec() IOCTL and the rebind worker to support
exhaustive eviction using the xe_validation_guard().
v2:
- Adapt to signature change in xe_validation_guard() (Matt Brost)
- Avoid gotos from within xe_validation_guard() (Matt Brost)
- Check error return from xe_validation_guard()
v3:
- Rebase on gpu_madvise()
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1
Link: https://lore.kernel.org/r/20250908101246.65025-6-thomas.hellstrom@linux.intel.com
Introduce a validation wrapper xe_validation_guard() as a helper
intended to be used around drm_exec transactions that perform
validations. Once TTM can handle exhaustive eviction we could
remove this wrapper or make it mostly a NO-OP unless other
functionality is added to it.
Currently the wrapper takes a read lock upon entry and if the
transaction hits an OOM, all locks are released and the
transaction is retried with a write-lock. If all other
validations participate in this scheme, the transaction with
the write lock will be the only transaction validating and
should have access to all available non-pinned memory.
There is currently a problem in that TTM converts -EDEADLK to
-ENOMEM, and with ww_mutex slowpath error injections, we can hit
-ENOMEMs without having actually run out of memory. We abuse
ww_mutex internals to detect such situations until TTM is fixed
to not convert the error code. In the meantime, injecting
ww_mutex slowpath -EDEADLKs is a good way to test
the implementation in the absence of real OOMs.
Just introduce the wrapper in this commit. It will be hooked up
to the driver in following commits.
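Conceptually, the retry scheme can be sketched like this (a
simplification using a plain rwsem and a callback; the real
xe_validation_guard() is a guard-style macro):

  #include <linux/errno.h>
  #include <linux/rwsem.h>

  static int validation_transaction(struct rw_semaphore *lock,
                                    int (*xact)(void *), void *arg)
  {
          int ret;

          down_read(lock);        /* normal case: concurrent validators */
          ret = xact(arg);
          up_read(lock);
          if (ret != -ENOMEM)
                  return ret;

          /*
           * OOM: retry exclusively. With all other validators held
           * off, this transaction can evict all non-pinned memory.
           */
          down_write(lock);
          ret = xact(arg);
          up_write(lock);
          return ret;
  }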
v2:
- Mark class_xe_validation conditional so that the loop is
skipped on initialization error.
- Argument sanitation (Matt Brost)
- Fix conditional execution of xe_validation_ctx_fini()
(Matt Brost)
- Add a no_block mode for upcoming use in the CPU fault handler.
v4:
- Update kerneldoc. (Xe CI).
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-3-thomas.hellstrom@linux.intel.com
We want all validation (potential backing store allocation) to be part
of a drm_exec transaction. Therefore add a drm_exec pointer argument
to xe_bo_validate() and ___xe_bo_create_locked(). Upcoming patches
will deal with making all (or nearly all) calls to these functions
part of a drm_exec transaction. In the meantime, define special values
of the drm_exec pointer:
XE_VALIDATION_UNIMPLEMENTED: Implementation of the drm_exec transaction
has not been done yet.
XE_VALIDATION_UNSUPPORTED: Some middle layers (dma-buf) don't allow
the drm_exec context to be passed down to map_attachment where
validation takes place.
XE_VALIDATION_OPT_OUT: May be used only for kunit tests where exhaustive
eviction isn't crucial and the ROI of converting those is very
small.
For XE_VALIDATION_UNIMPLEMENTED and XE_VALIDATION_OPT_OUT there is also
a lockdep check that a drm_exec transaction can indeed start at the
location where the macro is expanded. This is to encourage
developers to take this into consideration early in the code
development process.
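The special values can be sketched as sentinel pointers (the encodings
and the helper below are illustrative, not the actual definitions):

  struct drm_exec;

  #define XE_VALIDATION_UNIMPLEMENTED     ((struct drm_exec *)-1)
  #define XE_VALIDATION_UNSUPPORTED       ((struct drm_exec *)-2)
  #define XE_VALIDATION_OPT_OUT           ((struct drm_exec *)-3)

  static inline bool xe_validation_exec_is_real(const struct drm_exec *exec)
  {
          return exec && exec != XE_VALIDATION_UNIMPLEMENTED &&
                 exec != XE_VALIDATION_UNSUPPORTED &&
                 exec != XE_VALIDATION_OPT_OUT;
  }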
v2:
- Fix xe_vm_set_validation_exec() imbalance. Add an assert that
hopefully catches future instances of this (Matt Brost)
v3:
- Extend to psmi_alloc_object
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v3
Link: https://lore.kernel.org/r/20250908101246.65025-2-thomas.hellstrom@linux.intel.com
During the second CT initialization step, known as post_hwconfig, we
want to replace the previously registered CT disable devm action to
make sure it will be invoked prior to releasing the underlying BO.
But to replace this action we don't need to execute it right away,
since we know that CT was disabled prior to that late init step
and we already assert that. Use devm_remove_action() instead to
avoid the extra 'disabling CT' message that could be seen now:
[drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
...
DEVRES REM ff11000149320940 guc_action_disable_ct (16 bytes)
[drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
DEVRES ADD ff110001664fc040 guc_action_disable_ct (16 bytes)
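The difference between the two devres helpers can be sketched as below
(the wrapper is a placeholder; guc_action_disable_ct is the callback
visible in the log above):

  #include <linux/device.h>

  void guc_action_disable_ct(void *arg);  /* callback from the log */

  static int replace_ct_disable_action(struct device *dev, void *ct)
  {
          /*
           * devm_release_action() would execute the action now;
           * devm_remove_action() only unregisters it, leaving the
           * already-disabled CT untouched.
           */
          devm_remove_action(dev, guc_action_disable_ct, ct);

          /* Re-register against the lifetime of the new BO. */
          return devm_add_action_or_reset(dev, guc_action_disable_ct, ct);
  }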
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250908102053.539-3-michal.wajdeczko@intel.com
On DGFX, during the init_post_hwconfig() step, we are reinitializing
the CTB BO in VRAM and we have to replace the cleanup action to
disable CT communication prior to the release of the underlying BO.
But that introduces some discrepancy between DGFX and iGFX, as for
iGFX we keep the previously added disable-CT action, which would be
called much later during unwind.
To keep the same flow on both types of platforms, always replace the
old cleanup action and register a new one.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250908102053.539-2-michal.wajdeczko@intel.com
We currently report an L3 bank mask as part of the GT topology on both
GTs (primary and media) because a copy of the L3 bank fuse register
exists on both GTs (e.g., $gsi_offset + 0x9130 on Xe3). After recent
discussions it's come to light that the only known userspace software
that uses this part of the uapi (the compute UMD and Mesa) only uses the
value reported for the primary GT; the value reported for the media GT
is ignored by both projects, and the media UMDs don't have any use for
L3 information today. Since we always strive to have our uapi match the
specific needs of userspace and not include additional unused baggage,
let's officially drop L3 bank reporting on the media GT going forward
and only keep it around for the primary GT where it actually gets used.
This change will only apply to future platforms (Xe3 and later); even
though it would probably be safe to remove it from Xe1/Xe2 as well, we
don't want to take any chances with changing existing ABI.
Note that we'd already disabled reading/reporting of the L3 bank for the
media GT on PTL in commit 9ab440a9d0 ("drm/xe/ptl: L3bank mask is not
available on the media GT") because it was discovered that the copy of
the fuse registers on the media GT was just reporting a bogus ~0 value
rather than an accurate mask. So this is just extending that PTL
behavior forward to WCL and other future platforms. Note that we're
also free to reinstate this part of the uapi in the future if/when some
new userspace consumer emerges that _does_ have a use for media-specific
L3 bank masks.
Cc: Fei Yang <fei.yang@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20250905215614.796247-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
In addition to checking if we are running on BATTLEMAGE,
we should also check that we are not running as a VF driver, as VFs
can't access the necessary registers, and doing so leads to:
| .. [drm] GT0: VF is trying to read an inaccessible register 0x35b004+0x0
| RIP: 0010:xe_gt_sriov_vf_read32+0x5e2/0x8a0 [xe]
| Call Trace:
| xe_mmio_read32+0x110/0x280 [xe]
| read_residency_counter+0x42/0xd0 [xe]
| dgfx_pkg_residencies_show+0x115/0x190 [xe]
| .. [drm] Package G2 counter failed to read, ret -19
or
| .. [drm] GT0: VF is trying to read an inaccessible register 0x35b004+0x0
| RIP: 0010:xe_gt_sriov_vf_read32+0x5e2/0x8a0 [xe]
| Call Trace:
| xe_mmio_read32+0x110/0x280 [xe]
| read_residency_counter+0x42/0xd0 [xe]
| dgfx_pcie_link_residencies_show+0xe7/0x160 [xe]
| .. [drm] PCIE LINK L0 RESIDENCY counter failed to read, ret -19
Similarly, there is no point in exposing inject_csc_hw_error on VFs,
as HW error support is already disabled for VFs.
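A minimal sketch of the combined check (the helper name is a
placeholder; IS_SRIOV_VF() is the driver's existing macro):

  static bool dgfx_residencies_supported(struct xe_device *xe)
  {
          /* VFs can't access the residency counter registers. */
          return xe->info.platform == XE_BATTLEMAGE && !IS_SRIOV_VF(xe);
  }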
Bspec: 53221
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Soham Purkait <soham.purkait@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250905173625.8398-1-michal.wajdeczko@intel.com