Michal Wajdeczko
33f17e2cbd
drm/xe/pf: Reset GuC VF config when unprovisioning critical resource
...
GuC firmware counts received VF configuration KLVs and may start
validation of the complete VF config even if some resources where
unprovisioned in the meantime, leading to unexpected errors like:
$ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota
$ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota
$ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota
$ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota
$ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota
tee: '/sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota': Input/output error
To mitigate this problem trigger explicit VF config reset after
unprovisioning any of the critical resources (GGTT, context or
doorbell IDs) that GuC is monitoring.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250129195947.764-3-michal.wajdeczko@intel.com
2025-01-30 17:10:41 +01:00
Michal Wajdeczko
21ccac0e22
drm/xe/pf: Don't send BEGIN_ID if VF has no context/doorbells
...
It turned out that GuC validates VF configuration immediately
after receiving "some" set of configuration KLVs and complains
if one of the critical, from GuC understanding, resource is left
unprovisioned, even if PF should be still allowed to make late VF
config adjustments, since VF was not yet started.
This issue was discovered after we decided to asynchronously
re-send configuration KLVs after GT reset/resume, as then fair
VF auto-provisioning could already allocate some of the resources,
which was a prerequiste for sending those config KLVs:
# fair GGTT provisioning
[] xe 0000:00:02.0: [drm] GT0: PF: pushed VF1 config with 2 KLVs:
[] xe 0000:00:02.0: [drm] GT0: { key 0x0001 : 64b value 0x176a000 } # ggtt_start
[] xe 0000:00:02.0: [drm] GT0: { key 0x0002 : 64b value 0xfd696000 } # ggtt_size
[] xe 0000:00:02.0: [drm] GT0: PF: VF1 provisioned with 4251541504 (3.96 GiB) GGTT
# re-provisioning worker
[] xe 0000:00:02.0: [drm] *ERROR* GT0: H2G request 0x5503 failed: error 0x60 hint 0x0
[] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 14 config KLVs (-EIO)
[] xe 0000:00:02.0: [drm] GT0: { key 0x0001 : 64b value 0x176a000 } # ggtt_start
[] xe 0000:00:02.0: [drm] GT0: { key 0x0002 : 64b value 0xfd696000 } # ggtt_size
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a0b : 32b value 0 } # begin_ctx_id
[] xe 0000:00:02.0: [drm] GT0: { key 0x0004 : 32b value 0 } # num_contexts
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a0a : 32b value 0 } # begin_db_id
[] xe 0000:00:02.0: [drm] GT0: { key 0x0006 : 32b value 0 } # num_doorbells
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a01 : 32b value 0 } # exec_quantum
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a02 : 32b value 0 } # preempt_timeout
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a03 : 32b value 0 } # cat_error_count
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a04 : 32b value 0 } # engine_reset_count
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a05 : 32b value 0 } # page_fault_count
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a06 : 32b value 0 } # guc_time_us
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a07 : 32b value 0 } # irq_time_us
[] xe 0000:00:02.0: [drm] GT0: { key 0x8a08 : 32b value 0 } # doorbell_time_us
[] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-EIO)
To avoid such errors stop sending BEGIN_CONTEXT/DOORBELL_ID KLVs
if no GuC context/doorbell IDs were provisioned to VF.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4176
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250129195947.764-2-michal.wajdeczko@intel.com
2025-01-30 17:10:39 +01:00
Francois Dugast
8f6ddb4ab5
drm/xe/gt_pagefault: Print engine class string
...
The engine class index which is printed here is an internal representation
for debugging. It is _not_ an index based on DRM_XE_ENGINE_CLASS_* values
provided in the uAPI. Add the string representation of the engine class to
the output in order to limit possible confusion by users when analyzing the
logs.
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250129175241.338043-1-francois.dugast@intel.com
Signed-off-by: Francois Dugast <francois.dugast@intel.com >
2025-01-30 09:41:06 +01:00
Lucas De Marchi
7748289df5
drm/xe/guc: Fix size_t print format
...
Use %zx format to print size_t to remove the following warning when
building for i386:
>> drivers/gpu/drm/xe/xe_guc_ct.c:1727:43: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
1727 | drm_printf(p, "[CTB].length: 0x%lx\n", snapshot->ctb_size);
| ~~~ ^~~~~~~~~~~~~~~~~~
| %zx
Cc: José Roberto de Souza <jose.souza@intel.com >
Reported-by: kernel test robot <lkp@intel.com >
Closes: https://lore.kernel.org/oe-kbuild-all/202501281627.H6nj184e-lkp@intel.com/
Fixes: cb1f868ca1 ("drm/xe: Make GUC binaries dump consistent with other binaries in devcoredump")
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250128154242.3371687-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-29 10:53:21 -08:00
Rodrigo Vivi
55d4b69861
Revert "drm/xe/lnl: Enable GuC SLPC DCC task"
...
This reverts commit 50554bf3e5 .
DCC in LNL should be disabled. It was a mistake to decide
to go against GuC platform defaults in this case and this
could lead to regressions in some TDP limited scenarios
instead of helping.
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com >
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250128223248.660748-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2025-01-29 09:10:59 -05:00
Matt Atwood
16016ade13
drm/xe/ptl: Update the PTL pci id table
...
Update to current bspec table.
Bspec: 72574
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com >
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250128175102.45797-1-matthew.s.atwood@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2025-01-28 18:27:46 -05:00
Shekhar Chauhan
fa8ffaae1b
drm/xe/bmg: Add new PCI IDs
...
Add 3 new PCI IDs for BMG.
v2: Fix typo -> Replace '.' with ','
Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com >
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250128162015.3288675-1-shekhar.chauhan@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2025-01-28 17:47:44 -05:00
Riana Tauro
8b47c9cdb6
drm/xe: Initialize mei-gsc and vsec in survivability mode
...
Initialize mei-gsc in survivability mode and disable HECI
interrupts. Also initialize vsec in survivability mode
Signed-off-by: Riana Tauro <riana.tauro@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250128095632.1294722-4-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2025-01-28 08:58:46 -05:00
Riana Tauro
256daa32c9
drm/xe: Enable Boot Survivability mode
...
Enable boot survivability mode if pcode initialization fails and
if boot status indicates a failure. In this mode, drm card is not
exposed and driver probe returns success after loading the bare minimum
to allow firmware to be flashed via mei.
v2: abstract survivability mode variable
add BMG check inside function (Jani, Rodrigo)
v3: return -EBUSY during system suspend (Anshuman)
check survivability mode in pci probe only
on error
Signed-off-by: Riana Tauro <riana.tauro@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250128095632.1294722-3-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2025-01-28 08:58:46 -05:00
Riana Tauro
5e940312a2
drm/xe: Add functions and sysfs for boot survivability
...
Boot Survivability is a software based workflow for recovering a system
in a failed boot state. Here system recoverability is concerned with
recovering the firmware responsible for boot.
This is implemented by loading the driver with bare minimum (no drm card)
to allow the firmware to be flashed through mei-gsc and collect telemetry.
The driver's probe flow is modified such that it enters survivability mode
when pcode initialization is incomplete and boot status denotes a failure.
In this mode, drm card is not exposed and presence of survivability_mode
entry in PCI sysfs is used to indicate survivability mode and
provide additional information required for debug
This patch adds initialization functions and exposes admin
readable sysfs entries
The new sysfs will have the below layout
/sys/bus/.../bdf
├── survivability_mode
v2: reorder headers
fix doc
remove survivability info and use mode to display information
use separate function for logging survivability information
for critical error (Rodrigo)
v3: use for loop
use dev logs instead of drm
use helper function for aux history(Rodrigo)
remove unnecessary error check of greater than max_scratch
as we are reading only 3 bit
v4: fix checkpatch warnings
fix space (Rodrigo)
rename register
Signed-off-by: Riana Tauro <riana.tauro@intel.com >
Acked-by: Ashwin Kumar Kulkarni <ashwin.kumar.kulkarni@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250128095632.1294722-2-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2025-01-28 08:58:45 -05:00
José Roberto de Souza
cb1f868ca1
drm/xe: Make GUC binaries dump consistent with other binaries in devcoredump
...
All other(hwsp, hwctx and vmas) binaries follow this format:
[name].length: 0x1000
[name].data: xxxxxxx
[name].error: errno
The error one is just in case by some reason it was not able to
capture the binary.
So this GuC binaries should follow the same patern.
v2:
- renamed GUC binary to LOG
Cc: John Harrison <John.C.Harrison@Intel.com >
Cc: Lucas De Marchi <lucas.demarchi@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250123202307.95103-3-jose.souza@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 19:41:07 -08:00
Lucas De Marchi
2c95bbf500
drm/xe: Fix and re-enable xe_print_blob_ascii85()
...
Commit 70fb86a85d ("drm/xe: Revert some changes that break a mesa
debug tool") partially reverted some changes to workaround breakage
caused to mesa tools. However, in doing so it also broke fetching the
GuC log via debugfs since xe_print_blob_ascii85() simply bails out.
The fix is to avoid the extra newlines: the devcoredump interface is
line-oriented and adding random newlines in the middle breaks it. If a
tool is able to parse it by looking at the data and checking for chars
that are out of the ascii85 space, it can still do so. A format change
that breaks the line-oriented output on devcoredump however needs better
coordination with existing tools.
v2: Add suffix description comment
v3: Reword explanation of xe_print_blob_ascii85() calling drm_puts()
in a loop
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Cc: John Harrison <John.C.Harrison@Intel.com >
Cc: Julia Filipchuk <julia.filipchuk@intel.com >
Cc: José Roberto de Souza <jose.souza@intel.com >
Cc: stable@vger.kernel.org
Fixes: 70fb86a85d ("drm/xe: Revert some changes that break a mesa debug tool")
Fixes: ec1455ce7e ("drm/xe/devcoredump: Add ASCII85 dump helper function")
Link: https://patchwork.freedesktop.org/patch/msgid/20250123202307.95103-2-jose.souza@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 19:40:00 -08:00
Lucas De Marchi
a37934ea75
drm/xe/devcoredump: Move exec queue snapshot to Contexts section
...
Having the exec queue snapshot inside a "GuC CT" section was always
wrong. Commit c28fd6c358 ("drm/xe/devcoredump: Improve section
headings and add tile info") tried to fix that bug, but with that also
broke the mesa tool that parses the devcoredump, hence it was reverted
in commit 70fb86a85d ("drm/xe: Revert some changes that break a mesa
debug tool").
With the mesa tool also fixed, this can propagate as a fix on both
kernel and userspace side to avoid unnecessary headache for a debug
feature.
Cc: John Harrison <John.C.Harrison@Intel.com >
Cc: Julia Filipchuk <julia.filipchuk@intel.com >
Cc: José Roberto de Souza <jose.souza@intel.com >
Cc: stable@vger.kernel.org
Fixes: 70fb86a85d ("drm/xe: Revert some changes that break a mesa debug tool")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250123051112.1938193-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 15:10:02 -08:00
John Harrison
ef34861098
drm/xe: Upgrade complaint about missing slice info
...
The steering code needs to know slice/subslice counts and this
information should be retrieved from the hwconfig table. However,
earlier platforms don't have it, hence the KMD has a fallback path.
Newer platforms really should have the entries and if they are missing
that is a bug that needs to be fixed in the table.
So update the complaint to be an error on newer platforms and remove
it completely for older ones that we know are bad (but are not POR for
the Xe driver anyway). Also, re-word the message a little to make it
clearer what the issue is.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Reviewed-by: Stuart Summers <stuart.summers@intel.com >
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250118005403.2960807-1-John.C.Harrison@Intel.com
2025-01-27 12:25:19 -08:00
Michal Wajdeczko
a4d1c5d0b9
drm/xe/pf: Move VFs reprovisioning to worker
...
Since the GuC is reset during GT reset, we need to re-send the
entire SR-IOV provisioning configuration to the GuC. But since
this whole configuration is protected by the PF master mutex and
we can't avoid making allocations under this mutex (like during
LMEM provisioning), we can't do this reprovisioning from gt-reset
path if we want to be reclaim-safe. Move VFs reprovisioning to a
async worker that we will start from the gt-reset path.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Reviewed-by: Stuart Summers <stuart.summers@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250125215505.720-1-michal.wajdeczko@intel.com
2025-01-27 20:34:18 +01:00
Michal Wajdeczko
14b6674608
drm/xe/pf: Use GuC Buffer Cache during policy provisioning
...
Start using GuC buffer cache for the SRIOV policy configuration
actions. This is a required step before we could declare SRIOV
PF as being a reclaim safe.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com >
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250124185247.676-1-michal.wajdeczko@intel.com
2025-01-27 19:53:59 +01:00
Vinay Belgaumkar
897286f294
drm/xe/pmu: Add GT C6 events
...
Provide a PMU interface for GT C6 residency counters. The interface is
similar to the one available for i915, but gt is passed in the config
when creating the event.
Sample usage and output:
$ perf list | grep gt-c6
xe_0000_00_02.0/gt-c6-residency/ [Kernel PMU event]
$ tail /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency*
==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency <==
event=0x01
==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency.unit <==
ms
$ perf stat -e xe_0000_00_02.0/gt-c6-residency,gt=0/ -I1000
# time counts unit events
1.001196056 1,001 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
2.005216219 1,003 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Riana Tauro <riana.tauro@intel.com >
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250124050411.2189060-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 08:56:27 -08:00
Lucas De Marchi
6ea5bf169a
drm/xe/pmu: Add attribute skeleton
...
Add the generic support for defining new attributes. This only adds
the macros and common infra for the event counters, but no counters
yet. This is going to be added as follow up changes.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250124050411.2189060-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 08:55:04 -08:00
Lucas De Marchi
4ee64041bc
drm/xe/pmu: Get/put runtime pm on event init
...
When the event is created, make sure runtime pm is taken and later put:
in order to read an event counter the GPU needs to remain accessible and
doing a get/put during perf's read is not possible it's holding a
raw_spinlock.
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250124050411.2189060-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 08:55:03 -08:00
Lucas De Marchi
ef7ce39386
drm/xe/pmu: Extract xe_pmu_event_update()
...
Like other pmu drivers, keep the update separate from the read so it can
be called from other methods (like stop()) without side effects.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250124050411.2189060-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 08:55:03 -08:00
Lucas De Marchi
257a10c18e
drm/xe/pmu: Assert max gt
...
XE_PMU_MAX_GT needs to be used due to a circular dependency, but we
should make sure it doesn't go out of sync with XE_PMU_MAX_GT. Add a
compile check for that.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250124050411.2189060-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 08:55:03 -08:00
Vinay Belgaumkar
011c1e246a
drm/xe/pmu: Enable PMU interface
...
Basic PMU enabling patch. Setup the basic framework
for adding events.
Based on previous versions by Bommu Krishnaiah, Aravind Iddamsetty and
Riana Tauro, using i915 and rapl as reference implementations.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250124050411.2189060-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-27 08:54:06 -08:00
Ashutosh Dixit
d3fedff828
drm/xe/oa: Set stream->pollin in xe_oa_buffer_check_unlocked
...
We rely on stream->pollin to decide whether or not to block during
poll/read calls. However, currently there are blocking read code paths
which don't even set stream->pollin. The best place to consistently set
stream->pollin for all code paths is therefore to set it in
xe_oa_buffer_check_unlocked.
Fixes: e936f885f1 ("drm/xe/oa/uapi: Expose OA stream fd")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com >
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com >
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250115222029.3002103-1-ashutosh.dixit@intel.com
2025-01-23 18:07:03 -08:00
Vinay Belgaumkar
dddc53806d
drm/xe/ptl: Apply Wa_13011645652
...
Extend Wa_13011645652 to PTL.
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Reviewed-by: Stuart Summers <stuart.summers@intel.com >
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250116184659.384874-1-vinay.belgaumkar@intel.com
2025-01-22 14:20:27 -08:00
Lucas De Marchi
e0a4cd6ace
MAINTAINERS: Also exclude xe for drm-misc
...
When the xe driver was added, it didn't extend the exclude entries for
drm-misc, as done in commit 5a44d50f00 ("MAINTAINERS: Update drm-misc
entry to match all drivers"). Exclude it like is done for i915 and other
drivers with dedicated maintainers.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250117164529.393503-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-21 13:06:51 -08:00
Michal Wajdeczko
5994018ecf
drm/xe/guc: Fix sizeof(32) typo
...
A small typo leads to the following static code checker warning:
drivers/gpu/drm/xe/xe_guc_buf.c:81 xe_guc_buf_reserve()
warn: sizeof(NUMBER)?
Reported-by: Dan Carpenter <dan.carpenter@linaro.org >
Closes: https://lore.kernel.org/intel-xe/0d5bcbf1-79f9-4a10-a221-ddbaec9f6122@stanley.mountain/
Fixes: 696bfdf273 ("drm/xe/guc: Introduce the GuC Buffer Cache")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Dan Carpenter <dan.carpenter@linaro.org >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Stuart Summers <stuart.summers@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250121094832.588-1-michal.wajdeczko@intel.com
2025-01-21 22:01:28 +01:00
Michal Wajdeczko
9ebb5846e1
drm/xe/pf: Fix migration initialization
...
The migration support only needs to be initialized once, but it
was incorrectly called from the xe_gt_sriov_pf_init_hw(), which
is part of the reset flow and may be called multiple times.
Fixes: d86e3737c7 ("drm/xe/pf: Add functions to save and restore VF GuC state")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Michał Winiarski <michal.winiarski@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250120232443.544-1-michal.wajdeczko@intel.com
2025-01-21 21:52:47 +01:00
Ashutosh Dixit
cfa9d40db8
drm/xe/oa: Preserve oa_ctrl unused bits
...
UMD's have interest in setting unused bits of the oa_ctrl register "out of
band" for certain experiments. To facilitate this, don't clobber previous
oa_ctrl unused bits, i.e. rmw the values rather than simply write them.
Fixes: e936f885f1 ("drm/xe/oa/uapi: Expose OA stream fd")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com >
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250117032155.3048063-1-ashutosh.dixit@intel.com
2025-01-21 09:29:47 -08:00
Maarten Lankhorst
380b0cdaa7
drm/xe: Move suballocator init to after display init
...
No allocations should be done before we have had a chance to preserve
the display fb.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241210083111.230484-4-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se >
2025-01-21 14:59:38 +01:00
Rodrigo Vivi
a46ea12eca
drm/xe/uapi: Fix documentation indentation
...
Fix these issues:
Documentation/gpu/driver-uapi:29: include/uapi/drm/xe_drm.h:817: WARNING:
+Bullet list ends without a blank line; unexpected unindent.
Documentation/gpu/driver-uapi:29: include/uapi/drm/xe_drm.h:835: WARNING:
+Definition list ends without a blank line; unexpected unindent.
Fixes: 75d37750a7 ("drm/xe/mmap: Add mmap support for PCI memory barrier")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au >
Closes: https://lore.kernel.org/intel-xe/20250117164023.3fdc00b9@canb.auug.org.au/
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com >
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com >
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250117193827.91779-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2025-01-21 08:45:28 -05:00
Maarten Lankhorst
f3b5945780
drm/xe: Do not attempt to bootstrap VF in execlists mode
...
It was mentioned in a review that there is a possibility of choosing
to load the module with VF in execlists mode.
Of course this doesn't work, just bomb out as hard as possible.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241210083111.230484-12-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se >
2025-01-21 14:21:25 +01:00
Satyanarayana K V P
173baa1b2d
drm/xe: Suppress printing of mode when running in non-sriov mode
...
The xe_sriov_probe_early() function prints the sriov pf/vf mode on
driver probe. When running in non-sriov mode, the below debug message
is seen.
"Running in none mode".
This print does not convey any information. This commit suppresses this
debug message and shows only when running in PF/VF mode.
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com >
Cc: Michał Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250116055617.20611-1-satyanarayana.k.v.p@intel.com
2025-01-19 00:39:45 +01:00
Michal Wajdeczko
238f96315a
drm/xe/kunit: Add KUnit tests for GuC Buffer Cache
...
Add tests to make sure that recently added GuC Buffer Cache
component is working as expected.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250114192140.1039-1-michal.wajdeczko@intel.com
2025-01-19 00:12:07 +01:00
Michal Wajdeczko
f90b552dcb
drm/xe/kunit: Allow to replace xe_managed_bo_create_pin_map()
...
We want to use replacement functions in upcoming kunit tests.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-11-michal.wajdeczko@intel.com
2025-01-19 00:12:06 +01:00
Michal Wajdeczko
d8b2149ba8
drm/xe/pf: Use GuC Buffer Cache during VFs provisioning
...
Start using GuC buffer cache for the VF's configuration actions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-10-michal.wajdeczko@intel.com
2025-01-19 00:12:05 +01:00
Michal Wajdeczko
696bfdf273
drm/xe/guc: Introduce the GuC Buffer Cache
...
The purpose of the GuC Buffer Cache is to maintain a set ofreusable
buffers that could be used while sending some of the CTB H2G actions
that require separate buffer with indirect data. Currently only few
PF actions need this so initialize it only when running as a PF.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-9-michal.wajdeczko@intel.com
2025-01-19 00:12:03 +01:00
Michal Wajdeczko
c49ca67181
drm/xe/sa: Minor header cleanups
...
Drop unused struct xe_bo forward declaration and, while around,
fix unnecessary line split in xe_sa_bo_free() declaration.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-8-michal.wajdeczko@intel.com
2025-01-19 00:12:02 +01:00
Michal Wajdeczko
ae8b507fb8
drm/xe/sa: Allow creating suballocator with custom guard size
...
Actual xe_sa_manager implementation uses hardcoded 4K to exclude
it from making suballocations but in upcoming patch we want to
reuse the xe_sa_manager where such 4K guard is not needed. Add
another variant of the xe_sa_bo_manager_init() function that
accepts arbitrary guard size.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-7-michal.wajdeczko@intel.com
2025-01-19 00:12:00 +01:00
Michal Wajdeczko
0e1871f61e
drm/xe/sa: Allow making suballocations using custom gfp flags
...
Actual xe_sa_manager implementation uses hardcoded GFP_KERNEL flag
during creation of suballocations but in upcoming patch we want to
reuse the xe_sa_manager in places where GFP_KERNEL is not allowed.
Add another variant of the xe_sa_bo_new() function that accepts
arbitrary gfp flags.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-6-michal.wajdeczko@intel.com
2025-01-19 00:11:59 +01:00
Michal Wajdeczko
7e937cdf18
drm/xe/sa: Tidy up coding style in init()
...
There is no need to use tile_to_xe() since we already got the xe.
And we should keep all variable declarations together, no need for
separate sa_manager declaration.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-5-michal.wajdeczko@intel.com
2025-01-19 00:11:58 +01:00
Michal Wajdeczko
97ee0e351f
drm/xe/sa: Improve error message on init failure
...
Instead of raw errno value we can print friendly error code and
also print size of the buffer object that we fail to prepare.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-4-michal.wajdeczko@intel.com
2025-01-19 00:11:57 +01:00
Michal Wajdeczko
d29cddd49b
drm/xe/sa: Drop redundant NULL assignments
...
The sa_manager is drmm_kzalloc'ed so all members are already zero.
And in case of kvzalloc() failure we are not returning pointer to
the sa_manager at all, so no point in resetting .bo member.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-3-michal.wajdeczko@intel.com
2025-01-19 00:11:57 +01:00
Michal Wajdeczko
9cd3f4efc8
drm/xe/sa: Always call drm_suballoc_manager_fini()
...
After successful call to drm_suballoc_manager_init() we should
make sure to call drm_suballoc_manager_fini() as it may include
some cleanup code even if we didn't start using it for real.
As we can abort init() early due to kvzalloc() failure, we should
either explicitly call drm_suballoc_manager_fini() or, even better,
postpone drm_suballoc_manager_init() once we finish all other
preparation steps, so we can rely on fini() that will do cleanup.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241220194205.995-2-michal.wajdeczko@intel.com
2025-01-19 00:11:56 +01:00
Michal Wajdeczko
13265fe742
drm/xe/vf: Perform early GT MMIO initialization to read GMDID
...
VFs need to communicate with the GuC to obtain the GMDID value
and existing GuC functions used for that assume that the GT has
it's MMIO members already setup. However, due to recent refactoring
the gt->mmio is initialized later, and any attempt by the VF to use
xe_mmio_read|write() from GuC functions will lead to NPD crash due
to unset MMIO register address:
[] xe 0000:00:02.1: [drm] Running in SR-IOV VF mode
[] xe 0000:00:02.1: [drm] GT0: sending H2G MMIO 0x5507
[] BUG: unable to handle page fault for address: 0000000000190240
Since we are already tweaking the id and type of the primary GT to
mimic it's a Media GT before initializing the GuC communication,
we can also call xe_gt_mmio_init() to perform early setup of the
gt->mmio which will make those GuC functions work again.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matt Roper <matthew.d.roper@intel.com >
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com >
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250114211347.1083-1-michal.wajdeczko@intel.com
2025-01-18 22:04:59 +01:00
Michal Wajdeczko
bbd8429264
drm/xe: Always setup GT MMIO adjustment data
...
While we believed that xe_gt_mmio_init() will be called just once
per GT, this might not be a case due to some tweaks that need to
performed by the VF driver during early probe. To avoid leaving
any stale data in case of the re-run, reset the GT MMIO adjustment
data for the non-media GT case.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matt Roper <matthew.d.roper@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241114175955.2299-2-michal.wajdeczko@intel.com
2025-01-18 22:03:42 +01:00
Francois Dugast
474c4dd29f
drm/xe: Add missing SPDX license identifiers
...
Ensure all Xe driver files have a proper SPDX license identifier, add it
in files where it was missing.
Link: https://patchwork.freedesktop.org/patch/msgid/20250116124532.1480351-1-francois.dugast@intel.com
Signed-off-by: Francois Dugast <francois.dugast@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
2025-01-17 15:25:52 +01:00
Oak Zeng
b824709ee1
drm/xe: Fix a typo in xe_vm_doc.h
...
s/vm->ttm.base.resv->lock/vm->gpuvm.r_obj->resv->lock
Signed-off-by: Oak Zeng <oak.zeng@intel.com >
Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20250113212324.3264218-1-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
2025-01-17 00:01:58 +05:30
Oak Zeng
22b1a53f28
drm/xe: Print vm parameter in xe_vma trace
...
Print the vm that the vma belongs to in the vma trace.
This is useful to correlate VMA operations to the VM.
Signed-off-by: Oak Zeng <oak.zeng@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241218164833.2364049-4-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
2025-01-17 00:00:59 +05:30
Oak Zeng
861b27584d
drm/xe: Print vm flags in xe_vm trace print
...
Print vm flags in xe_vm trace print. This is helpful
to diagnosis the VM mode of operation.
Signed-off-by: Oak Zeng <oak.zeng@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241218164833.2364049-3-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
2025-01-17 00:00:59 +05:30
Oak Zeng
63060df6f7
drm/xe: trace bo create
...
Add a tracepoint to trace bo create.
Signed-off-by: Oak Zeng <oak.zeng@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20241218164833.2364049-2-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
2025-01-17 00:00:59 +05:30