VFs use a relay transaction during the resume/reset flow and use
of the GFP_KERNEL flag may conflict with the reclaim:
-> #0 (fs_reclaim){+.+.}-{0:0}:
[ ] __lock_acquire+0x1874/0x2bc0
[ ] lock_acquire+0xd2/0x310
[ ] fs_reclaim_acquire+0xc5/0x100
[ ] mempool_alloc_noprof+0x5c/0x1b0
[ ] __relay_get_transaction+0xdc/0xa10 [xe]
[ ] relay_send_to+0x251/0xe50 [xe]
[ ] xe_guc_relay_send_to_pf+0x79/0x3a0 [xe]
[ ] xe_gt_sriov_vf_connect+0x90/0x4d0 [xe]
[ ] xe_uc_init_hw+0x157/0x3b0 [xe]
[ ] do_gt_restart+0x1ae/0x650 [xe]
[ ] xe_gt_resume+0xb6/0x120 [xe]
[ ] xe_pm_runtime_resume+0x15b/0x370 [xe]
[ ] xe_pci_runtime_resume+0x73/0x90 [xe]
[ ] pci_pm_runtime_resume+0xa0/0x100
[ ] __rpm_callback+0x4d/0x170
[ ] rpm_callback+0x64/0x70
[ ] rpm_resume+0x594/0x790
[ ] __pm_runtime_resume+0x4e/0x90
[ ] xe_pm_runtime_get_ioctl+0x9c/0x160 [xe]
Since we have a preallocated pool of relay transactions, which
should cover all our normal relay use cases, we may use the
GFP_NOWAIT flag when allocating new outgoing transactions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Tested-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250131153713.808-1-michal.wajdeczko@intel.com
The driver needs to know if a BO is encrypted with PXP to enable the
display decryption at flip time.
Furthermore, we want to keep track of the status of the encryption and
reject any operation that involves a BO that is encrypted using an old
key. There are two points in time where such checks can kick in:
1 - at VM bind time, all operations except for unmapping will be
rejected if the key used to encrypt the BO is no longer valid. This
check is opt-in via a new VM_BIND flag, to avoid a scenario where a
malicious app purposely shares an invalid BO with a non-PXP aware
app (such as a compositor). If the VM_BIND was failed, the
compositor would be unable to display anything at all. Allowing the
bind to go through means that output still works, it just displays
garbage data within the bounds of the illegal BO.
2 - at job submission time, if the queue is marked as using PXP, all
objects bound to the VM will be checked and the submission will be
rejected if any of them was encrypted with a key that is no longer
valid.
Note that there is no risk of leaking the encrypted data if a user does
not opt-in to those checks; the only consequence is that the user will
not realize that the encryption key is changed and that the data is no
longer valid.
v2: Better commnnts and descriptions (John), rebase
v3: Properly return the result of key_assign up the stack, do not use
xe_bo in display headers (Jani)
v4: improve key_instance variable documentation (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-11-daniele.ceraolospurio@intel.com
Userspace is required to mark a queue as using PXP to guarantee that the
PXP instructions will work. In addition to managing the PXP sessions,
when a PXP queue is created the driver will set the relevant bits in
its context control register.
On submission of a valid PXP queue, the driver will validate all
encrypted objects mapped to the VM to ensured they were encrypted with
the current key.
v2: Remove pxp_types include outside of PXP code (Jani), better comments
and code cleanup (John)
v3: split the internal PXP management to a separate patch for ease of
review. re-order ioctl checks to always return -EINVAL if parameters are
invalid, rebase on msix changes.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-9-daniele.ceraolospurio@intel.com
We expect every queue that uses PXP to be marked as doing so, to allow
the driver to correctly manage the encryption status. The API for doing
this from userspace is coming in the next patch, while this patch
implement the management side of things. When a PXP queue is created,
the driver will do the following:
- Start the default PXP session if it is not already running;
- assign an rpm ref to the queue to keep for its lifetime (this is
required because PXP HWDRM sessions are killed by the HW suspend flow).
Since PXP start and termination can race each other, this patch also
introduces locking and a state machine to keep track of the pending
operations. Note that since we'll need to take the lock from the
suspend/resume paths as well, we can't do submissions while holding it,
which means we need a slightly more complicated state machine to keep
track of intermediate steps.
v4: new patch in the series, split from the following interface patch to
keep review manageable. Lock and status rework to not do submissions
under lock.
v5: Improve comments and error logs (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-8-daniele.ceraolospurio@intel.com
PXP requires submissions to the HW for the following operations
1) Key invalidation, done via the VCS engine
2) Communication with the GSC FW for session management, done via the
GSCCS.
Key invalidation submissions are serialized (only 1 termination can be
serviced at a given time) and done via GGTT, so we can allocate a simple
BO and a kernel queue for it.
Submissions for session management are tied to a PXP client (identified
by a unique host_session_id); from the GSC POV this is a user-accessible
construct, so all related submission must be done via PPGTT. The driver
does not currently support PPGTT submission from within the kernel, so
to add this support, the following changes have been included:
- a new type of kernel-owned VM (marked as GSC), required to ensure we
don't use fault mode on the engine and to mark the different lock
usage with lockdep.
- a new function to map a BO into a VM from within the kernel.
v2: improve comments and function name, remove unneeded include (John)
v3: fix variable/function names in documentation
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-3-daniele.ceraolospurio@intel.com
Now that interrupts are disabled for xe_display_init_noaccel,
both xe_display_init_noirq and xe_display_init_noaccel run in the same
context.
This means that we can get rid of the 3 different init calls. Without
interrupts, nothing is touching display up to this point.
Unify those 3 early display calls into a single xe_display_init_early(),
this makes the init sequence cleaner, and display less tangled during
init.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250121142850.4960-3-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
We're changing the driver to have no interrupts during early init for
Xe, so we poll the PIPE_FRMSTMSMP counter instead.
Interrupts cannot be enabled during FB readout because memirq's requires
an allocation. This would overwrite the FB we want to read out.
While it might be possible to also run do the same in i915 and run
it without interrupts, the platforms i915 supports had a less clear
distinction between display and graphics. For this reason I choose
only to touch Xe for now.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250121142850.4960-1-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Backmerge drm-next to get the common APIs and refactors as well as
getting the display changes from i915 in xe so the probe order can be
improved.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
GuC firmware counts received VF configuration KLVs and may start
validation of the complete VF config even if some resources where
unprovisioned in the meantime, leading to unexpected errors like:
$ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota
$ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/contexts_quota
$ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota
$ echo 0 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/doorbells_quota
$ echo 1 | sudo tee /sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota
tee: '/sys/kernel/debug/dri/0000:00:02.0/gt0/vf1/ggtt_quota': Input/output error
To mitigate this problem trigger explicit VF config reset after
unprovisioning any of the critical resources (GGTT, context or
doorbell IDs) that GuC is monitoring.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129195947.764-3-michal.wajdeczko@intel.com
The following page fault was observed duringthe KFD process release.
In this particular error case, the HIP test (./MemcpyPerformance -h)
does not require the queue. As a result, the process_context_addr was
not assigned when the KFD process was released, ultimately leading to
this page fault during the execution of the function
kfd_process_dequeue_from_all_devices().
[345962.294891] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:153 vmid:0 pasid:0)
[345962.295333] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[345962.295775] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000B33
[345962.296097] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[345962.296394] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[345962.296633] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x1
[345962.296876] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[345962.297135] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[345962.297377] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[345962.297682] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:169 vmid:0 pasid:0)
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The purpose of halt_if_hws_hang is to preserve GPU state for driver
debugging when queue preemption fails. Issuing per-queue reset may
kill wavefronts which caused the preemption failure.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
Enable boot survivability mode if pcode initialization fails and
if boot status indicates a failure. In this mode, drm card is not
exposed and driver probe returns success after loading the bare minimum
to allow firmware to be flashed via mei.
v2: abstract survivability mode variable
add BMG check inside function (Jani, Rodrigo)
v3: return -EBUSY during system suspend (Anshuman)
check survivability mode in pci probe only
on error
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250128095632.1294722-3-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Boot Survivability is a software based workflow for recovering a system
in a failed boot state. Here system recoverability is concerned with
recovering the firmware responsible for boot.
This is implemented by loading the driver with bare minimum (no drm card)
to allow the firmware to be flashed through mei-gsc and collect telemetry.
The driver's probe flow is modified such that it enters survivability mode
when pcode initialization is incomplete and boot status denotes a failure.
In this mode, drm card is not exposed and presence of survivability_mode
entry in PCI sysfs is used to indicate survivability mode and
provide additional information required for debug
This patch adds initialization functions and exposes admin
readable sysfs entries
The new sysfs will have the below layout
/sys/bus/.../bdf
├── survivability_mode
v2: reorder headers
fix doc
remove survivability info and use mode to display information
use separate function for logging survivability information
for critical error (Rodrigo)
v3: use for loop
use dev logs instead of drm
use helper function for aux history(Rodrigo)
remove unnecessary error check of greater than max_scratch
as we are reading only 3 bit
v4: fix checkpatch warnings
fix space (Rodrigo)
rename register
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Acked-by: Ashwin Kumar Kulkarni <ashwin.kumar.kulkarni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250128095632.1294722-2-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Commit 70fb86a85d ("drm/xe: Revert some changes that break a mesa
debug tool") partially reverted some changes to workaround breakage
caused to mesa tools. However, in doing so it also broke fetching the
GuC log via debugfs since xe_print_blob_ascii85() simply bails out.
The fix is to avoid the extra newlines: the devcoredump interface is
line-oriented and adding random newlines in the middle breaks it. If a
tool is able to parse it by looking at the data and checking for chars
that are out of the ascii85 space, it can still do so. A format change
that breaks the line-oriented output on devcoredump however needs better
coordination with existing tools.
v2: Add suffix description comment
v3: Reword explanation of xe_print_blob_ascii85() calling drm_puts()
in a loop
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: stable@vger.kernel.org
Fixes: 70fb86a85d ("drm/xe: Revert some changes that break a mesa debug tool")
Fixes: ec1455ce7e ("drm/xe/devcoredump: Add ASCII85 dump helper function")
Link: https://patchwork.freedesktop.org/patch/msgid/20250123202307.95103-2-jose.souza@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The steering code needs to know slice/subslice counts and this
information should be retrieved from the hwconfig table. However,
earlier platforms don't have it, hence the KMD has a fallback path.
Newer platforms really should have the entries and if they are missing
that is a bug that needs to be fixed in the table.
So update the complaint to be an error on newer platforms and remove
it completely for older ones that we know are bad (but are not POR for
the Xe driver anyway). Also, re-word the message a little to make it
clearer what the issue is.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250118005403.2960807-1-John.C.Harrison@Intel.com
Provide a PMU interface for GT C6 residency counters. The interface is
similar to the one available for i915, but gt is passed in the config
when creating the event.
Sample usage and output:
$ perf list | grep gt-c6
xe_0000_00_02.0/gt-c6-residency/ [Kernel PMU event]
$ tail /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency*
==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency <==
event=0x01
==> /sys/bus/event_source/devices/xe_0000_00_02.0/events/gt-c6-residency.unit <==
ms
$ perf stat -e xe_0000_00_02.0/gt-c6-residency,gt=0/ -I1000
# time counts unit events
1.001196056 1,001 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
2.005216219 1,003 ms xe_0000_00_02.0/gt-c6-residency,gt=0/
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250124050411.2189060-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>