Commit Graph

1281927 Commits

Author SHA1 Message Date
Akshata Jahagirdar
8a92e2a67f drm/xe/migrate: Add kunit to test migration functionality for BMG
This part of kunit verifies that
- main data is decompressed and ccs data is clear post bo eviction.
- main data is raw copied and ccs data is clear post bo restore.

v2: Added missing bo_put()/bo_unlock() (Matt Auld)

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1d36d4377c566508e42b3fb80d3fe4a588fd00ca.1721250309.git.akshata.jahagirdar@intel.com
2024-07-17 17:02:31 -07:00
Akshata Jahagirdar
523f191cc0 drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx
During eviction (vram->sysmem), we use compressed -> uncompressed mapping.
During restore (sysmem->vram), we need to use mapping from
uncompressed -> uncompressed.
Handle logic for selecting the compressed identity map for eviction,
and selecting uncompressed map for restore operations.
v2: Move check of xe_migrate_ccs_emit() before calling
xe_migrate_ccs_copy(). (Nirmoy)

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/79b3a016e686a662ae68c32b5fc7f0f2ac8043e9.1721250309.git.akshata.jahagirdar@intel.com
2024-07-17 17:02:31 -07:00
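
The map selection described in 523f191cc0 could be sketched roughly as below. This is a standalone illustration, not the driver code: the enum names and `select_src_map()` are hypothetical, and the `unified_compression` parameter stands in for whatever platform check the driver actually performs.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the driver's identifiers. */
enum vram_identity_map {
	IDENTITY_MAP_UNCOMPRESSED,
	IDENTITY_MAP_COMPRESSED,
};

enum mem_region {
	REGION_SYSMEM,
	REGION_VRAM,
};

/*
 * Pick the identity map used for the VRAM side of a copy.
 * Only an eviction (VRAM source, system-memory destination) on a
 * platform with unified compression goes through the compressed map,
 * so the data is decompressed on its way out. A restore, and every
 * copy on other platforms, uses the uncompressed map.
 */
static enum vram_identity_map
select_src_map(enum mem_region src, enum mem_region dst,
	       bool unified_compression)
{
	if (unified_compression && src == REGION_VRAM && dst == REGION_SYSMEM)
		return IDENTITY_MAP_COMPRESSED;

	return IDENTITY_MAP_UNCOMPRESSED;
}
```

Keeping the decision in one small function makes the asymmetry between eviction and restore easy to see and to test in isolation.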
Akshata Jahagirdar
2b808d6b29 drm/xe/xe2: Introduce identity map for compressed pat for vram
Xe2+ has unified compression (exactly one compression mode/format),
where compression is now controlled via PAT at PTE level.
This simplifies KMD operations, as it can now decompress freely
without concern for the buffer's original compression format, unlike DG2,
which had multiple compression formats and thus required copying the
raw CCS state during VRAM eviction. In addition, mixed VRAM and system
memory buffers were not supported with compression enabled.

On Xe2 dGPU, compression is still only supported with VRAM; however, we
can now support compression with mixed VRAM and system memory buffers,
with GPU access being seamless underneath, so long as the KMD uses
compressed -> uncompressed when doing VRAM -> system memory, to
decompress it. This also allows CPU access to such buffers,
assuming that userspace first decompresses the corresponding
pages being accessed.
If the pages are already in system memory then KMD would have already
decompressed them. When restoring such buffers with sysmem -> VRAM
the KMD can't easily know which pages were originally compressed,
so we always use uncompressed -> uncompressed here.
With this it also means we can drop all the raw CCS handling on such
platforms (including needing to allocate extra CCS storage).

In order to support this we now need to have two different identity
mappings for compressed and uncompressed VRAM.
In this patch, we set up the additional identity map for the VRAM with
compressed pat_index. We then select the appropriate mapping during
migration/clear. During eviction (vram->sysmem), we use the mapping
from compressed -> uncompressed. During restore (sysmem->vram), we need
the mapping from uncompressed -> uncompressed.

v2: Formatting nits, Updated code to match recent changes in
    xe_migrate_prepare_vm(). (Matt)

v3: Move identity map loop to a helper function. (Matt Brost)

v4: Split helper function in different patch, and
	add asserts and nits. (Matt Brost)

v5: Convert the 2 bool arguments of pte_update_size to flags
	argument (Matt Brost)

v6: Formatting nits (Matt Brost)

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b00db5c7267e54260cb6183ba24b15c1e6ae52a3.1721250309.git.akshata.jahagirdar@intel.com
2024-07-17 17:02:30 -07:00
Akshata Jahagirdar
8d79acd567 drm/xe/migrate: Add helper function to program identity map
Add a helper function to program the identity map.

v2: Formatting nits

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/91dc05f05bd33076fb9a9f74f8495b48d2abff53.1721250309.git.akshata.jahagirdar@intel.com
2024-07-17 17:02:29 -07:00
Akshata Jahagirdar
54f07cfc01 drm/xe/migrate: Add kunit to test clear functionality
This test verifies if the main and ccs data are cleared during bo creation.
The motivation to use Kunit instead of IGT is that, although we can verify
whether the data is zero following bo creation,
we cannot confirm whether the zero value after bo creation is the result of
our clear function or simply because the initial data present was zero.

v2: Updated the mutex_lock and unlock logic,
    Changed out_unlock to out_put. (Matt)

v3: Added missing dma_fence_put(). (Nirmoy)

v4: Rebase.

v5: Add missing bo_put(), bo_unlock() calls. (Matt Auld)

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c07603439b88cfc99e78c0e2069327e65d5aa87d.1721250309.git.akshata.jahagirdar@intel.com
2024-07-17 17:02:28 -07:00
Akshata Jahagirdar
108c972a11 drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
For Xe2 dGPU, we clear the bo by modifying the VRAM using an
uncompressed pat index, which then indirectly updates the
compression status to uncompressed, i.e. zeroed CCS.
So xe_migrate_clear() should be updated for BMG to not
emit CCS surf copy commands.

v2: Moved xe_device_needs_ccs_emit() to xe_migrate.c and changed
name to xe_migrate_needs_ccs_emit() since it's very specific to
migration. (Matt)

Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8dd869dd8dda5e17ace28c04f1a48675f5540874.1721250309.git.akshata.jahagirdar@intel.com
2024-07-17 17:02:27 -07:00
Matthew Brost
452bca0edb drm/xe: Don't suspend device upon wedge
When wedging a device, we shouldn't suspend the device, as state needed
for debugging will be lost.

Also this appears to not work as the below stack trace pops upon trying
to resume a wedged device:

[  304.245044] INFO: task cat:12115 blocked for more than 151 seconds.
[  304.251333]       Tainted: G        W          6.10.0-rc7-xe+ #3518
[  304.257617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  304.265459] task:cat             state:D stack:13384 pid:12115 tgid:12115 ppid:3986   flags:0x00000006
[  304.265465] Call Trace:
[  304.265467]  <TASK>
[  304.265469]  __schedule+0x3c4/0xdf0
[  304.265478]  schedule+0x3c/0x140
[  304.265481]  rpm_resume+0x1cc/0x740
[  304.265484]  ? __pfx_autoremove_wake_function+0x10/0x10
[  304.265489]  __pm_runtime_resume+0x49/0x80
[  304.265494]  guc_info+0x6b/0xb0 [xe]
[  304.265538]  ? __pfx___drm_printfn_seq_file+0x10/0x10
[  304.265541]  ? __pfx___drm_puts_seq_file+0x10/0x10
[  304.265545]  seq_read_iter+0x111/0x4c0
[  304.265551]  seq_read+0xfc/0x140
[  304.265556]  full_proxy_read+0x58/0x80
[  304.265560]  vfs_read+0xa7/0x360
[  304.265563]  ? find_held_lock+0x2b/0x80
[  304.265568]  ksys_read+0x64/0xe0
[  304.265571]  do_syscall_64+0x68/0x140
[  304.265575]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  304.265578] RIP: 0033:0x7f4254d14992
[  304.265580] RSP: 002b:00007ffc558666f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  304.265583] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f4254d14992
[  304.265584] RDX: 0000000000020000 RSI: 00007f4254ebb000 RDI: 0000000000000003
[  304.265586] RBP: 00007f4254ebb000 R08: 00007f4254eba010 R09: 00007f4254eba010
[  304.265587] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[  304.265588] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[  304.265593]  </TASK>
[  304.265594]
               Showing all locks held in the system:
[  304.265598] 1 lock held by khungtaskd/57:
[  304.265599]  #0: ffffffff8273b860 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x1c0
[  304.265607] 3 locks held by kworker/6:1/90:
[  304.265610] 1 lock held by in:imklog/547:
[  304.265611]  #0: ffff88810498cd88 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x76/0xc0
[  304.265620] 1 lock held by dmesg/1310:

v2: Drop local 'err' variable (Jonathan)

Fixes: 8ed9aaae39 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-2-matthew.brost@intel.com
2024-07-17 12:01:34 -07:00
Matthew Brost
7dbe8af13c drm/xe: Wedge the entire device
Wedge the entire device, not just GT which may have triggered the wedge.
To implement this, cleanup the layering so xe_device_declare_wedged()
calls into the lower layers (GT) to ensure entire device is wedged.

While we are here, also signal any pending GT TLB invalidations upon
wedging device.

Lastly, short circuit reset wait if device is wedged.

v2:
 - Short circuit reset wait if device is wedged (Local testing)

Fixes: 8ed9aaae39 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
2024-07-17 11:58:26 -07:00
Alexander Usyskin
e02cea83d3 drm/xe/gsc: add Battlemage support
Add heci_cscfi support bit for new CSC engine type.
It has same mmio offsets as DG2 GSC but separate interrupt flow.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708084906.2827024-1-alexander.usyskin@intel.com
2024-07-17 09:47:15 -07:00
Michal Wajdeczko
45d30c828c drm/xe/vf: Track writes to inaccessible registers from VF
Only a limited set of registers is accessible to the VF driver, and
the hardware will silently drop writes to inaccessible registers.
To improve our VF driver, let's intercept all such writes to warn
about them on debug builds or optionally provide a substitution
(as a potential future extension).

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240713142643.1242-2-michal.wajdeczko@intel.com
2024-07-15 15:18:34 +02:00
Tejas Upadhyay
86c5b70a9c drm/xe/xe2: Add Wa_15015404425
Wa_15015404425 asks us to perform four "dummy" writes to a
non-existent register offset before every real register read.
Although the specific offset of the writes doesn't directly
matter, the workaround suggests offset 0x130030 as a good target
so that these writes will be easy to recognize and filter out in
debugging traces.

V5(MattR):
  - Avoid negating an equality comparison
V4(MattR):
  - Use writel and remove xe_reg usage
V3(MattR):
  - Define dummy reg local to function
  - Avoid tracing dummy writes
  - Update commit message
V2:
  - Add WA to 8/16/32bit reads also - MattR
  - Corrected dummy reg address - MattR
  - Use for loop to avoid mental pause - JaniN

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240709155606.2998941-1-tejas.upadhyay@intel.com
2024-07-12 16:44:00 -07:00
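
The read sequence that 86c5b70a9c describes for Wa_15015404425 might look like the sketch below. It runs against a fake in-memory register file rather than real MMIO, and `struct fake_mmio`, `fake_write32()`, `fake_read32_with_wa()` and the `needs_wa` flag are all hypothetical; only the offset 0x130030 and the count of four writes come from the commit message.

```c
#include <stdint.h>

#define DUMMY_REG_OFFSET  0x130030u /* offset suggested by the workaround */
#define DUMMY_WRITE_COUNT 4
#define FAKE_REG_COUNT    16

/* Hypothetical stand-in for an MMIO window; no hardware is touched. */
struct fake_mmio {
	uint32_t regs[FAKE_REG_COUNT];
	unsigned int dummy_writes; /* writes seen at DUMMY_REG_OFFSET */
};

static void fake_write32(struct fake_mmio *mmio, uint32_t offset, uint32_t val)
{
	if (offset == DUMMY_REG_OFFSET) {
		/* No register lives here: count the write, drop the value. */
		mmio->dummy_writes++;
		return;
	}
	mmio->regs[(offset / 4) % FAKE_REG_COUNT] = val;
}

/* Issue the four dummy writes, then perform the real read. */
static uint32_t fake_read32_with_wa(struct fake_mmio *mmio, uint32_t offset,
				    int needs_wa)
{
	if (needs_wa) {
		for (int i = 0; i < DUMMY_WRITE_COUNT; i++)
			fake_write32(mmio, DUMMY_REG_OFFSET, 0);
	}
	return mmio->regs[(offset / 4) % FAKE_REG_COUNT];
}
```

Using a fixed, otherwise unused offset is what makes the dummy writes easy to filter out of a trace, which is the reason the commit message gives for choosing 0x130030.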
Michal Wajdeczko
4c3fe5eae4 drm/xe/pf: Limit fair VF LMEM provisioning
Due to the current design of the BO and VRAM manager, any object
with XE_BO_FLAG_PINNED flag, which the PF driver uses during VF
LMEM provisioning, is created with the TTM_PL_FLAG_CONTIGUOUS
flag, which may cause VRAM fragmentation that prevents subsequent
allocations of larger objects, like fair VF LMEM provisioning.

To avoid such failures, round down the fair VF LMEM provisioning size
to the nearest lower power of two, to compensate for what
xe_ttm_vram_mgr is doing to achieve contiguous allocations.

Fixes: ac6598aed1 ("drm/xe/pf: Add support to configure SR-IOV VFs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-12 13:45:56 -07:00
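
The rounding described in 4c3fe5eae4 can be illustrated in plain C as below. This is a userspace sketch under assumptions: `round_down_pow2()` and `fair_lmem_per_vf()` are hypothetical names, and the even split across VFs is a simplification of whatever the PF driver actually computes. In kernel code the `rounddown_pow_of_two()` helper from `<linux/log2.h>` serves the same purpose.

```c
#include <stdint.h>

/* Largest power of two that is <= size; returns 0 for size == 0. */
static uint64_t round_down_pow2(uint64_t size)
{
	uint64_t p = 1;

	if (size == 0)
		return 0;

	/* Comparing against size / 2 avoids overflowing p. */
	while (p <= size / 2)
		p *= 2;

	return p;
}

/* Hypothetical fair share: split evenly, then round down. */
static uint64_t fair_lmem_per_vf(uint64_t available, unsigned int num_vfs)
{
	if (num_vfs == 0)
		return 0;

	return round_down_pow2(available / num_vfs);
}
```

Rounding down trades some unused VRAM for a size that a contiguous allocator is more likely to satisfy for every VF.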
Ashutosh Dixit
43a6faa6d9 drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup
Increment num_syncs after xe_sync_entry_parse() is successful to ensure
the xe_sync_entry_cleanup() logic under "err_syncs" label works correctly.

v2: Use the same pattern as that in xe_vm.c (Matt Brost)

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711211203.3728180-1-ashutosh.dixit@intel.com
2024-07-12 07:51:53 -07:00
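
The pattern fixed in 43a6faa6d9 (count an entry only after it has been parsed successfully, so the error path cleans up exactly the initialized entries) might be sketched as follows. The types and functions here are hypothetical stand-ins, not the xe_sync API, and the `cleaned` out-parameter exists only to make the behavior observable.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sync entry. */
struct sync_entry {
	int initialized;
};

/* Hypothetical parse step: fails on negative input. */
static int sync_entry_parse(struct sync_entry *e, int input)
{
	if (input < 0)
		return -1;
	e->initialized = 1;
	return 0;
}

static void sync_entry_cleanup(struct sync_entry *e)
{
	e->initialized = 0;
}

/*
 * Parse every input. On failure, clean up only the entries that were
 * parsed successfully; *cleaned reports how many that was.
 */
static int parse_all(struct sync_entry *syncs, const int *inputs,
		     size_t count, size_t *cleaned)
{
	size_t num_syncs = 0;
	int err = 0;

	*cleaned = 0;
	for (size_t i = 0; i < count; i++) {
		err = sync_entry_parse(&syncs[num_syncs], inputs[i]);
		if (err)
			goto err_syncs;
		num_syncs++; /* increment only after a successful parse */
	}
	return 0;

err_syncs:
	while (num_syncs--) {
		sync_entry_cleanup(&syncs[num_syncs]);
		(*cleaned)++;
	}
	return err;
}
```

Had the counter been incremented before the parse call, the cleanup loop would also visit the entry whose parse failed and was never initialized.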
Michal Wajdeczko
e97701a069 drm/xe/kunit: Simplify xe_mocs live tests code layout
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.

But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-5-michal.wajdeczko@intel.com
2024-07-12 10:49:51 +02:00
Michal Wajdeczko
0237368193 drm/xe/kunit: Simplify xe_migrate live tests code layout
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.

But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-4-michal.wajdeczko@intel.com
2024-07-12 10:49:49 +02:00
Michal Wajdeczko
ff10c99ab1 drm/xe/kunit: Simplify xe_dma_buf live tests code layout
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.

But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-3-michal.wajdeczko@intel.com
2024-07-12 10:49:47 +02:00
Michal Wajdeczko
d6e850acc7 drm/xe/kunit: Simplify xe_bo live tests code layout
The test case logic is implemented by the functions compiled as
part of the core Xe driver module and then exported to build and
register the test suite in the live test module.

But we don't need to export individual test case functions, we may
just export the entire test suite. And we don't need to register
this test suite in a separate file, it can be done in the main
file of the live test module.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-2-michal.wajdeczko@intel.com
2024-07-12 10:49:45 +02:00
Michal Wajdeczko
57c2b3e684 drm/xe/kunit: Drop XE_TEST_EXPORT
It's unused and can be replaced with VISIBLE_IF_KUNIT if needed.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240705191057.1110-3-michal.wajdeczko@intel.com
2024-07-12 10:34:49 +02:00
Michal Wajdeczko
bd85e00fa4 drm/xe/kunit: Kill xe_cur_kunit()
We shouldn't use a custom helper if there is an official one.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240705191057.1110-2-michal.wajdeczko@intel.com
2024-07-12 10:34:46 +02:00
José Roberto de Souza
f6ca930d97 drm/xe: Add process name and PID to job timedout message
This will be very helpful for Mesa CI, which uses the PID to match
the exact test that caused the timeout/GPU hang and mark that test as
failing.

Also print the process name, as it might be relevant for human
readers.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240710213149.57662-1-jose.souza@intel.com
2024-07-11 13:44:15 -07:00
Tejas Upadhyay
71733b8d7f drm/xe/xe2: Make subsequent L2 flush sequential
Issuing a flush on top of an ongoing flush is not desirable.
Let's use a lock to make it sequential.

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240710052750.3031586-1-tejas.upadhyay@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-07-11 12:36:55 +02:00
Nirmoy Das
33891539f9 drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal
Free arbiter allocated in intel_hdcp_gsc_init().

Fixes: 152f2df954 ("drm/xe/hdcp: Enable HDCP for XE")
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Arun R Murthy <arun.r.murthy@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708125918.23573-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-07-10 11:19:56 +02:00
Lucas De Marchi
ea74bf9ccb drm/xe: Generate oob before compiling anything
Instead of continuing to add more dependencies as WAs are needed in different
places of the driver, just add a rule with all the objects so the code
generation happens before anything else.

While at it, group lines related to wa_oob in the Makefile.

v2: Prefix $(obj) when declaring dependency

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708213041.1734028-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09 23:27:48 -07:00
Lucas De Marchi
3d122660dc drm/xe/gt: Remove double include
The header generated/xe_wa_oob.h is included twice. Remove one.

Fixes: 01570b4469 ("drm/xe/bmg: implement Wa_16023588340")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202407052122.AzuWSPuo-lkp@intel.com/
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708173301.1543871-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09 07:37:35 -07:00
Bommu Krishnaiah
56ab698699 drm/xe/xe2lpg: Extend workaround 14021402888
Workaround 14021402888 also applies to Xe2_LPG.
Replicate the existing entry to one specific for Xe2_LPG.

Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703090754.1323647-1-krishnaiah.bommu@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09 07:14:06 -07:00
Matthew Brost
caaf1f44a6 drm/xe: Drop trace_xe_hw_fence_free
fence->ctx may be stale memory when trace_xe_hw_fence_free is called,
resulting in a UAF bug when deriving the device name. This tracepoint is not
all that useful, so just drop it.

Fixes: 501c4255c4 ("drm/xe/trace: Print device_id in xe_trace events")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708211008.956384-1-matthew.brost@intel.com
2024-07-08 15:15:02 -07:00
Ngai-Mint Kwan
74e3076800 drm/xe/xe2lpm: Extend Wa_16021639441
Wa_16021639441 applies to Xe2_LPM.

Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701184637.531794-1-ngai-mint.kwan@linux.intel.com
2024-07-08 08:25:16 -07:00
Thomas Hellström
01e0cfc994 drm/xe: Use write-back caching mode for system memory on DGFX
The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.

Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.

So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.

Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.

v2:
- Update RB- and Ack tags.
- Rephrase wording in xe_drm.h (Matt Roper)
v3:
- Really rephrase wording.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 622f709ca6 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Effie Yu <effie.yu@intel.com> #On chat
Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com
2024-07-06 11:05:46 +02:00
Matthew Auld
c55f79f317 drm/i915: disable fbc due to Wa_16023588340
On BMG-G21 we need to disable fbc due to complications around the WA.

v2:
 - Try to handle with i915_drv.h and compat layer. (Rodrigo)
v3:
 - For simplicity retreat back to the original design for now.
 - Drop the extra \ from the Makefile (Jani)

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Vinod Govindapillai <vinod.govindapillai@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-4-matthew.auld@intel.com
2024-07-05 09:53:14 +01:00
Matthew Auld
01570b4469 drm/xe/bmg: implement Wa_16023588340
This involves enabling l2 caching of host side memory access to VRAM
through the CPU BAR. The main fallout here is with display since VRAM
writes from CPU can now be cached in GPU l2, and display is never
coherent with caches, so needs various manual flushing.  In the case of
fbc we disable it due to complications in getting this to work
correctly (in a later patch).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-3-matthew.auld@intel.com
2024-07-05 09:53:12 +01:00
Michal Wajdeczko
3078d9c8b6 drm/xe: Use VF_CAP_REG for device wmb
To force a write barrier on the device memory, we write to the
SOFTWARE_FLAGS_SPR33 register, but this particular register was
selected because it was one of the writable and unused registers.

Since a write barrier should also work if we use a read-only
register, switch to the VF_CAP_REG register, which is also marked as
accessible for VFs.

While at it, add simple kernel-doc for xe_device_wmb() function.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-4-michal.wajdeczko@intel.com
2024-07-04 11:55:40 +02:00
Michal Wajdeczko
466a6c3855 drm/xe: Kill regs/xe_sriov_regs.h
There is no real benefit to maintain a separate file. The register
definitions related to SR-IOV can be placed in existing headers.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-3-michal.wajdeczko@intel.com
2024-07-04 11:54:35 +02:00
Michal Wajdeczko
9dae9751c7 drm/xe: Fix register definition order in xe_regs.h
Swap XEHP_CLOCK_GATE_DIS(0x101014) with GU_DEBUG(0x101018).

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-2-michal.wajdeczko@intel.com
2024-07-04 11:53:48 +02:00
Matthew Brost
04e9c0ce19 drm/xe: Add VM bind IOCTL error injection
Add VM bind IOCTL error injection, which steals the MSB of the bind flags
field; if set, it injects errors at various points in the VM bind
IOCTL. Intended to validate error paths. Enabled by CONFIG_DRM_XE_DEBUG.

v4:
 - Change define layout (Jonathan)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-8-matthew.brost@intel.com
2024-07-03 22:28:07 -07:00
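
The idea in 04e9c0ce19 of reserving the top bit of a flags word for error injection might be sketched as below. Everything here is an assumption made for illustration: the macro and function names are hypothetical, `DEBUG_BUILD` stands in for CONFIG_DRM_XE_DEBUG, and the error code returned is arbitrary.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical name: the MSB of a 32-bit flags field. */
#define BIND_FLAG_INJECT_ERROR (UINT32_C(1) << 31)

#ifndef DEBUG_BUILD
#define DEBUG_BUILD 1 /* stand-in for CONFIG_DRM_XE_DEBUG */
#endif

/*
 * Called at a point in the bind path where a real failure could occur.
 * Returns a negative errno when injection is requested in a debug
 * build, 0 otherwise.
 */
static int bind_step(uint32_t flags)
{
	if (DEBUG_BUILD && (flags & BIND_FLAG_INJECT_ERROR))
		return -ENOSPC; /* arbitrary error chosen for the sketch */

	/* ... the real work of the step would happen here ... */
	return 0;
}
```

Gating the check on a build-time constant lets the compiler drop the injection branch entirely in non-debug builds.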
Matthew Brost
a708f6501c drm/xe: Update PT layer with better error handling
Update PT layer so if a memory allocation for a PTE fails the error can
be propagated to the user without requiring the VM to be killed.

v5:
 - change return value invalidation_fence_init to void (Matthew Auld)
v7:
 - Invert i,j usage in two places (Matthew Auld)
 - s/0/NULL (Matthew Auld)
 - Don't ignore return value of xe_pt_new_shared (Matthew Auld)
 - Don't check for NULL in xe_pt_entry (Matthew Auld)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-7-matthew.brost@intel.com
2024-07-03 22:28:06 -07:00
Matthew Brost
282e6f846d drm/xe: Update VM trace events
The trace events have changed with the move to a single job per VM bind
IOCTL; update the trace events to align with the old behavior as much as
possible.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-6-matthew.brost@intel.com
2024-07-03 22:28:06 -07:00
Matthew Brost
e8babb280b drm/xe: Convert multiple bind ops into single job
This aligns with the uAPI, in which an array of binds or a single bind
that results in multiple GPUVA ops is considered a single atomic
operation.

The design is roughly:
- xe_vma_ops is a list of xe_vma_op (GPUVA op)
- each xe_vma_op resolves to 0-3 PT ops
- xe_vma_ops creates a single job
- if at any point during binding a failure occurs, xe_vma_ops contains
  the information necessary to unwind the PT and VMA (GPUVA) state

v2:
 - add missing dma-resv slot reservation (CI, testing)
v4:
 - Fix TLB invalidation (Paulo)
 - Add missing xe_sched_job_last_fence_add/test_dep check (Inspection)
v5:
 - Invert i, j usage (Matthew Auld)
 - Add helper to test and add job dep (Matthew Auld)
 - Return on anything but -ETIME for cpu bind (Matthew Auld)
 - Return -ENOBUFS if suballoc of BB fails due to size (Matthew Auld)
 - s/do/Do (Matthew Auld)
 - Add missing comma (Matthew Auld)
 - Do not assign return value to xe_range_fence_insert (Matthew Auld)
v6:
 - s/0x1ff/MAX_PTE_PER_SDI (Matthew Auld, CI)
 - Check for too large an SA in Xe to avoid triggering WARN (Matthew Auld)
 - Fix checkpatch issues
v7:
 - Rebase
 - Support more than 510 PTEs updates in a bind job (Paulo, mesa testing)
v8:
 - Rebase

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-5-matthew.brost@intel.com
2024-07-03 22:28:04 -07:00
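
The all-or-nothing behavior outlined in e8babb280b (a list of ops, each resolving to 0-3 PT ops, with enough state kept to unwind on failure) could be modeled roughly as below. The structures are hypothetical simplifications and do not mirror the real xe_vma_ops layout; `should_fail` simulates a failure partway through binding.

```c
#include <stddef.h>

#define MAX_PT_OPS_PER_VMA_OP 3

/* Hypothetical model of one page-table operation. */
struct pt_op {
	int applied;
	int should_fail; /* simulates a failure while binding */
};

/* Hypothetical model of one GPUVA op resolving to 0-3 PT ops. */
struct vma_op {
	struct pt_op pt_ops[MAX_PT_OPS_PER_VMA_OP];
	size_t num_pt_ops;
};

/*
 * Apply every PT op of every VMA op as one unit. If any step fails,
 * undo what was already applied, in reverse order, and return -1.
 */
static int apply_vma_ops(struct vma_op *ops, size_t num_ops)
{
	size_t i, j;

	for (i = 0; i < num_ops; i++) {
		for (j = 0; j < ops[i].num_pt_ops; j++) {
			if (ops[i].pt_ops[j].should_fail)
				goto unwind;
			ops[i].pt_ops[j].applied = 1;
		}
	}
	return 0;

unwind:
	/* Undo the partially applied op first... */
	while (j--)
		ops[i].pt_ops[j].applied = 0;
	/* ...then every earlier op, last to first. */
	while (i--) {
		for (j = ops[i].num_pt_ops; j-- > 0; )
			ops[i].pt_ops[j].applied = 0;
	}
	return -1;
}
```

After a failed call no PT op is left applied, which is the property that lets the caller report the error without tearing down the whole VM.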
Matthew Brost
96e7ebb220 drm/xe: Add xe_exec_queue_last_fence_test_dep
Helpful to determine if a bind can immediately use the CPU or needs to be
deferred to a drm scheduler job.

v7:
 - Better wording in kernel doc (Matthew Auld)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-4-matthew.brost@intel.com
2024-07-03 22:27:02 -07:00
Matthew Brost
2e524668c4 drm/xe: Add xe_vm_pgtable_update_op to xe_vma_ops
Each xe_vma_op resolves to 0-3 pt_ops. Add storage for the pt_ops to
xe_vma_ops, which is dynamically allocated based on the number and types of
xe_vma_op in the xe_vma_ops list. Only the allocation is implemented in this
patch.

This will help with converting xe_vma_ops (multiple xe_vma_op) into an
atomic update unit.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-3-matthew.brost@intel.com
2024-07-03 22:27:00 -07:00
Matthew Brost
67d90d679e drm/xe: s/xe_tile_migrate_engine/xe_tile_migrate_exec_queue
Engine is old nomenclature, replace with exec queue.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-2-matthew.brost@intel.com
2024-07-03 22:26:59 -07:00
Ashutosh Dixit
8169b2097d drm/xe/uapi: Rename xe perf layer as xe observation layer
In Xe, the perf layer allows capture of HW counter streams. These HW
counters are generally performance related but don't have to be necessarily
so. Also, the name "perf" is a carryover from i915 and is not preferred.

Here we propose the name "observation" for this common layer which allows
capture of different types of these counter streams.

v2: Rename observability layer to observation layer (Lucas/Rodrigo)
v3: Rename sysctl file to "observation_paranoid" (Jose)

Fixes: 52c2e956dc ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types")
Fixes: fe8929bdf8 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl")
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703164801.2561423-1-ashutosh.dixit@intel.com
2024-07-03 16:46:02 -07:00
Matthew Brost
627c961d67 drm/xe: Add timeout to preempt fences
To adhere to dma fencing rules that fences must signal within a
reasonable amount of time, add a 5 second timeout to preempt fences. If
this timeout occurs, kill the associated VM as this is fatal to the VM.

v2:
 - Add comment for smp_wmb (Checkpatch)
 - Fix kernel doc typo (Inspection)
 - Add comment for killed check (Niranjana)
v3:
 - Drop smp_wmb (Matthew Auld)
 - Don't take vm->lock in preempt fence worker (Matthew Auld)
 - Drop RB given changes to patch
v4:
 - Add WRITE/READ_ONCE (Niranjana)
 - Don't export xe_vm_kill (Niranjana)

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240626004137.4060806-1-matthew.brost@intel.com
2024-07-03 15:27:50 -07:00
Michal Wajdeczko
7c0389c615 drm/xe/guc: Demote GuC IDs usage message to debug
Printing message at INFO level about available GuC IDs is not that
important, DEBUG level is enough. It will also match message about
available doorbells:

 [ ] xe ... [drm:xe_guc_id_mgr_init [xe]] GT0: using 65535 GuC IDs
 [ ] xe ... [drm:xe_guc_db_mgr_init [xe]] GT0: using 256 doorbells

While at it, use proper "GuC" name.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701193030.978-1-michal.wajdeczko@intel.com
2024-07-02 18:33:19 +02:00
Vinay Belgaumkar
aaa08078e7 drm/xe/bmg: Apply Wa_22019338487
Extend this WA to BMG GT as well. In this case media GT is
not affected. The cap frequencies and max allowed ggtt writes
are different as well. On BMG, we need to do a flush after 1100
GGTT writes, and we need to limit the GT frequency request
to 2133 MHz during driver load and leave it at that value after
driver unloads.
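
The write-count half of the workaround can be pictured as a simple
counter. This is only an illustrative sketch: the struct, the function
and the macro name are hypothetical, and flushing exactly on the 1100th
write is an assumption about where the threshold falls. Only the value
1100 comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Threshold taken from the text above; the macro name is made up. */
#define DEMO_BMG_MAX_GGTT_WRITES 1100u

struct demo_ggtt {
	unsigned int writes_since_flush;
	unsigned int flushes;	/* stands in for issuing a real flush */
};

/* Record one GGTT write; returns true if this write triggered a flush. */
static bool demo_ggtt_note_write(struct demo_ggtt *g)
{
	assert(g);
	if (++g->writes_since_flush < DEMO_BMG_MAX_GGTT_WRITES)
		return false;

	/* Threshold reached: flush and start counting again. */
	g->writes_since_flush = 0;
	g->flushes++;
	return true;
}
```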

v3: Fix checkpatch issue

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701231529.2582452-2-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-02 12:14:00 -04:00
Vinay Belgaumkar
b0b2b50cdb drm/xe/guc: Prevent use of uninitialized mutex
When skip_guc_pc is set and/or this is for a VF.

Fixes: 3b1592fb78 ("drm/xe/lnl: Apply Wa_22019338487")
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701231529.2582452-1-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-02 12:14:00 -04:00
Ashutosh Dixit
2d46ecc958 drm/xe/oa: Destroy the stream_lock mutex
The mutex allocated in xe_oa_stream_init() was never previously
destroyed. Do so now.

Fixes: e936f885f1 ("drm/xe/oa/uapi: Expose OA stream fd")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240628052125.1847989-1-ashutosh.dixit@intel.com
2024-07-01 11:12:49 -07:00
Lucas De Marchi
7dc10eff22 drm/xe/rtp: Fix out-of-bounds array access
Increment the counter before checking for number of rules, otherwise
when there's no XE_RTP_MATCH_OR an out-of-bounds access is done, as
reported by kasan:

	BUG: KASAN: global-out-of-bounds in rule_matches+0xb6d/0x11c0 [xe]
	Read of size 1 at addr ffffffffa0a50b70 by task systemd-udevd/243
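
The ordering issue can be shown with a small standalone sketch. The
types and the helper below are hypothetical stand-ins, not the code in
the driver; they only illustrate why the increment has to come before
the bounds check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rule kinds; DEMO_MATCH_OR separates alternative rule sets. */
enum demo_match_type {
	DEMO_MATCH_PLAIN,
	DEMO_MATCH_OR,
};

struct demo_rule {
	enum demo_match_type match_type;
};

/*
 * Rule i (with i < n_rules) failed to match. Return the index of the
 * next OR separator, or n_rules if there is none.
 *
 * Because ++i runs before the comparison, rules[i] is read only when
 * i < n_rules. Writing "i < n_rules && rules[++i]..." instead would
 * read rules[n_rules] when no OR separator follows.
 */
static size_t demo_skip_to_next_or(const struct demo_rule *rules,
				   size_t n_rules, size_t i)
{
	assert(i < n_rules);
	while (++i < n_rules && rules[i].match_type != DEMO_MATCH_OR)
		;
	return i;
}
```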

Fixes: dc72c52a42 ("drm/xe/rtp: Allow to OR rules")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240628161726.836734-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-01 10:49:19 -07:00
Michal Wajdeczko
411220808c drm/xe/pf: Restart VFs provisioning after GT reset
Any prior configurations pushed to the GuC are lost when the GT
is reset. Push again all non-empty VF configurations to the GuC
as part of the GuC reset procedure.

This will also help restore early manual provisioning, when the
PF was in the meantime suspended and then resumed.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-3-michal.wajdeczko@intel.com
2024-07-01 19:43:52 +02:00
Michal Wajdeczko
234670cea9 drm/xe/pf: Skip fair VFs provisioning if already provisioned
Our debugfs allows viewing and changing VFs' provisioning configs.

If we attempt to experiment with VFs provisioning before enabling
them, this early config will affect fair provisioning calculations,
and will also be overwritten, which is undesirable behavior.

To improve this, check if the VFs configs are empty (unprovisioned)
before starting the fair provisioning procedure.
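
A rough sketch of such a check, assuming a config counts as empty when
every resource in it is zero. The struct layout, its fields and both
helpers are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VF resources; the real config may hold other fields. */
struct demo_vf_config {
	uint64_t ggtt_size;
	uint32_t num_ctxs;
	uint32_t num_dbs;
};

/* A config is treated as unprovisioned when nothing was assigned. */
static bool demo_vf_config_is_empty(const struct demo_vf_config *c)
{
	return !c->ggtt_size && !c->num_ctxs && !c->num_dbs;
}

/*
 * Fair provisioning may run only if no VF already carries a config,
 * so an early manual setup is neither counted nor overwritten.
 */
static bool demo_can_fair_provision(const struct demo_vf_config *cfgs,
				    size_t num_vfs)
{
	assert(cfgs || !num_vfs);
	for (size_t i = 0; i < num_vfs; i++) {
		if (!demo_vf_config_is_empty(&cfgs[i]))
			return false;
	}
	return true;
}
```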

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-2-michal.wajdeczko@intel.com
2024-07-01 19:43:50 +02:00
Michal Wajdeczko
d2d5409786 drm/xe/pf: Remove inlined #ifdef CONFIG_PCI_IOV
We can remove #ifdef CONFIG_PCI_IOV in .c files if we provide
dummy replacement of the xe_pci_sriov_configure() function.
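
The general pattern looks roughly like the sketch below. All names are
hypothetical (DEMO_CONFIG_PCI_IOV stands in for the real Kconfig symbol),
and having the stub return 0 is an assumption made for illustration.

```c
#include <assert.h>

struct demo_pci_dev;	/* opaque stand-in for the PCI device type */

/* Header side: one declaration when enabled, a no-op stub otherwise. */
#ifdef DEMO_CONFIG_PCI_IOV
int demo_sriov_configure(struct demo_pci_dev *pdev, int num_vfs);
#else
static inline int demo_sriov_configure(struct demo_pci_dev *pdev, int num_vfs)
{
	(void)pdev;
	(void)num_vfs;
	return 0;	/* nothing to configure without SR-IOV support */
}
#endif

/* .c side: the caller needs no #ifdef of its own. */
static int demo_probe(struct demo_pci_dev *pdev, int num_vfs)
{
	assert(num_vfs >= 0);
	return demo_sriov_configure(pdev, num_vfs);
}
```

Keeping the conditional in the header lets the compiler still type-check
the call site in both configurations.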

Suggested-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240627104305.1477-1-michal.wajdeczko@intel.com
2024-07-01 18:01:31 +02:00