Commit Graph

100705 Commits

Author SHA1 Message Date
Matthew Brost
e3828ebf6c drm/xe: Add define WQ_HEADER_SIZE
Previously used a a magic '+ 3', use define instead.

Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Matthew Brost
8d7a91fe58 drm/xe: Remove ct->fence_context
This is unused, remove it.

Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Matthew Brost
757d9fdfe3 drm/xe: Remove XE_GUC_CT_SELFTEST
XE_GUC_CT_SELFTEST enabled a debugfs entry to which ran a very simple
selftest ensuring the GuC CT code worked. This was added before the
kunit framework was available and before submissions were working too.
This test isn't worth porting over to the kunit frame as if the GuC CT
didn't work, literally almost nothing would work so just remove this.

Suggested-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Matt Roper
fe58a2432b drm/xe/mtl: Reduce Wa_14018575942 scope to the CCS engine
The MTL version of Wa_14018575942 has been updated to suggest only
applying the register change on the CCS engine.

Note that DG2 and PVC have a functionally equivalent workaround with
Wa_18018781329; for now that one is still applying to all engines,
although we'll keep an eye on it in case it changes to be CCS-specific
too.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20230728175601.2343755-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Rodrigo Vivi
d87c424afa drm/xe: Ensure memory eviction on s2idle.
On discrete cards we cannot allow the pci subsystem to skip
the regular suspend and we need to unblock the d3cold.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Rodrigo Vivi
a32d82b4cf drm/xe: Only init runtime PM after all d3cold config is in place.
We cannot allow runtime pm suspend after we configured the
d3cold capable and threshold.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:16 -05:00
Rodrigo Vivi
bba2ec4144 drm/xe: Fix the runtime_idle call and d3cold.allowed decision.
According to Documentation/power/runtime_pm.txt:

int pm_runtime_put(struct device *dev);
    - decrement the device's usage counter; if the result is 0 then run
      pm_request_idle(dev) and return its result

int pm_runtime_put_autosuspend(struct device *dev);
    - decrement the device's usage counter; if the result is 0 then run
      pm_request_autosuspend(dev) and return its result

We need to ensure that the idle function is called before suspending
so we take the right d3cold.allowed decision and respect the values
set on vram_d3cold_threshold sysfs. So we need pm_runtime_put()
instead of pm_runtime_put_autosuspend().

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Tested-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Rodrigo Vivi
e07aa91316 drm/xe: Move d3cold_allowed decision all together.
And let's use the VRAM threshold to keep d3cold temporarily disabled.

With this we have the ability to run D3Cold experiments just by
touching the vram_d3cold_threshold sysfs entry.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Rodrigo Vivi
f026520367 drm/xe: Only set PCI d3cold_allowed when we are really allowing.
First of all it was strange to see:
if (allowed) {
...
} else {
   D3COLD_ENABLE
}

But besides this misalignment, let's also use the pci
d3cold_allowed useful to us and know that we are not really
allowing d3cold.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Himal Prasad Ghimiray
8f3013e0b2 drm/xe: Introduce fault injection for gt reset
To trigger gt reset failure:
 echo 100 >  /sys/kernel/debug/dri/<cardX>/fail_gt_reset/probability
 echo 2 >  /sys/kernel/debug/dri/<cardX>/fail_gt_reset/times

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Himal Prasad Ghimiray
4f027e304a drm/xe: Notify Userspace when gt reset fails
Send uevent in case of gt reset failure. This intimation can be used by
userspace monitoring tool to do the device level reset/reboot
when GT reset fails. udevadm can be used to monitor the uevents.

v2:
- Support only gt failure notification (Rodrigo)

v3
- Rectify the comments in header file.

v4
- Use pci kobj instead of drm kobj for notification.(Rodrigo)
- Cleanup (Badal)

v5
- Add tile id and gt id as additional info provided by uevent.
- Provide code documentation for the uevent. (Rodrigo)

Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Rodrigo Vivi
063e09af6e drm/xe: Invert mask and val in xe_mmio_wait32.
The order: 'offset, mask, val'; is more common in other
drivers and in special in i915, where any dev could copy
a sequence and end up with unexpected behavior.

Done with coccinelle:
@rule1@
expression gt, reg, val, mask, timeout, out, atomic;
@@
- xe_mmio_wait32(gt, reg, val, mask, timeout, out, atomic)
+ xe_mmio_wait32(gt, reg, mask, val, timeout, out, atomic)

spatch -sp_file mmio.cocci *.c *.h compat-i915-headers/intel_uncore.h \
       --in-place

v2: Rebased after changes on xe_guc_mcr usage of xe_mmio_wait32.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Rodrigo Vivi
f83a30f466 drm/xe: Fix an invalid locking wait context bug
We cannot have spin locks around xe_irq_reset, since it will
call the intel_display_power_is_enabled() function, and
that needs a mutex lock. Hence causing the undesired
"[ BUG: Invalid wait context ]"

We cannot convert i915's power domain lock to spin lock
due to the nested dependency of non-atomic context waits.

So, let's move the xe_irq_reset functions from the
critical area, while still ensuring that we are protecting
the irq.enabled and ensuring the right serialization
in the irq handlers.

v2: On the first version, I had missed the fact that
irq.enabled is checked on the xe/display glue layer,
and that i915 display code is actually using the irq
spin lock properly. So, this got changed to a version
suggested by Matthew Auld.

v3: do not use lockdep_assert for display glue.
    do not save restore irq from inside IRQ or we can
    get bogus irq restore warnings

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/463
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Lucas De Marchi
fda48d15a4 drm/xe: Sort xe_regs.h
Sort it by register address to make it easy to update when needed.

v2: Do not create exception for registers with same functionality.
Always sort it.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-11-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Lucas De Marchi
c0d6b6163f drm/xe: Carve out top of DSM as reserved
Top of DSM contains the WOPCM where kernel driver shouldn't access as
it contains data from other HW agents. Carve it out from the stolen
memory. On a MTL system, the output now matches the expected values:

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-10-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:14 -05:00
Lucas De Marchi
58052eb70c drm/xe: Fix MTL+ stolen memory mapping
Based on commit 8d8d062be6 ("drm/i915/mtl: Fix MTL stolen memory GGTT
mapping"). For stolen on MTL and beyond, the address in the PTE is the
offset from DSM base. While at it, update the comments explaining each
part of the calculation.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-9-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:05 -05:00
Lucas De Marchi
b23ebae7ab drm/xe: Set PTE_DM bit for stolen on MTL
Integrated graphics 1270 and beyond should set the PTE_LM bit in the PTE
when it's stolen memory. Add a new function, xe_bo_is_stolen_devmem(),
and use it when encoding the PTE.

In some places in the spec the PTE bit is called "Local Memory",
abbreviated as LM, and in others it's called "Device Memory" (DM). Since
we moved away from "Local Memory" and preferred the "vram" terminology,
also rename the macros as DM to follow the name of the new function.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:05 -05:00
Lucas De Marchi
937b4be72b drm/xe: Decouple vram check from xe_bo_addr()
The output arg is_vram in xe_bo_addr() is unused by several callers.
It's also not what the function is mainly doing. Remove the argument and
let the interested callers to call xe_bo_is_vram().

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:05 -05:00
Lucas De Marchi
621c1fbd9b drm/xe: Remove vma arg from xe_pte_encode()
All the callers pass a NULL vma, so the buffer is always the BO. Remove
the argument and the side effects of dealing with it.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:04 -05:00
Daniele Ceraolo Spurio
43b5d81e04 drm/xe: fix mcr semaphore locking for MTL
in commit 81593af6c8 ("drm/xe: Convert xe_mmio_wait32 to us so we can
stop using wait_for_us.") the mcr semaphore register read was
accidentally switched from waiting for the register to go to 1 to
waiting for the register to go to 0, so we need to flip it back.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:38:53 -05:00
Lucas De Marchi
0e34fdb4a0 drm/xe: Fix checking for unset value
Commit 3743040261 ("drm/xe: NULL binding implementation") introduced
the NULL binding implementation, but left a case in which the out value
is_vram is not set and the caller will use whatever was on stack.
Eventually the is_vram out could be removed, but this should at least
fix the current bug.

Fixes: 3743040261 ("drm/xe: NULL binding implementation")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230726160708.3967790-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:38:30 -05:00
Matthew Auld
6aa26f6eb8 drm/xe/engine: add missing rpm for bind engines
Bind engines need to use the migration vm, however we don't have any rpm
for such a vm, otherwise the kernel would prevent rpm suspend-resume.
There are two issues here, first is the actual engine create which needs
to touch the lrc, but since that is in VRAM we trigger loads of missing
mem_access asserts. The second issue is when destroying the actual
engine, which requires GuC CT to deregister the context.

v2 (Rodrigo):
  - Just use ENGINE_FLAG_VM as the indicator that we need to hold an rpm
    ref. This also handles the case in xe_vm_create() where we create
    default bind engines.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/499
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/504
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:55 -05:00
Matthew Brost
70748acb7f drm/xe: Signal out-syncs on VM binds if no operations
If no operations are generated for VM binds the out-syncs must still be
signaled.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:55 -05:00
Matthew Brost
342206b7cc drm/xe: Always use xe_vm_queue_rebind_worker helper
Do not queue the rebind worker directly, rather use the helper
xe_vm_queue_rebind_worker. This ensures we use the correct work queue.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:55 -05:00
Rodrigo Vivi
c8dc154648 drm/xe: Invert guc vs execlists parameters and info.
The module parameter should reflect the name of the optional,
experimental and unsafe option, rather than the default one.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:37:54 -05:00
Rodrigo Vivi
c856cc138b drm/xe/uapi: Remove XE_QUERY_CONFIG_FLAGS_USE_GUC
This config is the only real one. If execlist remains in the
code it will forever be experimental and we shouldn't maintain
an uapi like that for that experimental piece of code that
should never be used by real users.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:37:54 -05:00
Matthew Auld
c00ce7f223 drm/xe: fully turn on small-bar support
This allows vram_size > io_size, instead of just clamping the vram size
to the BAR size, now that the driver supports it.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matthew Auld
cd928fced9 drm/xe/uapi: add the userspace bits for small-bar
Mostly the same as i915. We add a new hint for userspace to force an
object into the mappable part of vram.

We also need to tell userspace how large the mappable part is. In Vulkan
for example, there will be two vram heaps for small-bar systems. And
here the size of each heap needs to be known. Likewise the used/avail
tracking needs to account for the mappable part.

We also limit the available tracking going forward, such that we limit
to privileged users only, since these values are system wide and are
technically considered an info leak.

v2 (Maarten):
  - s/NEEDS_CPU_ACCESS/NEEDS_VISIBLE_VRAM/ in the uapi. We also no
    longer require smem as an extra placement. This is more flexible,
    and lets us use this for clear-color surfaces, since we need CPU access
    there but we don't want to attach smem, since that effectively disables
    CCS from kernel pov.
  - Reject clear-color CCS buffers where NEEDS_VISIBLE_VRAM is not set,
    instead of migrating it behind the scenes.
v3 (José):
  - Split the changes that limit the accounting for perfmon_capable()
    into a separate patch.
  - Use XE_BO_CREATE_VRAM_MASK.
v4 (Gwan-gyeong Mun):
  - Add some kernel-doc for the query bits.
v5:
  - One small kernel-doc correction. The cpu_visible_size and
    corresponding used tracking are always zero for non
    XE_MEM_REGION_CLASS_VRAM.
v6:
  - Without perfmon_capable() it likely makes more sense to report as
    zero, instead of reporting as used == total size. This should give
    similar behaviour as i915 which rather tracks free instead of used.
  - Only enforce NEEDS_VISIBLE_VRAM on rc_ccs_cc_plane surfaces when the
    device is actually small-bar.

Testcase: igt/tests/xe_query
Testcase: igt/tests/xe_mmap@small-bar
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matthew Auld
6a024f1bfd drm/xe/bo: support tiered vram allocation for small-bar
Add the new flag XE_BO_NEEDS_CPU_ACCESS, to force allocating in the
mappable part of vram. If no flag is specified we do a topdown
allocation, to limit the chances of stealing the precious mappable part,
if we don't need it. If this is a full-bar system, then this all gets
nooped.

For kernel users, it looks like xe_bo_create_pin_map() is the central
place which users should call if they want CPU access to the object, so
add the flag there.

We still need to plumb this through for userspace allocations. Also it
looks like page-tables are using pin_map(), which is less than ideal. If
we can already use the GPU to do page-table management, then maybe we
should just force that for small-bar.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matt Roper
2a6d871bd9 drm/xe: xe_engine_create_ioctl should check gt_count, not tile_count
Platforms like MTL only have a single tile, but multiple GTs.
Ensure XE_ENGINE_CREATE accepts engine creation on gt1 on such
platforms.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230725003433.1992137-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matt Roper
7a060d786c drm/xe/mtl: Map PPGTT as CPU:WC
On MTL and beyond, the GPU performs non-coherent accesses to the PPGTT
page tables.  These page tables should be mapped as CPU:WC.

Removes CAT errors triggered by xe_exec_basic@once-basic on MTL:

   xe 0000:00:02.0: [drm:__xe_pt_bind_vma [xe]] Preparing bind, with range [1a0000...1a0fff) engine 0000000000000000.
   xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]] 1 entries to update
   xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]]  0: Update level 3 at (0 + 1) [0...8000000000) f:0
   xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
   xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
   xe 0000:00:02.0: [drm] Timedout job: seqno=4294967169, guc_id=2, flags=0x4

v2:
 - Rename to XE_BO_PAGETABLE to make it more clear that this BO is the
   pagetable itself, rather than just being bound in the PPGTT.  (Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/r/20230725003433.1992137-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:54 -05:00
Matthew Auld
9700a1df0a drm/xe: add lockdep annotation for xe_device_mem_access_put()
The main motivation is with d3cold which will make the suspend and
resume callbacks even more scary, but is useful regardless. We already
have the needed annotation on the acquire side with
xe_device_mem_access_get(), and by adding the annotation on the release
side we should have a lot more confidence that our locking hierarchy is
correct.

v2:
  - Move the annotation into both callbacks for better symmetry. Also
    don't hold over the entire mem_access_get(); we only need to lockep
    to understand what is being held upon entering mem_access_get(), and
    how that matches up with locks in the callbacks.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Brost
7ead331564 drm/xe: Use migrate engine for page fault binds
We must use migrate engine for page fault binds in order to avoid a
deadlock as the migrate engine has a reserved BCS instance which cannot
be stuck on a fault. To use the migrate engine the engine argument to
xe_migrate_update_pgtables must be NULL, this was incorrectly wired up
so vm->eng[tile_id] was always being used. Fix this.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Brost
a4cc60a55f drm/xe: Only alloc userptr part of xe_vma for userptrs
Only alloc userptr part of xe_vma for userptrs, this will save on space
in the common BO case.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Brost
eae553cbe0 drm/xe: Combine destroy_cb and destroy_work in xe_vma into union
The callback kicks the worker thus mutually exclusive execution,
combining saves a bit of space in xe_vma.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Brost
63412a5a67 drm/xe: Change tile masks from u64 to u8
This will save us a few bytes in the xe_vma structure.

v2: Use hweight8 rather than hweight_long (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Brost
3daf694ccf drm/xe: Replace list_del_init with list_del for userptr.invalidate_link cleanup
This list isn't used again, list_del is the proper call.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Brost
1655c893af drm/xe: Reduce the number list links in xe_vma
Combine the userptr, rebind, and destroy links into a union as
the lists these links belong to are mutually exclusive.

v2: Adjust which lists are combined (Thomas H)
v3: Add kernel doc why this is safe (Thomas H), remove related change
of list_del_init -> list_del (Rodrigo)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Brost
8f33b4f054 drm/xe: Avoid doing rebinds
If we dont change page sizes we can avoid doing rebinds rather just do a
partial unbind. The algorithm to determine its page size is greedy as we
assume all pages in the removed VMA are the largest page used in the
VMA.

v2: Don't exceed 100 lines
v3: struct xe_vma_op_unmap remove in different patch, remove XXX comment

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Matthew Brost
3188c0f4c8 drm/xe: Remove xe_vma_op_unmap
xe_vma_op_unmap isn't used, remove it.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Matthew Brost
fd84041d09 drm/xe: Make bind engines safe
We currently have a race between bind engines which can result in
corrupted page tables leading to faults.

A simple example:
bind A 0x0000-0x1000, engine A, has unsatisfied in-fence
bind B 0x1000-0x2000, engine B, no in-fences
exec A uses 0x1000-0x2000

Bind B will pass bind A and exec A will fault. This occurs as bind A
programs the root of the page table in a bind job which is held up by an
in-fence. Bind B in this case just programs a leaf entry of the
structure.

To fix use range-fence utility to track cross bind engine conflicts. In
the above example bind A would insert an dependency into the range-fence
tree with a key of 0x0-0x7fffffffff, bind B would find that dependency
and its bind job would scheduled behind the unsatisfied in-fence and
bind A's job.

Reviewed-by: Maarten Lankhorst<maarten.lankhorst@linux.intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Thomas Hellström
845f64bdbf drm/xe: Introduce a range-fence utility
Add generic utility to track range conflicts signaled by a dma-fence.
Tracking implemented via an interval tree. An example use case being
tracking conflicts for pending (un)binds from multiple bind engines. By
being generic ths idea would this could moved to the DRM level and used
in multiple drivers for similar problems.

v2: Make interval tree functions static (CI)
v3: Remove non-static cleanup function (CI)

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Francois Dugast
0043a3e8a1 drm/xe/execlist: Log when using execlist submission
Make explicit in the log that execlist submission is used to prevent from
silently using it over GuC submission.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Francois Dugast
72e8d73b71 drm/xe: Cleanup style warnings and errors
Fix 6 errors and 20 warnings reported by checkpatch.pl.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Francois Dugast
ea82d5aab5 drm/xe/execlist: Remove leftover printk messages
Those look like leftover debug and are not even being used. If they were
real debug/info, they should be using the drm helpers.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:52 -05:00
Francois Dugast
955c09e2cc drm/xe: Rely on kmalloc/kzalloc log message
Those messages are unnecessary because a generic message is already
produced in case of allocation failure. Besides, this also removes a
misuse of the XE_IOCTL_DBG macro.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:51 -05:00
Lucas De Marchi
4d18eac032 drm/xe: Use FIELD_PREP/FIELD_GET for tile id encoding
Use FIELD_PREP()/FIELD_GET() to encode the tile id into flags. Besides
protecting for eventual overflow it also makes it easier to see a new
flag can't be added as BIT(7).

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230718193924.3084759-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:51 -05:00
Lucas De Marchi
0d39b6daa5 drm/xe: Normalize XE_VM_FLAG* names
Rename XE_VM_FLAGS_64K to XE_VM_FLAG_64K to follow the other names and
s/GT/TILE/ that got missed in commit 08dea76745 ("drm/xe: Move
migration from GT to tile").

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20230718193924.3084759-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:37 -05:00
Matthew Auld
ee82d2da9c drm/xe: add missing bulk_move reset
It looks like bulk_move is set during object construction, but is only
removed on object close, however in various places we might not yet have
an actual fd to close, like on the error paths for the gem_create ioctl,
and also one internal user for the evict_test_run_gt() selftest. Try to
handle those cases by manually resetting the bulk_move. This should
prevent triggering:

WARNING: CPU: 7 PID: 8252 at drivers/gpu/drm/ttm/ttm_bo.c:327
ttm_bo_release+0x25e/0x2a0 [ttm]

v2 (Nirmoy):
  - It should be safe to just unconditionally call
    __xe_bo_unset_bulk_move() in most places.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:37 -05:00
Matthew Auld
5a142f9c67 drm/xe/selftests: restart GT after xe_bo_restore_kernel()
Test seems to be failing badly after calling xe_bo_restore_kernel().
Taking a snapshot of the CTB and copying back a potentially old version
seems risky, depending on what might have been inflight. Also it seems
snapshotting the ADS object and copying back results in serious
breakage. Normally when calling xe_bo_restore_kernel() we always fully
restart the GT, which re-intializes such things.  We could potentially
skip saving and restoring such objects in xe_bo_evict_all() however
seems quite fragile not to also restart the GT. Try to do that here by
triggering a GT reset.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:37 -05:00