Michal Wajdeczko
fe4b17c4f7
drm/xe/vf: Don't try to program MOCS if VF
...
VFs drivers don't have access to MOCS registers. It is a PF driver
responsibility to program MOCS according to the HW team guidelines.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240405133936.891-3-michal.wajdeczko@intel.com
2024-04-08 14:33:14 +02:00
Michal Wajdeczko
97515d0b3e
drm/xe/vf: Don't emit access to Global HWSP if VF
...
VFs can't access Global HWSP, don't emit questionable MI_FLUSH_DW
while processing a migration job.
Bspec: 52398
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240405133936.891-2-michal.wajdeczko@intel.com
2024-04-08 14:33:13 +02:00
Michal Wajdeczko
83787afe06
drm/xe/guc: Initialize GuC ID manager sooner
...
The GuC submission cleanup code may depend on the GuC ID manager,
thus we can't initialize it after registering a submission cleanup
action, as reverse cleanup sequence will destroy GuC ID manager
prior to a call to guc_submit_fini().
Move GuC ID manager initialization up, right after managed mutex
initialization, to have it available during guc_submit_fini().
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240406143946.979-2-michal.wajdeczko@intel.com
2024-04-08 11:22:18 +02:00
Michal Wajdeczko
104f7519db
drm/xe/guc: Use drm_device-managed version of mutex_init()
...
This is safer approach and will help resolve a cleanup ordering
conflict related to the GuC ID manager.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240406143946.979-1-michal.wajdeczko@intel.com
2024-04-08 11:22:17 +02:00
Michal Wajdeczko
1d7d997cd7
drm/xe: Drop xe_vm_assert_held() macro definition from xe_bo.h
...
It is already defined in xe_vm.h and shouldn't be duplicated.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240405113844.803-1-michal.wajdeczko@intel.com
2024-04-05 20:01:05 +02:00
Michal Wajdeczko
48651e18bb
drm/xe: Move PTE/PDE bit definitions to proper header
...
We already have dedicated header for GGTT/PPGTT definitions.
It's also cleaner to separate them from implementation macros.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Lucas De Marchi <lucas.demarchi@intel.com >
Cc: Matt Roper <matthew.d.roper@intel.com >
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240405123520.847-1-michal.wajdeczko@intel.com
2024-04-05 19:58:54 +02:00
Andrzej Hajda
788d2ad60d
drm/xe: fix multicast support for Xe_LP platforms
...
Xe_LP has six sublices per slice.
v2: fixed commit message and subject (Matt)
Bspec: 66696
Fixes: bde5d76785 ("drm/xe: Add helper macro to loop each DSS")
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240405-mcr_adlp-v2-1-2fd1e4325ef2@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-04-05 06:17:40 -07:00
Michal Wajdeczko
f73155654d
drm/xe/guc: Reuse code while debugging GuC params
...
There is no need to duplicate code to print GuC parameters.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240404155046.627-2-michal.wajdeczko@intel.com
2024-04-05 12:15:52 +02:00
Michal Wajdeczko
12f95f9900
drm/xe/guc: Prefer GT oriented logs for GuC messages
...
A platform can have more than one GuC, so we should use GT-oriented
logs to correctly identify the source of the message.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240404155046.627-1-michal.wajdeczko@intel.com
2024-04-05 12:15:50 +02:00
Bommu Krishnaiah
91b93fae17
drm/xe/xe_hw_engine_class_sysfs: use sysfs_emit() for attr's _show()
...
sprintf() is deprecated for sysfs, use preferred sysfs_emit() instead.
v2: used sysfs_emit instand of sprintf
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com >
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20231209235949.54524-3-krishnaiah.bommu@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-04-04 14:54:51 -04:00
Bommu Krishnaiah
a3c86b6d7b
drm/xe: prefer snprintf over sprintf
...
since the sprintf() function lacks built-in protection against buffer
overflows using the snprintf() function.
v2: Removed hard coded values and used sizeof()
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com >
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20231209235949.54524-2-krishnaiah.bommu@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-04-04 14:54:51 -04:00
Rodrigo Vivi
972d01d0e3
drm/xe: Protect devcoredump access after unbind
...
While we don't have the full flow protection when devcoredump
is accessed after device unbind. Let's at least for now
protect against null dereference:
[ 422.766508] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 423.119584] RIP: 0010:xe_vm_snapshot_free+0x30/0x180 [xe]
While at it, I also fixed a non-standard code-declaration block
on the similar function of xe_guc_submit.
v2: - Use IS_ERR_OR_NULL (Nirmoy)
- Expand to other functions
Cc: José Roberto de Souza <jose.souza@intel.com >
Cc: Nirmoy Das <nirmoy.das@intel.com >
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240403195044.239766-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-04-04 14:53:22 -04:00
Himal Prasad Ghimiray
34820967ae
drm/xe/xe_migrate: Cast to output precision before multiplying operands
...
Addressing potential overflow in result of multiplication of two lower
precision (u32) operands before widening it to higher precision
(u64).
-v2
Fix commit message and description. (Rodrigo)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240401175300.3823653-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-04-03 15:04:56 -04:00
Matthew Brost
37c15c4aae
drm/xe: Use ordered wq for preempt fence waiting
...
Preempt fences can sleep waiting for an exec queue suspend operation to
complete. If the system_unbound_wq is used for waiting and the number of
waiters exceeds max_active this will result in other users of the
system_unbound_wq getting starved. Use a device private work queue for
preempt fences to avoid starvation of the system_unbound_wq.
Even though suspend operations can complete out-of-order, all suspend
operations within a VM need to complete before the preempt rebind worker
can start. With that, use a device private ordered wq for preempt fence
waiting.
v2:
- Add comment about cleanup on failure (Matt R)
- Update commit message (Lucas)
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240401221913.139672-2-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-04-03 07:11:00 -07:00
Himal Prasad Ghimiray
9f18b55b6d
drm/xe/xe2: Add workaround 18033852989
...
This workaround applies to RCS engine's context, hence added as
LRC workaround.
v2
- Fix commit description as lrc workaround instead of engine.(Lucas)
v3
- COMMON_SLICE_CHICKEN1 is a masked register, add XE_REG_OPTION_MASKED
flag. (Matt)
BSPEC: 55899
Cc: Matt Roper <matthew.d.roper@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Signed-off-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240401163806.3821128-1-himal.prasad.ghimiray@intel.com
2024-04-02 12:11:41 -07:00
Lucas De Marchi
62742d1266
drm/xe: Normalize bo flags macros
...
The flags stored in the BO grew over time without following
much a naming pattern. First of all, get rid of the _BIT suffix that was
banned from everywhere else due to the guideline in
drivers/gpu/drm/i915/i915_reg.h that xe kind of follows:
Define bits using ``REG_BIT(N)``. Do **not** add ``_BIT`` suffix to the name.
Here the flags aren't for a register, but it's good practice to keep it
consistent.
Second divergence on names is the use or not of "CREATE". This is
because most of the flags are passed to xe_bo_create*() family of
functions, changing its behavior. However, since the flags are also
stored in the bo itself and checked elsewhere in the code, it seems
better to just omit the CREATE part.
With those 2 guidelines, all the flags are given the form
XE_BO_FLAG_<FLAG_NAME> with the following commands:
git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i \
-e "s/XE_BO_\([_A-Z0-9]*\)_BIT/XE_BO_\1/g" \
-e 's/XE_BO_CREATE_/XE_BO_FLAG_/g'
git grep -le "XE_BO_" -- drivers/gpu/drm/xe | xargs sed -i -r \
-e 's/XE_BO_(DEFER_BACKING|SCANOUT|FIXED_PLACEMENT|PAGETABLE|NEEDS_CPU_ACCESS|NEEDS_UC|INTERNAL_TEST|INTERNAL_64K|GGTT_INVALIDATE)/XE_BO_FLAG_\1/g'
And then the defines in drivers/gpu/drm/xe/xe_bo.h are adjusted to
follow the coding style.
Reviewed-by: Matthew Auld <matthew.auld@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240322142702.186529-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-04-02 10:33:57 -07:00
Lucas De Marchi
e27f8a45c8
drm/xe: Stop passing user flag to xe_bo_create_user()
...
It's quite redundant to pass XE_BO_CREATE_USER_BIT to
xe_bo_create_user() since the only difference of that function is to
force that flag. Stop passing the flag in the few cases that were
explicitly doing so.
Reviewed-by: Matthew Auld <matthew.auld@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240322142702.186529-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-04-02 10:32:54 -07:00
Himal Prasad Ghimiray
b15e653495
drm/xe/xe_devcoredump: Check NULL before assignments
...
Assign 'xe_devcoredump_snapshot *' and 'xe_device *' only if
'coredump' is not NULL.
v2
- Fix commit messages.
v3
- Define variables before code.(Ashutosh/Jose)
v4
- Drop return check for coredump_to_xe. (Jose/Rodrigo)
v5
- Modify misleading commit message. (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com >
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com >
Cc: José Roberto de Souza <jose.souza@intel.com >
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240328123739.3633428-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-03-29 11:28:43 -04:00
Karthik Poosa
345dadc4f6
drm/xe/hwmon: Add infra to support card power and energy attributes
...
Add infra to support card power and energy attributes through channel 0.
Package attributes will be now exposed through channel 1 rather than
channel 0 as shown below.
Channel 0 i.e power1/energy1_xxx used for card and
channel 1 i.e power2/energy2_xxx used for package power,energy attributes.
power1/curr1_crit and in0_input are moved to channel 1, i.e.
power2/curr2_crit and in1_input as these are available for package only.
This would be needed for future platforms where they might be
separate registers for package and card power and energy.
Each discrete GPU supported by Xe driver, would have a directory in
/sys/class/hwmon/ with multiple channels under it.
Each channel would have attributes for power, energy etc.
Ex: /sys/class/hwmon/hwmon2/power1_max
/power1_label
/energy1_input
/energy1_label
Attributes will have a label to get more description of it.
Labelling is as below.
power1_label/energy1_label - "card",
power2_label/energy2_label - "pkg".
v2: Fix checkpatch errors.
v3:
- Update intel-xe-hwmon documentation. (Riana, Badal)
- Rename hwmon card channel enum from CHANNEL_PLATFORM
to CHANNEL_CARD. (Riana)
v4:
- Remove unrelated changes from patch. (Anshuman)
- Fix typo in commit msg.
v5:
- Update commit message and intel-xe-hwmon documentation with "Xe"
instead of xe when using it as a name. (Rodrigo)
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com >
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240328175435.3870957-1-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-03-29 11:27:21 -04:00
Michal Wajdeczko
c54eb24f71
drm/xe: Refactor GT debugfs
...
We are abusing struct drm_info_list.data by storing there pointer
to the xe_gt, while it shouldn't be used for any device specific
data. Use recently introduced xe_gt_debugfs_simple_show() that
hides all details how to obtain the xe_gt pointer. This will also
remove the need for making copies of the struct drm_info_list
to get GT specific definitions.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://lore.kernel.org/r/20240214115756.1525-4-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240328162808.451-4-michal.wajdeczko@intel.com
2024-03-29 14:00:07 +01:00
Michal Wajdeczko
19b8f86f4a
drm/xe: Define helper for GT specific debugfs files
...
Many of our debugfs files are GT specific and require a pointer to
struct xe_gt to correctly show its content. Our initial approach
to use drm_info_list.data field to pass pointer not only requires
extra steps (like copying template per each GT) but also abuses
the rule that this data field should not be device specific.
Introduce helper function that will use xe_gt pointer stored at
parent directory level and use .data only to pass actual print
function that would expects xe_gt pointer as a parameter.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://lore.kernel.org/r/20240214115756.1525-3-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240328162808.451-3-michal.wajdeczko@intel.com
2024-03-29 14:00:06 +01:00
Michal Wajdeczko
aee9781f81
drm/xe: Store pointer to struct xe_gt in gt/ debugfs directory
...
Attributes added under 'gt/' directories may wish to use that
in case they can't obtain it from elsewhere.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://lore.kernel.org/r/20240214115756.1525-2-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240328162808.451-2-michal.wajdeczko@intel.com
2024-03-29 14:00:05 +01:00
Francois Dugast
ca83f9d201
drm/xe/uapi: Define topology types as indexes rather than masks
...
The topology type is an index (not a mask) so define the values
like other indexes instead of using powers of 2. This is also
to make clear that the next type can use value 3. This commit
does not change the existing values so it does not break
compatibility.
Cc: Lucas De Marchi <lucas.demarchi@intel.com >
Suggested-by: Matt Roper <matthew.d.roper@intel.com >
Signed-off-by: Francois Dugast <francois.dugast@intel.com >
Link: https://lore.kernel.org/intel-xe/20240327232317.GI718896@mdroper-desk1.amr.corp.intel.com/
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240328140243.7-1-francois.dugast@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-03-28 16:57:32 -07:00
Daniele Ceraolo Spurio
d62753a57d
drm/xe/gsc: Implement WA 14018094691
...
The WA states that we need to keep the primary GT powered up during GSC
load to allow the GSC FW to access its registers. We also need to make
sure that one of the registers is locked before starting the load.
v2: fix location of register def (Matt)
Bspec: 55928
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240326224456.518548-1-daniele.ceraolospurio@intel.com
2024-03-28 13:26:31 -07:00
Michal Wajdeczko
aed2c1d70a
drm/xe/pf: Add minimal support for VF_STATE_NOTIFY events
...
GuC will use VF_STATE_NOTIFY events to notify the PF about changes
of the VF state, in particular when a VF FLR was requested. Add
very minimal support for such events to avoid reporting errors due
to unexpected G2H. We will improve handling of these messages later.
While around also add few basic functions to control the VF state
(pause, resume, stop) as we will also exercise them soon.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240326191518.363-3-michal.wajdeczko@intel.com
2024-03-28 14:01:48 +01:00
Michal Wajdeczko
476f6c48d1
drm/xe/guc: Add VF_STATE_NOTIFY and VF_CONTROL to ABI
...
In upcoming patches the PF driver will add support to handle the
GUC2PF_VF_STATE_NOTIFY events and to send PF2GUC_VF_CONTROL request
messages. Add necessary definitions to our GuC firmware ABI header.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240326191518.363-2-michal.wajdeczko@intel.com
2024-03-28 14:01:42 +01:00
Michal Wajdeczko
0613834f3d
drm/xe/vf: Add proper detection of the SR-IOV VF mode
...
SR-IOV VF mode detection is based on testing VF capability bit on
the register that is accessible from both the PF and enabled VFs.
Bspec: 49904, 53227
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-4-michal.wajdeczko@intel.com
2024-03-28 13:45:37 +01:00
Michal Wajdeczko
d79c88c45d
drm/xe: Move SR-IOV probe to xe_device_probe_early()
...
SR-IOV mode detection requires access to the MMIO register and
this can be done now in xe_device_probe_early().
We can also drop explicit has_sriov parameter as this flag is now
already available from xe->info.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-3-michal.wajdeczko@intel.com
2024-03-28 13:45:35 +01:00
Michal Wajdeczko
451d261a6e
drm/xe: Separate pure MMIO init from VRAM checkout
...
We can setup root tile registers mapping at the same time as we
do early mapping of the entire MMIO BAR and keep mandatory VRAM
checkout as a separate step. This will allow us to perform SR-IOV
VF mode detection between those two steps using regular MMIO regs
access functions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matt Roper <matthew.d.roper@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240327182740.407-2-michal.wajdeczko@intel.com
2024-03-28 13:45:33 +01:00
Thomas Hellström
7ee7dd6f30
drm/xe: Move vma rebinding to the drm_exec locking loop
...
Rebinding might allocate page-table bos, causing evictions.
To support blocking locking during these evictions,
perform the rebinding in the drm_exec locking loop.
Also Reserve fence slots where actually needed rather than trying to
predict how many fence slots will be needed over a complete
wound-wait transaction.
v2:
- Remove a leftover call to xe_vm_rebind() (Matt Brost)
- Add a helper function xe_vm_validate_rebind() (Matt Brost)
v3:
- Add comments and squash with previous patch (Matt Brost)
Fixes: 24f947d58f ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects")
Fixes: 29f424eb87 ("drm/xe/exec: move fence reservation")
Cc: Matthew Auld <matthew.auld@intel.com >
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-5-thomas.hellstrom@linux.intel.com
2024-03-28 08:39:30 +01:00
Thomas Hellström
0453f17575
drm/xe: Make TLB invalidation fences unordered
...
They can actually complete out-of-order, so allocate a unique
fence context for each fence.
Fixes: 5387e865d9 ("drm/xe: Add TLB invalidation fence after rebinds issued from execs")
Cc: Matthew Brost <matthew.brost@intel.com >
Cc: <stable@vger.kernel.org > # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-4-thomas.hellstrom@linux.intel.com
2024-03-28 08:39:29 +01:00
Thomas Hellström
5a091aff50
drm/xe: Rework rebinding
...
Instead of handling the vm's rebind fence separately,
which is error prone if they are not strictly ordered,
attach rebind fences as kernel fences to the vm's resv.
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Cc: <stable@vger.kernel.org > # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-3-thomas.hellstrom@linux.intel.com
2024-03-28 08:39:28 +01:00
Thomas Hellström
4fc4899e86
drm/xe: Use ring ops TLB invalidation for rebinds
...
For each rebind we insert a GuC TLB invalidation and add a
corresponding unordered TLB invalidation fence. This might
add a huge number of TLB invalidation fences to wait for so
rather than doing that, defer the TLB invalidation to the
next ring ops for each affected exec queue. Since the TLB
is invalidated on exec_queue switch, we need to invalidate
once for each affected exec_queue.
v2:
- Simplify if-statements around the tlb_flush_seqno.
(Matthew Brost)
- Add some comments and asserts.
Fixes: 5387e865d9 ("drm/xe: Add TLB invalidation fence after rebinds issued from execs")
Cc: Matthew Brost <matthew.brost@intel.com >
Cc: <stable@vger.kernel.org > # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-2-thomas.hellstrom@linux.intel.com
2024-03-28 08:39:26 +01:00
Michal Wajdeczko
e6e7eff627
drm/xe/guc: Use GuC ID Manager in submission code
...
We are ready to replace private guc_ids management code with
separate GuC ID Manager that can be shared with upcoming SR-IOV
PF provisioning code.
Cc: Matthew Brost <matthew.brost@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-5-michal.wajdeczko@intel.com
2024-03-27 20:19:29 +01:00
Michal Wajdeczko
f4fb157cd0
drm/xe/kunit: Add basic tests for GuC context ID Manager
...
Before we switch-over submission code to use new GuC context ID
Manager, lets add some kunit tests to make sure that ID manager
works as expected.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-4-michal.wajdeczko@intel.com
2024-03-27 20:19:27 +01:00
Michal Wajdeczko
68fac8ab0f
drm/xe/guc: Introduce GuC context ID Manager
...
While we are already managing GuC IDs directly in GuC submission
code, using bitmap() for MLRC and ida() for SLRC, this code can't
be easily extended to meet additional requirements for SR-IOV use
cases, like limited number of IDs available on VFs, or ID range
reservation for provisioning VFs by the PF.
Add a separate component for managing GuC IDs, that will replace
existing ID management. Start with bitmap() based implementation
that could be optimized later based on perf data.
Cc: Matthew Brost <matthew.brost@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-3-michal.wajdeczko@intel.com
2024-03-27 20:19:25 +01:00
Michal Wajdeczko
f88beeed82
drm/xe/guc: Move GUC_ID_MAX definition to GuC ABI header
...
This macro represents GuC firmware capability and shall be defined
in the firmware ABI header. Move it to xe_guc_fwif.h file.
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240313221112.1089-2-michal.wajdeczko@intel.com
2024-03-27 20:19:23 +01:00
Michal Wajdeczko
59058f2af9
drm/xe/guc: Fix include guard for SR-IOV ABI
...
Use include guard macro name that follows naming used by the other
GuC ABI files.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240213214908.1481-1-michal.wajdeczko@intel.com
2024-03-27 12:11:48 +01:00
Michal Wajdeczko
7da3f561cb
drm/xe: Move HW GGTT definitions to dedicated file
...
It's better to keep all hardware GGTT definitions separated from
the driver code. It also helps to avoid duplicated definitions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matt Roper <matthew.d.roper@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240326131042.319-1-michal.wajdeczko@intel.com
2024-03-27 11:31:11 +01:00
Nirmoy Das
5dffaa1bb9
drm/xe: Create a helper function to init job's user fence
...
Refactor xe_sync_entry_signal so it doesn't have to
modify xe_sched_job struct instead create a new helper function
to set user fence values for a job.
v2: Move the sync type check to xe_sched_job_init_user_fence(Lucas)
Cc: Lucas De Marchi <lucas.demarchi@intel.com >
Cc: Matthew Auld <matthew.auld@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com >
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240321161142.4954-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-03-26 15:40:19 -07:00
Vinay Belgaumkar
4b217c7fa6
drm/xe/guc: Remove explicit shutdown of SLPC
...
SLPC shutdown is called in reset and suspend paths. In the reset
path, it is possible that the H2G call gets lost as GuC is in the
process of being reset. There is no value in stopping SLPC when
it will happen anyways.
In the suspend path, we disable communication with GuC, so there
is no need to explicitly shutdown SLPC.
v2: Rebase
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240325235602.1155486-1-vinay.belgaumkar@intel.com
2024-03-26 15:35:38 -07:00
Ravi Kumar Vodapalli
0bd25f78c4
drm/xe: Add new PCI IDs to DG2 platform
...
New PCI IDs are added in Bspec for DG2 platform, add them in driver
Bspec: 44477
Signed-off-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Signed-off-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240326103825.3832879-1-ravi.kumar.vodapalli@intel.com
2024-03-26 13:58:20 -07:00
Niranjana Vishwanathapura
cf03825bdd
drm/xe: Use FIELD_PREP for lrc descriptor
...
Use FIELD_PREP for setting lrc descriptor fields instead
of shifting values to fields.
v2: Use ULL macro variants
v3: Do not use FIELD_PREP for 1-bit values
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240322191455.7613-1-niranjana.vishwanathapura@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-03-26 09:17:13 -07:00
Lucas De Marchi
008aa86a09
drm/xe: Remove redundant functions to get xe
...
xe_device.h implements these helpers, just use them.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240321213818.72311-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-03-22 14:25:56 -07:00
Lucas De Marchi
35b22649eb
drm/xe: Fix END redefinition
...
mips declares an END macro in its headers so it can't be used without
namespace in a driver like xe.
Instead of coming up with a longer name, just remove the macro and
replace its use with 0 since it's still clear what that means:
set_offsets() was already using that implicitly when checking the data
variable.
Reported-by: Guenter Roeck <linux@roeck-us.net >
Closes: http://kisskb.ellerman.id.au/kisskb/buildresult/15143996/
Tested-by: Guenter Roeck <linux@roeck-us.net >
Reviewed-by: Jani Nikula <jani.nikula@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240322145037.196548-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-03-22 11:27:58 -07:00
Daniele Ceraolo Spurio
b4abeb5545
drm/xe/guc: Check error code when initializing the CT mutex
...
The initialization via drmm_mutex_init can fail, so we need to check the
return code and escalate the failure.
The mutex initialization has been moved after all the other init steps
that can't fail, so we're always guaranteed to have those done and don't
have to check in the cleanup code.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com >
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240321195512.274210-1-daniele.ceraolospurio@intel.com
2024-03-22 10:22:20 -07:00
Vinay Belgaumkar
c04b8aaeb4
drm/xe/guc: Add some failure checks
...
Return failures from pc_adjust_freq_bounds.
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240321191219.243583-1-vinay.belgaumkar@intel.com
2024-03-22 10:22:08 -07:00
José Roberto de Souza
8f6444e1d1
drm/xe: Nuke EXEC_QUEUE_FLAG_PERSISTENT
...
This is a left over of commit f1a9abc0cf ("drm/xe/uapi: Remove support for persistent exec_queues").
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com >
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-3-jose.souza@intel.com
2024-03-22 08:08:48 -07:00
José Roberto de Souza
e5f661bb56
drm/xe/devcoredump: Print errno if VM snapshot was not captured
...
My testing machine has only 8GB of RAM and while running piglit tests
I can reach the OOM cache in xe_vm_snapshot_capture() snap allocaiton
sometimes.
So to differentiate the OOM from race between capture and UMDs
unbinbind VMs here I'm adding a '[0].error: -12' to devcoredump.
v2:
- fix returned errno values
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com >
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-2-jose.souza@intel.com
2024-03-22 08:08:47 -07:00
José Roberto de Souza
241dea2101
drm/xe: Make devcoredump VM error state print consistent
...
This makes VM error consistent with [x].length and [x].data.
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com >
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240307135229.41973-1-jose.souza@intel.com
2024-03-22 08:08:46 -07:00