Commit Graph

1267942 Commits

Author SHA1 Message Date
Michal Wajdeczko
629df234bf drm/xe/pf: Introduce functions to configure VF thresholds
The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.
Add functions to allow configuration of these thresholds per VF.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-5-michal.wajdeczko@intel.com
2024-05-16 18:04:42 +02:00
Michal Wajdeczko
7aefee83fc drm/xe/guc: Add support for threshold KLVs in to_string() helper
Use MAKE_XE_GUC_KLV_THRESHOLDS_SET to generate missing conversion
of threshold KLV keys to string.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-4-michal.wajdeczko@intel.com
2024-05-16 18:04:41 +02:00
Michal Wajdeczko
b1ce52fbf6 drm/xe/guc: Introduce GuC KLV thresholds set
The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.

The available thresholds are defined in the GuC ABI as part of the
GuC VF Configuration KLVs.  Threshold configurations performed by
the PF driver and notifications sent by the GuC rely on the KLV keys,
which are not zero-based and might not guarantee continuity.

To simplify the driver code and eliminate the need to repeat very
similar code for each threshold, introduce the threshold set macro
that allows to generate required code based on unique threshold tag.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-3-michal.wajdeczko@intel.com
2024-05-16 18:04:39 +02:00
Michal Wajdeczko
e6946ea8fc drm/xe/guc: Add more KLV helper macros
In upcoming patches we will want to generate some of the KLV keys
from other macros. Add MAKE_GUC_KLV_{KEY|LEN} macros for that and
make sure they will correctly expand provided TAG parameter. Also
fix PREP_GUC_KLV_TAG to also work correctly within other macros.

Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-2-michal.wajdeczko@intel.com
2024-05-16 18:04:38 +02:00
Michal Wajdeczko
9aa8586063 drm/xe/pf: Implement pci_driver.sriov_configure callback
The PCI subsystem already exposes the "sriov_numvfs" attribute
that users can use to enable or disable SR-IOV VFs. Add custom
implementation of the .sriov_configure callback defined by the
pci_driver to perform additional steps, including fair VFs
provisioning with the resources, as required by our platforms.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> #v2
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506184121.2615-1-michal.wajdeczko@intel.com
2024-05-15 23:45:49 +02:00
Michal Wajdeczko
75fe5f3471 drm/xe/pf: Don't advertise support to enable VFs if not ready
Even if we have not enabled SR-IOV support using the platform
specific has_sriov flag, the hardware may still report SR-IOV
capability and the PCI layer may wrongly advertise driver support
to enable VFs.  Explicitly reset the number of supported VFs to
zero to avoid confusion.

Applications may read the /sys/bus/pci/devices/.../sriov_totalvfs
prior to enabling VFs using the sriov_numvfs to check if such an
operation is possible.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507165757.2835-1-michal.wajdeczko@intel.com
2024-05-15 22:31:52 +02:00
Matthew Brost
c8ff26b82c drm/xe: Only zap PTEs as needed
If PTEs are already invalidated no need to invalidate again.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514232325.84508-1-matthew.brost@intel.com
2024-05-15 08:07:57 -07:00
Jonathan Cavitt
b31cfb47b2 drm/xe/xe_guc_submit: Declare reset if banned or killed or wedged
Add an additional condition to the reset_status guc_exec_queue_op that
returns true if the exec queue has been banned or killed or wedged.  The
reset_status op is only used for exiting any xe_wait_user_fence_ioctl
that waits on an exec queue without timing out, so doing this will exit
the ioctl early in cases where the exec queue can no longer function,
such as after a GuC stop during a reset.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510194540.3246991-3-jonathan.cavitt@intel.com
2024-05-14 16:28:53 -07:00
Jonathan Cavitt
abdea2847a drm/xe/xe_guc_submit: Allow lr exec queues to be banned
LR queues currently don't get banned during a GT/GuC reset because they
lack a job.  Though they don't have a job to detect the reset status of,
it's still possible to tell when they should be banned by looking at the
LRC: if the LRC head and tail don't match, then the exec queue should be
banned and cleaned up.

This also requires swapping the usage of xe_sched_tdr_queue_imm with
xe_guc_exec_queue_trigger_cleanup, as the former is specific to non-lr
exec queues.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510194540.3246991-2-jonathan.cavitt@intel.com
2024-05-14 16:28:52 -07:00
Jonathan Cavitt
1564d411e1 drm/xe/xe_guc_submit: Fix exec queue stop race condition
Reorder the xe_sched_tdr_queue_imm and set_exec_queue_banned calls in
guc_exec_queue_stop.  This prevents a possible race condition between
the two events in which it's possible for xe_sched_tdr_queue_imm to
wake the ufence waiter before the exec queue is banned, causing the
ufence waiter to miss the banned state.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510194540.3246991-1-jonathan.cavitt@intel.com
2024-05-14 16:28:51 -07:00
Michal Wajdeczko
4071e0872f drm/xe/uc: Move GuC submission init to post hwconfig step
We shouldn't need anything from the GuC submission code until we
finish GuC initialization in post hwconfig step.

While around add diagnostic message if we fail uC init.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510203810.1952-3-michal.wajdeczko@intel.com
2024-05-14 23:19:27 +02:00
Michal Wajdeczko
3df01f5c72 drm/xe/uc: Reorder post hwconfig uC initialization step
We want to move the GuC submission initialization to the post
hwconfig step, but now this step is done too late as migration
initialization uses exec_queue that would crash due to a unset
exec_queue_ops. We can easily fix that by small function reorder.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510203810.1952-2-michal.wajdeczko@intel.com
2024-05-14 23:19:25 +02:00
Matthew Brost
04f4a70a18 drm/xe: Only use reserved BCS instances for usm migrate exec queue
The GuC context scheduling queue is 2 entires deep, thus it is possible
for a migration job to be stuck behind a fault if migration exec queue
shares engines with user jobs. This can deadlock as the migrate exec
queue is required to service page faults. Avoid deadlock by only using
reserved BCS instances for usm migrate exec queue.

Fixes: a043fbab7a ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240415190453.696553-2-matthew.brost@intel.com
Reviewed-by: Brian Welty <brian.welty@intel.com>
2024-05-14 13:12:27 -07:00
Himal Prasad Ghimiray
4c0be90e68 drm/xe: Fix the warning conditions
The maximum timeout display uses in xe_pcode_request is 3 msec, add the
warning in cases the function is misused with higher timeouts.

Add a warning if pcode_try_request is not passed the timeout parameter
greater than 0.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240508152216.3263109-3-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-14 11:26:40 -04:00
Himal Prasad Ghimiray
c81858eb52 drm/xe: Change pcode timeout to 50msec while polling again
Polling is initially attempted with timeout_base_ms enabled for
preemption, and if it exceeds this timeframe, another attempt is made
without preemption, allowing an additional 50 ms before timing out.

v2
- Rebase

v3
- Move warnings to separate patch (Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Fixes: 7dc9b92dcf ("drm/xe: Remove i915_utils dependency from xe_pcode.")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240508152216.3263109-2-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-14 11:26:40 -04:00
Lucas De Marchi
d1855d284e drm/xe: Move sw-only pcode initialization
Move it to xe_gt_init_early() that initializes the sw-only part for each
gt.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-13 21:21:13 -07:00
Lucas De Marchi
45b9066ec3 drm/xe: Move xe_force_wake_init_gt() inside gt initialization
xe_force_wake_init_gt() is a software-only initialization and doesn't
need to be called from xe_device_probe(). Move it to initialize
together with the gt.

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-13 21:21:13 -07:00
Lucas De Marchi
65c4de2a91 drm/xe: Move xe_gt_init_early() where it belongs
Early shall be early enough, stop doing other things with gt before it.
Now that xe_gt_init_early() doesn't need forcewake and doesn't depend on
the fake engine_mask initialization, move it where it belongs: it
doesn't need to be after hwconfig config anymore.

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-13 21:21:13 -07:00
Lucas De Marchi
402c014cbc drm/xe: Drop useless forcewake get/put
Forcewake used to be needed in xe_gt_init_early() since it was calling
xe_gt_topology_init(). That call was dropped in commit 4c47049d93
("drm/xe/guc: Fix missing topology init"), but the forcewake calls were
left behind. Remove them.

Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-13 21:21:13 -07:00
Lucas De Marchi
61549a2ee5 drm/xe: Drop __engine_mask
Not really used, it's just a copy of engine_mask, which already reads
the fuses to mark engines as available/not-available.

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-13 21:21:13 -07:00
Michal Wajdeczko
664de50cbf drm/xe: Fix xe_reg_sr.h
Prefer forward declarations over #include xe_reg_sr_types.h

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513084218.2084-5-michal.wajdeczko@intel.com
2024-05-13 21:36:52 +02:00
Michal Wajdeczko
38830bfe28 drm/xe: Fix xe_lrc.h
Prefer forward declarations over #include xe_lrc_types.h

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513084218.2084-4-michal.wajdeczko@intel.com
2024-05-13 21:36:51 +02:00
Michal Wajdeczko
c5d9c6690e drm/xe: Fix xe_guc_ads.h
We don't need to include xe_guc_ads_types.h here.
Use forward declaration instead.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513084218.2084-3-michal.wajdeczko@intel.com
2024-05-13 21:36:50 +02:00
Michal Wajdeczko
304aa805ee drm/xe: Fix xe_gt_throttle_sysfs.h
We don't need to include drm/drm_managed.h here.
We don't need to comment final #endif.
Also remove empty line at the end.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513084218.2084-2-michal.wajdeczko@intel.com
2024-05-13 21:36:49 +02:00
Michal Wajdeczko
c3203ca3b8 drm/xe: Rename few xe_args.h macros
To minimize the risk of future name collisions, rename macros to
always include the ARG or ARGS tag:

  DROP_FIRST to DROP_FIRST_ARG
  PICK_FIRST to FIRST_ARG
  PICK_LAST to LAST_ARG

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> #v2
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240508171000.1864-1-michal.wajdeczko@intel.com
2024-05-09 21:28:25 +02:00
Michal Wajdeczko
62010b3cd6 drm/xe: Move xe_gpu_commands.h file to instructions/
All other files with commands definitions are in instructions/
folder. Move xe_gpu_commands.h also there.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240508174856.1908-1-michal.wajdeczko@intel.com
2024-05-09 21:17:57 +02:00
Karthik Poosa
515f089723 drm/xe/hwmon: Remove unwanted write permission for currN_label
Change umode of currN_label from 0644 to 0444 as write permission
not needed for label.

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419125945.4085629-1-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-09 14:50:25 -04:00
Shuicheng Lin
205e5c4b20 drm/xe: Fix UBSAN shift-out-of-bounds failure
Here is the failure stack:
[   12.988209] ------------[ cut here ]------------
[   12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[   12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned int'
[   12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G     U             6.9.0-rc6+prerelease1158+ #19
[   12.988237] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 08/05/2022
[   12.988239] Call Trace:
[   12.988240]  <TASK>
[   12.988242]  dump_stack_lvl+0xd7/0xf0
[   12.988248]  dump_stack+0x10/0x20
[   12.988250]  ubsan_epilogue+0x9/0x40
[   12.988253]  __ubsan_handle_shift_out_of_bounds+0x10e/0x170
[   12.988260]  dma_resv_reserve_fences.cold+0x2b/0x48
[   12.988262]  ? ww_mutex_lock_interruptible+0x3c/0x110
[   12.988267]  drm_exec_prepare_obj+0x45/0x60 [drm_exec]
[   12.988271]  ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe]
[   12.988345]  vm_bind_ioctl_ops_execute+0x78/0x740 [xe]

It is caused by the value 0 of parameter num_fences in function
drm_exec_prepare_obj.  And lead to in function __rounddown_pow_of_two,
"0 - 1" causes the shift-out-of-bounds.

By design drm_exec_prepare_obj() should be called only when there are
fences to be reserved. If num_fences is 0, calling drm_exec_lock_obj()
is sufficient as was done in commit 9377de4cb3 ("drm/xe/vm: Avoid
reserving zero fences")

Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/all/24d4a9a9-c622-4f56-8672-21f4c6785476@amd.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240507130411.630361-1-shuicheng.lin@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-08 21:44:17 -07:00
Niranjana Vishwanathapura
fe0154cf82 drm/xe/xe2: Enable Indirect Ring State support for Xe2
Indirect Ring State is the recommended mode for Xe2 platforms,
enable it by default.

v2: Set has_indirect_ring_state to '1' instead of 'true'

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507224255.5059-5-niranjana.vishwanathapura@intel.com
2024-05-08 14:48:33 -07:00
Niranjana Vishwanathapura
7578c2f811 drm/xe: Dump Indirect Ring State registers
Dump INDIRECT_RING_STATE and RING_START_UDW registers.

v2: Add bspec reference

Bspec: 67137, 67138
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507224255.5059-4-niranjana.vishwanathapura@intel.com
2024-05-08 14:48:32 -07:00
Niranjana Vishwanathapura
d6219e1cd5 drm/xe: Add Indirect Ring State support
When Indirect Ring State is enabled, the Ring Buffer state and
Batch Buffer state are context save/restored to/from Indirect
Ring State instead of the LRC. The Indirect Ring State is a 4K
page mapped in global GTT at a 4K aligned address. This address
is programmed in the INDIRECT_RING_STATE register of the
corresponding context's LRC.

v2: Fix kernel-doc, add bspec reference
v3: Fix typo in commit text

Bspec: 67296, 67139
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507224255.5059-3-niranjana.vishwanathapura@intel.com
2024-05-08 14:48:30 -07:00
Niranjana Vishwanathapura
85cfc41257 drm/xe: Minor cleanup in LRC handling
Properly define register fields and remove redundant
lower_32_bits().

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507224255.5059-2-niranjana.vishwanathapura@intel.com
2024-05-08 14:48:29 -07:00
Bommu Krishnaiah
598dc939ed drm/xe/xe2: Add workaround 14021402888
This workaround applies to Graphics 20.01 as RCS engine workaround

Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240418111534.481568-1-krishnaiah.bommu@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-08 14:24:50 -07:00
Lucas De Marchi
ee72842306 drm/xe/ads: Use flexible-array
Zero-length arrays are deprecated and flexible arrays should be
used instead: https://www.kernel.org/doc/html/v6.9-rc7/process/deprecated.html#zero-length-and-one-element-arrays

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Closes: https://lore.kernel.org/r/202405051824.AmjAI5Pg-lkp@intel.com/
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506141917.205714-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-07 16:32:25 -07:00
Michal Wajdeczko
b7f6318a9c drm/xe: Fix xe_device.h
Some explicit includes are needed only from the xe_device.c.
And there is no need for redundant forward declarations.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507110959.2747-4-michal.wajdeczko@intel.com
2024-05-07 23:21:21 +02:00
Michal Wajdeczko
93dd6ad89c drm/xe: Don't rely on xe_force_wake.h to be included elsewhere
While xe_force_wake.h is now included from the xe_device.h, we
want to drop that include as we don't need it there. Explicitly
include xe_force_wake.h where needed.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507110959.2747-3-michal.wajdeczko@intel.com
2024-05-07 23:21:17 +02:00
Michal Wajdeczko
7348a9a112 drm/xe: Don't rely on xe_assert.h to be included elsewhere
While xe_assert.h is now included and used by the xe_force_wake.h,
we want to stop include xe_force_wake.h from xe_device.h as it's
not needed there.  Explicitly include xe_assert.h where needed.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507110959.2747-2-michal.wajdeczko@intel.com
2024-05-07 23:21:15 +02:00
Tejas Upadhyay
c18a5e3e61 drm/xe: skip error capture when exec queue is killed
When user closes exec queue soon after job submission,
we are generating error coredump. Instead check if
exec queue is killed during job timeout then skip
error coredump capture.

V2:
  - Just skip error capture - MattB

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430131229.2228809-1-tejas.upadhyay@intel.com
2024-05-07 11:43:08 -07:00
Francois Dugast
a4cb575d91 drm/xe/vm_doc: Fix some typos
Fix some typos and add / remove / change a few words to improve
readability and prevent some ambiguities.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506202950.109750-1-francois.dugast@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-07 12:45:39 -04:00
Michal Wajdeczko
5b882c1e5a drm/xe: Fix xe_mocs.h
We don't need to include <linux/types.h>.
We don't use struct xe_exec_queue here.
We should sort forward declarations.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506205254.2659-1-michal.wajdeczko@intel.com
2024-05-07 12:03:49 +02:00
Matthew Brost
50aec9665e drm/xe: Use ordered WQ for G2H handler
System work queues are shared, use a dedicated work queue for G2H
processing to avoid G2H processing getting block behind system tasks.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506034758.3697397-1-matthew.brost@intel.com
2024-05-06 10:42:51 -07:00
Janga Rahul Kumar
9fbd0adbcb drm/xe/mocs: Add debugfs node to dump mocs
This is useful to check mocs configuration. Tests/Tools can use
this debugfs entry to get mocs info.

v2: Address review comments. Change debugfs output style similar
to pat debugfs. (Lucas De Marchi)

v3: rebase.

v4: Address review comments. Use function pointer inside ops
struct. Update Test-with links. Remove usage of flags wherever
not required. (Lucas De Marchi)

v5: Address review comments. Move register defines. Modify mocs
info struct to avoid holes. (Luca De Marchi)

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Janga Rahul Kumar <janga.rahul.kumar@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240503193902.2056202-3-janga.rahul.kumar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-06 09:24:50 -07:00
Janga Rahul Kumar
72c7163f27 drm/xe: Relocate regs_are_mcr function
Relocate regs_are_mcr funciton to a higher position in the file
for improved visibility.

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Janga Rahul Kumar <janga.rahul.kumar@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240503193902.2056202-2-janga.rahul.kumar@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-06 09:24:49 -07:00
Nirmoy Das
a0862cf2fe drm/xe: Refactor default device atomic settings
The default behavior of device atomics depends on the
VM type and buffer allocation types. Device atomics are
expected to function with all types of allocations for
traditional applications/APIs. Additionally, in compute/SVM
API scenarios with fault mode or LR mode VMs, device atomics
must work with single-region allocations. In all other cases
device atomics should be disabled by default also on platforms
where we know device atomics doesn't on work on particular
allocations types.

v3: fault mode requires LR mode so only check for LR mode
    to determine compute API(Jose).
    Handle SMEM+LMEM BO's migration to LMEM where device
    atomics is expected to work. (Brian).
v2: Fix platform checks to correct atomics behaviour on PVC.

Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430162529.21588-6-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-06 18:14:11 +02:00
Nirmoy Das
a4b725767d drm/xe: Add function to check if BO has single placement
A new helper function xe_bo_has_single_placement() to check
if a BO has single placement.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430162529.21588-5-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-06 18:14:11 +02:00
Nirmoy Das
06e69a4249 drm/xe: Introduce has_device_atomics_on_smem device info
Add has_device_atomics_on_smem to specify that a device
supports device atomics on system memory. Currently XE2
supports this so set this for XE2.

v2: Set has_device_atomics_on_smem for all platform but
    PVC.

Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430162529.21588-4-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-06 18:14:10 +02:00
Nirmoy Das
e7192f0162 drm/xe: Move vm bind bo validation to a helper function
Move vm bind bo validation to a helper function to make the
xe_vm_bind_ioctl() more readable.

v2: Capture ret value of xe_vm_bind_ioctl_validate_bo(Matt B).
    Remove redundant coh_mode param.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430162529.21588-3-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-06 18:14:10 +02:00
Nirmoy Das
c462f81b69 drm/xe: Introduce has_atomic_enable_pte_bit device info
Add has_atomic_enable_pte_bit to specify that a device
has PTE_AE bit in its PTE feild. Currently XE2 and PVC
supports this so set this for those two.

This will help consolidate setting atomic access bit in PTE
logic which is spread between multiple files.

Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430162529.21588-2-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-06 18:14:10 +02:00
Rodrigo Vivi
e9c190b9b8 drm/xe: Demote CCS_MODE info to debug only
This information is printed in any gt_reset, which actually
occurs in any runtime resume, what can be so verbose in
production build. Let's demote it to debug only.

Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240503190331.6690-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-03 16:59:05 -04:00
Francois Dugast
7867541241 drm/xe/debugfs: Get a runtime_pm reference when setting wedged mode
This function is another entry point where it must be ensured that
the device resumes before operating on the GuC, so grab a runtime_pm
reference. This fixes inner xe_pm_runtime_get_noresume calls which
were previously failing.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240503082450.268335-1-francois.dugast@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-03 08:54:32 -04:00