Ensure we can safely take a ref of the exec queue's last fence from the
context of resuming jobs from the hw engine group. The locking requirements
differ from the general case, hence the introduction of this new function.
v2: Add kernel doc, rework the code to prevent code duplication
v3: Fix kernel doc, remove now unnecessary lockdep variants (Matt Brost)
v4: Remove new put function (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-7-francois.dugast@intel.com
Add helpers to safely add and delete the exec queues attached to a hw
engine group, and make use them at the time of creation and destruction of
the exec queues. Keeping track of them is required to control the
execution mode of the hw engine group.
v2: Improve error handling and robustness, suspend exec queues created in
fault mode if group in dma-fence mode, init queue link (Matt Brost)
v3: Delete queue from hw engine group when it is destroyed by the user,
also clean up at the time of closing the file in case the user did
not destroy the queue
v4: Use correct list when checking if empty, do not add the queue if VM
is in xe_vm_in_preempt_fence_mode (Matt Brost)
v5: Remove unrelated newline, add checks and asserts for group, unwind on
suspend failure (Matt Brost)
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-4-francois.dugast@intel.com
A xe_hw_engine_group is a group of hw engines. Two hw engines belong to
the same xe_hw_engine_group if one hw engine cannot make progress while
the other is stuck on a page fault.
Typically, hw engines of the same group share some resources such as EUs,
but this really depends on the hardware configuration of the platforms.
The simple engines partitioning proposed here might be too conservative
but is intended to work for existing platforms. It can be optimized later
if more sets of independent engines are identified.
The hw engine groups are intended to be used in the context of faulting
long-running jobs submissions.
v2: Move to own files, improve error handling (Matt Brost)
v3: Fix build issue reported by CI, improve commit message (Matt Roper)
v4: Fix kernel doc
v5: Add switch case for XE_ENGINE_CLASS_OTHER
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-2-francois.dugast@intel.com
When steering MCR register ranges of type "DSS," the group_id and
instance_id values are calculated by dividing the DSS pool according to
the size of a gslice or cslice, depending on the platform. These values
haven't changed much on past platforms, so we've been able to hardcode
the proper divisor so far. However the layout may not be so fixed on
future platforms so the proper, future-proof way to determine this is by
using some of the attributes from the GuC's hwconfig table. The
hwconfig has two attributes reflecting the architectural maximum slice
and subslice counts (i.e., before any fusing is considered) that can be
used for the purposes of calculating MCR steering targets.
If the hwconfig is lacking the necessary values (which should only be
possible on older platforms before these attributes were added), we can
still fall back to the old hardcoded values. Going forward the hwconfig
is expected to always provide the information we need on newer
platforms, and any failure to do so will be considered a bug in the
firmware that will prevent us from switching to the buggy firmware
release.
It's worth noting that over time GuC's hwconfig has provided a couple
different keys with similar-sounding descriptions. For our purposes
here, we only trust the newer key "70" which has supplanted the
similarly-named key "2" that existed on older platforms.
Bspec: 73210
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815172602.2729146-4-matthew.d.roper@intel.com
Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for
the display side. This resolves the current conflict for the
enable_display module parameter and allows further pending refactors.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
xe_gt_enable_host_l2_vram() is reading the XE2_GAMREQSTRM_CTRL register
that is currently missing the MCR annotation. However, just adding the
annotation doesn't work as this function is called before MCR handling
is initialized in xe_gt_mcr_init().
xe_gt_enable_host_l2_vram() is used to implement WA 16023588340 that
needs to be done as early as possible during initialization in order to
be effective since the MMIO writes impact it. In the failure scenario,
driver would simply not be able to bind successfully.
Moving xe_gt_enable_host_l2_vram() later, after MCR initialization is
done, only incurs a few additional HW accesses, particularly when
loading GuC for hwconfig. Binding/unbinding the driver 100 times in BMG
still works so it should be ok to start handling the WA a little bit
later. This is sufficient to allow adding the MCR annotation to
XE2_GAMREQSTRM_CTRL.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-2-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The different approach used by xe regarding the initialization of
display HW has been proved a great addition for early driver bring up:
core xe can be tested without having all the bits sorted out on the
display side.
On the other hand, the approach exposed by i915-display is to *actively*
disable the display by programming it if needed, i.e. if it was left
enabled by firmware. It also has its use to make sure the HW is actually
disabled and not wasting power.
However having both the way it is in xe doesn't expose a good interface
wrt module params. From modinfo:
disable_display:Disable display (default: false) (bool)
enable_display:Enable display (bool)
Rename enable_display to probe_display to try to convey the message that
the HW is being touched and improve the module param description. To
avoid confusion, the enable_display is renamed everywhere, not only in
the module param. New description for the parameters:
disable_display:Disable display (default: false) (bool)
probe_display:Probe display HW, otherwise it's left untouched (default: true) (bool)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240813141931.3141395-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Early in the development of Xe we identified an issue with SVG state
handling on DG2 and MTL (and later on Xe2 as well). In
commit 72ac304769 ("drm/xe: Emit SVG state on RCS during driver load
on DG2 and MTL") and commit fb24b858a2 ("drm/xe/xe2: Update SVG state
handling") we implemented our own workaround to prevent SVG state from
leaking from context A to context B in cases where context B never
issues a specific state setting.
The hardware teams have now created official workaround Wa_14019789679
to cover this issue. The workaround description only requires emitting
3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction
that would potentially remain unset by a context B, but still cause
notable issues if unwanted values were inherited from context A.
However since we already have a more extensive implementation that emits
the entire SVG state and prevents _any_ SVG state from unintentionally
leaking, we'll stick with our existing implementation just to be safe.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812181042.2013508-2-matthew.d.roper@intel.com
When validating VF config on the media GT, we may wrongly report
that VF is already partially configured on it, as we consider GGTT
and LMEM provisioning done on the primary GT (since both GGTT and
LMEM are tile-level resources, not a GT-level).
This will cause skipping a VF auto-provisioning on the media-GT and
in result will block a VF from successfully initialize that GT.
Fix that by considering GGTT and LMEM configurations only when
checking if a VF provisioning is complete, and omit GGTT and LMEM
when reporting empty/partial provisioning.
Fixes: 234670cea9 ("drm/xe/pf: Skip fair VFs provisioning if already provisioned")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806180516.618-1-michal.wajdeczko@intel.com
Mgag200's BMC connector tracks the status of an underlying physical
connector and updates the BMC status accordingly. This functionality
works around GNOME's settings app, which cannot handle multiple
outputs on the same CRTC.
The workaround is now obsolete as the VGA-BMC connector handles BMC
support internally. Hence, remove the driver's code and the BMC output
entirely.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805130622.63458-6-tzimmermann@suse.de
Move calls to stop and start BMC scanout from CRTC helpers to the
VGA-BMC encoder's atomic_disable and atomic_enable. Makes the BMC
scanout transparent to the CRTC.
DRM's atomic helpers call an encoder's atomic_disable and atomic_enable
helpers for all enabled encoders. The BMC stops scanning out the VGA
signal if modeset disables the VGA encoder, and starts scanning out
if the modeset enables the VGA encoder.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805130622.63458-5-tzimmermann@suse.de
Control the VIDRST pin from the VGA-BMC encoder's atomic_check and
remove the respective code from CRTC. Makes the VIDRST functionality
fully composable.
The VIDRST pin allows an external clock source to control the SYNC
signals of the Matrox chip. The functionality is part of the CRTC,
but depends on the presence of the clock source. This is the case for
some BMCs, so control the pin from the VGA-BMC output.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240805130622.63458-4-tzimmermann@suse.de