Michal Wajdeczko
7065b19bd5
drm/xe/guc: Allow to initialize submission with limited set of IDs
...
While PF and native drivers may initialize submission code to use
all available GuC contexts IDs, the VF driver may only use limited
number of IDs. Update init function to accept number of context
IDs available for use.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240521092518.624-2-michal.wajdeczko@intel.com
2024-05-22 12:53:43 +02:00
Michal Wajdeczko
f7e20cfb59
drm/xe: Cleanup xe_mmio.h
...
We don't need <linux/delay.h> include since commit 5c09bd6ccd
("drm/xe/mmio: Move xe_mmio_wait32() to xe_mmio.c").
We don't need <linux/io-64-nonatomic-lo-hi.h> include since commit
54c659660d ("drm/xe: Make xe_mmio_read|write() functions non-inline").
And since commit 924e6a9789 ("drm/xe/uapi: Remove MMIO ioctl")
we don't need forward declarations of drm_device and drm_file.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240520181814.2392-4-michal.wajdeczko@intel.com
2024-05-22 12:15:10 +02:00
Michal Wajdeczko
26a22952c8
drm/xe: Don't rely on indirect includes from xe_mmio.h
...
These compilation units use udelay() or some GT oriented printk
functions without explicitly including proper header files, and
relying on #includes from the xe_mmio.h instead. Fix that.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Francois Dugast <francois.dugast@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240520181814.2392-3-michal.wajdeczko@intel.com
2024-05-22 12:11:26 +02:00
Michal Wajdeczko
31a278b5a1
drm/i915/display: Add missing include to intel_vga.c
...
This compilation unit uses udelay() function without including
it's header file. Fix that to break dependency on other code.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Jani Nikula <jani.nikula@intel.com >
Acked-by: Jani Nikula <jani.nikula@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240520181814.2392-2-michal.wajdeczko@intel.com
2024-05-22 12:11:25 +02:00
Michal Wajdeczko
a6bc7cda37
drm/xe: Fix xe_guc_pc.h
...
Prefer forward declaration over #include xe_guc_pc_types.h
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Francois Dugast <francois.dugast@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240521102828.668-5-michal.wajdeczko@intel.com
2024-05-22 12:03:56 +02:00
Michal Wajdeczko
de1429a99f
drm/xe: Fix xe_huc.h
...
Prefer forward declaration over #include xe_huc_types.h
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Francois Dugast <francois.dugast@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240521102828.668-4-michal.wajdeczko@intel.com
2024-05-22 12:03:55 +02:00
Michal Wajdeczko
2291c09110
drm/xe: Fix xe_gsc.h
...
Prefer forward declaration over #include xe_gsc_types.h
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Francois Dugast <francois.dugast@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240521102828.668-3-michal.wajdeczko@intel.com
2024-05-22 12:03:54 +02:00
Michal Wajdeczko
bdc9abed51
drm/xe: Fix xe_uc.h
...
Prefer forward declaration over #include xe_uc_types.h
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Francois Dugast <francois.dugast@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240521102828.668-2-michal.wajdeczko@intel.com
2024-05-22 12:03:53 +02:00
Nirmoy Das
01d71dff61
drm/xe/tests: Use uninterruptible VM lock
...
Interruptible lock can return error and needed a return value
check. This test should finish quick enough so use a uninterruptible
lock instead.
Cc: Matthew Auld <matthew.auld@intel.com >
Reviewed-by: Matthew Auld <matthew.auld@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240521102715.22700-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com >
2024-05-21 22:18:51 +02:00
Nirmoy Das
735940f999
drm/xe: Add warn when level can not be zero.
...
At xe_pt_zap_ptes_entry() and xe_pt_stage_unbind_entry, the level cannot
be 0. Therefore, add an independent check for the level. Since the level
cannot be zero at this point, there is no need to check for `is_compact`,
so remove that instead.
Cc: Matthew Auld <matthew.auld@intel.com >
Reviewed-by: Matthew Auld <matthew.auld@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240521103623.11645-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com >
2024-05-21 22:09:20 +02:00
Francois Dugast
995f7dafd1
drm/xe/uapi: Expose the L3 bank mask
...
The L3 bank mask is already generated and stored internally with
the rest of the GT topology. In user space, the compute runtime
now needs this information to be added to the device properties
therefore the topology mask query is extended to provide a new
mask which represents the L3 banks enabled on the GT.
The changes in the compute runtime are ready and approved, see
link below.
v2: Rewrite commit message and add a link to the compute
runtime PR (Francois Dugast)
Cc: Matt Roper <matthew.d.roper@intel.com >
Cc: Robert Krzemien <robert.krzemien@intel.com >
Cc: Mateusz Jablonski <mateusz.jablonski@intel.com >
Link: https://github.com/intel/compute-runtime/pull/722
Signed-off-by: Francois Dugast <francois.dugast@intel.com >
Acked-by: Mateusz Jablonski <mateusz.jablonski@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Signed-off-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240416145037.7-2-francois.dugast@intel.com
2024-05-21 09:01:40 -07:00
Lucas De Marchi
188ced1e0f
drm/xe/client: Print runtime to fdinfo
...
Print the accumulated runtime for client when printing fdinfo.
Each time a query is done it first does 2 things:
1) loop through all the exec queues for the current client and
accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
that, being read from the context image.
2) Read a "GPU timestamp" that can be used for considering "how much GPU
time has passed" and that has the same unit/refclock as the one
recording the runtime. RING_TIMESTAMP is used for that via MMIO.
Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine available.
This is exported to userspace as 2 numbers in fdinfo:
drm-cycles-<class>: <RUNTIME>
drm-total-cycles-<class>: <TIMESTAMP>
Userspace is expected to collect at least 2 samples, which allows to
know the client engine busyness as per:
RUNTIME1 - RUNTIME0
busyness = ---------------------
T1 - T0
Since drm-cycles-<class> always starts at 0, it's also possible to know
if and engine was ever used by a client.
It's expected that userspace will read any 2 samples every few seconds.
Given the update frequency of the counters involved and that
CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
around (assuming 100% utilization) after ~200s. The wraparound is not
perceived by userspace since it's just accumulated for all the
exec_queues in a 64-bit counter) but the measurement will not be
accurate if the samples are too far apart.
This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
per minute or more.
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-9-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:40 -07:00
Lucas De Marchi
6aa18d7436
drm/xe: Add helper to return any available hw engine
...
Get the first available engine from a gt, which helps in the case any
engine serves as a context, like when reading RING_TIMESTAMP.
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:40 -07:00
Lucas De Marchi
baa1486552
drm/xe: Cache data about user-visible engines
...
gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved to kernel and some
may be exposed as a single engine (e.g. with ccs_mode).
Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update that when it's needed. This
will help implementing per client engine utilization so this (mostly
constant) information doesn't need to be re-calculated on every query.
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com >
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:40 -07:00
Umesh Nerlige Ramappa
6109f24f87
drm/xe: Add helper to accumulate exec queue runtime
...
Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.
v2:
- Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
- Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
- Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
- Handle the parallel submission case by multiplying the runtime by
width.
v3: Update comments
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Reviewed-by: Matt Roper <matthew.d.roper@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:40 -07:00
Lucas De Marchi
f2f6b667c6
drm/xe: Add helper to capture engine timestamp
...
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to
get the timestamp for the engine so it can be used to calculate the
"engine time" with the same unit as the runtime is recorded.
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:40 -07:00
Umesh Nerlige Ramappa
9b090d5774
drm/xe/lrc: Add helper to capture context timestamp
...
Add a helper to capture CTX_TIMESTAMP from the context image so it can
be used to calculate the runtime.
v2: Add kernel-doc to clarify expectation from caller
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com >
Reviewed-by: Francois Dugast <francois.dugast@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:40 -07:00
Lucas De Marchi
bd49e50d81
drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
...
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and
remove the default handling so it's protected by -Wswitch.
Currently the only user is xe_hw_engine_class_sysfs_init(), which
already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior.
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:39 -07:00
Lucas De Marchi
ab689514b6
drm/xe: Promote xe_hw_engine_class_to_str()
...
Move it out of the sysfs compilation unit so it can be re-used in other
places.
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com >
Reviewed-by: Oak Zeng <oak.zeng@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-21 06:33:39 -07:00
José Roberto de Souza
844f3228d2
drm/xe: Replace RING_START_UDW by u64 RING_START
...
Other u64 registers are printed in a single line so RING_START
needs to follow that too.
As there is no upstream decoder tool parsing RING_START this will
not break any decoder application.
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com >
Cc: Matt Roper <matthew.d.roper@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240510150108.80679-1-jose.souza@intel.com
2024-05-17 09:59:18 -07:00
Michal Wajdeczko
63d8cb8fe3
drm/xe/vf: Expose SR-IOV VF attributes to GT debugfs
...
For debug purposes we might want to view actual VF configuration
(including GGTT range, LMEM size, number of GuC contexts IDs or
doorbells) and the negotiated ABI versions (with GuC and PF).
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-7-michal.wajdeczko@intel.com
2024-05-16 20:18:39 +02:00
Michal Wajdeczko
25275c8a4f
drm/xe/vf: Custom hardware config load step if VF
...
The VF drivers may immediately communicate with the GuC to obtain
the hardware config since the firmware shall already be running.
With the GuC communication established, VFs can also obtain the
values of the runtime registers (fuses) from the PF driver.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-6-michal.wajdeczko@intel.com
2024-05-16 20:18:38 +02:00
Michal Wajdeczko
f2345ed537
drm/xe/vf: Add support for VF to query its configuration
...
The VF driver doesn't know which GuC firmware was loaded by the PF
driver and must perform GuC ABI version handshake prior to sending
any other H2G actions to the GuC to submit workloads.
The VF driver also doesn't have access to the fuse registers and
must rely on the runtime info, which includes values of the fuse
registers, that the PF driver is exposing to the VFs.
Add functions to cover that functionality. We will use these
functions in upcoming patches.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com >
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-5-michal.wajdeczko@intel.com
2024-05-16 20:18:34 +02:00
Michal Wajdeczko
c454f1a6b9
drm/xe/guc: Add VF2GUC_QUERY_SINGLE_KLV to ABI
...
In upcoming patches we will add support to the VF driver to
read its configuration from the GuC using special H2G actions.
Add necessary definitions to our GuC firmware ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-4-michal.wajdeczko@intel.com
2024-05-16 20:18:33 +02:00
Michal Wajdeczko
769551c45c
drm/xe/guc: Add VF2GUC_VF_RESET to ABI
...
The version negotiation between the VF driver and the GuC firmware
must start with explicit soft reset of the GuC state initiated by
the VF driver. Add VF2GUC action definitions to the ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-3-michal.wajdeczko@intel.com
2024-05-16 20:18:32 +02:00
Michal Wajdeczko
e158cf9361
drm/xe/guc: Add VF2GUC_MATCH_VERSION to ABI
...
In upcoming patches we will add a version negotiation between
the VF driver and the GuC firmware. Add necessary definitions
to our GuC firmware ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-2-michal.wajdeczko@intel.com
2024-05-16 20:18:31 +02:00
Michal Wajdeczko
1c99d3d3ed
drm/xe/pf: Expose PF monitor details via debugfs
...
For debug purposes we might want to view statistics maintained by
the PF driver about VFs activity.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-9-michal.wajdeczko@intel.com
2024-05-16 18:04:54 +02:00
Michal Wajdeczko
335d62ade5
drm/xe/pf: Track adverse events notifications from GuC
...
When thresholds used to monitor VFs activities are configured,
then GuC may send GUC2PF_ADVERSE_EVENT messages informing the
PF driver about exceeded thresholds. Start handling such messages.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-8-michal.wajdeczko@intel.com
2024-05-16 18:04:51 +02:00
Michal Wajdeczko
d5e12fffcc
drm/xe/guc: Add GUC2PF_ADVERSE_EVENT to ABI
...
When thresholds used to monitor VFs activities are configured,
then GuC may send GUC2PF_ADVERSE_EVENT messages informing the
PF driver about exceeded thresholds. Add necessary definitions
to our GuC firmware ABI header.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-7-michal.wajdeczko@intel.com
2024-05-16 18:04:45 +02:00
Michal Wajdeczko
c4f5ded082
drm/xe/pf: Allow configuration of VF thresholds over debugfs
...
Initial values of all thresholds used by the GuC to monitor VF's
activity is zero (disabled) and we need to explicitly configure
them per each VF. Expose additional attributes over debugfs.
Definitions of all attributes are generated so we will not need
to make any changes if new thresholds would be added to the set.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-6-michal.wajdeczko@intel.com
2024-05-16 18:04:44 +02:00
Michal Wajdeczko
629df234bf
drm/xe/pf: Introduce functions to configure VF thresholds
...
The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.
Add functions to allow configuration of these thresholds per VF.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-5-michal.wajdeczko@intel.com
2024-05-16 18:04:42 +02:00
Michal Wajdeczko
7aefee83fc
drm/xe/guc: Add support for threshold KLVs in to_string() helper
...
Use MAKE_XE_GUC_KLV_THRESHOLDS_SET to generate missing conversion
of threshold KLV keys to string.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-4-michal.wajdeczko@intel.com
2024-05-16 18:04:41 +02:00
Michal Wajdeczko
b1ce52fbf6
drm/xe/guc: Introduce GuC KLV thresholds set
...
The GuC firmware monitors VF's activity and notifies the PF driver
once any configured threshold related to such activity is exceeded.
The available thresholds are defined in the GuC ABI as part of the
GuC VF Configuration KLVs. Threshold configurations performed by
the PF driver and notifications sent by the GuC rely on the KLV keys,
which are not zero-based and might not guarantee continuity.
To simplify the driver code and eliminate the need to repeat very
similar code for each threshold, introduce the threshold set macro
that allows to generate required code based on unique threshold tag.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-3-michal.wajdeczko@intel.com
2024-05-16 18:04:39 +02:00
Michal Wajdeczko
e6946ea8fc
drm/xe/guc: Add more KLV helper macros
...
In upcoming patches we will want to generate some of the KLV keys
from other macros. Add MAKE_GUC_KLV_{KEY|LEN} macros for that and
make sure they will correctly expand provided TAG parameter. Also
fix PREP_GUC_KLV_TAG to also work correctly within other macros.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514190015.2172-2-michal.wajdeczko@intel.com
2024-05-16 18:04:38 +02:00
Michal Wajdeczko
9aa8586063
drm/xe/pf: Implement pci_driver.sriov_configure callback
...
The PCI subsystem already exposes the "sriov_numvfs" attribute
that users can use to enable or disable SR-IOV VFs. Add custom
implementation of the .sriov_configure callback defined by the
pci_driver to perform additional steps, including fair VFs
provisioning with the resources, as required by our platforms.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com >
Cc: Badal Nilawar <badal.nilawar@intel.com >
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com > #v2
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240506184121.2615-1-michal.wajdeczko@intel.com
2024-05-15 23:45:49 +02:00
Michal Wajdeczko
75fe5f3471
drm/xe/pf: Don't advertise support to enable VFs if not ready
...
Even if we have not enabled SR-IOV support using the platform
specific has_sriov flag, the hardware may still report SR-IOV
capability and the PCI layer may wrongly advertise driver support
to enable VFs. Explicitly reset the number of supported VFs to
zero to avoid confusion.
Applications may read the /sys/bus/pci/devices/.../sriov_totalvfs
prior to enabling VFs using the sriov_numvfs to check if such an
operation is possible.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com >
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240507165757.2835-1-michal.wajdeczko@intel.com
2024-05-15 22:31:52 +02:00
Matthew Brost
c8ff26b82c
drm/xe: Only zap PTEs as needed
...
If PTEs are already invalidated no need to invalidate again.
Signed-off-by: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240514232325.84508-1-matthew.brost@intel.com
2024-05-15 08:07:57 -07:00
Jonathan Cavitt
b31cfb47b2
drm/xe/xe_guc_submit: Declare reset if banned or killed or wedged
...
Add an additional condition to the reset_status guc_exec_queue_op that
returns true if the exec queue has been banned or killed or wedged. The
reset_status op is only used for exiting any xe_wait_user_fence_ioctl
that waits on an exec queue without timing out, so doing this will exit
the ioctl early in cases where the exec queue can no longer function,
such as after a GuC stop during a reset.
Suggested-by: Matthew Brost <matthew.brost@intel.com >
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com >
Reviewed-by: Stuart Summers <stuart.summers@intel.com >
Signed-off-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240510194540.3246991-3-jonathan.cavitt@intel.com
2024-05-14 16:28:53 -07:00
Jonathan Cavitt
abdea2847a
drm/xe/xe_guc_submit: Allow lr exec queues to be banned
...
LR queues currently don't get banned during a GT/GuC reset because they
lack a job. Though they don't have a job to detect the reset status of,
it's still possible to tell when they should be banned by looking at the
LRC: if the LRC head and tail don't match, then the exec queue should be
banned and cleaned up.
This also requires swapping the usage of xe_sched_tdr_queue_imm with
xe_guc_exec_queue_trigger_cleanup, as the former is specific to non-lr
exec queues.
Suggested-by: Matthew Brost <matthew.brost@intel.com >
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Stuart Summers <stuart.summers@intel.com >
Signed-off-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240510194540.3246991-2-jonathan.cavitt@intel.com
2024-05-14 16:28:52 -07:00
Jonathan Cavitt
1564d411e1
drm/xe/xe_guc_submit: Fix exec queue stop race condition
...
Reorder the xe_sched_tdr_queue_imm and set_exec_queue_banned calls in
guc_exec_queue_stop. This prevents a possible race condition between
the two events in which it's possible for xe_sched_tdr_queue_imm to
wake the ufence waiter before the exec queue is banned, causing the
ufence waiter to miss the banned state.
Suggested-by: Matthew Brost <matthew.brost@intel.com >
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Stuart Summers <stuart.summers@intel.com >
Signed-off-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240510194540.3246991-1-jonathan.cavitt@intel.com
2024-05-14 16:28:51 -07:00
Michal Wajdeczko
4071e0872f
drm/xe/uc: Move GuC submission init to post hwconfig step
...
We shouldn't need anything from the GuC submission code until we
finish GuC initialization in post hwconfig step.
While around add diagnostic message if we fail uC init.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240510203810.1952-3-michal.wajdeczko@intel.com
2024-05-14 23:19:27 +02:00
Michal Wajdeczko
3df01f5c72
drm/xe/uc: Reorder post hwconfig uC initialization step
...
We want to move the GuC submission initialization to the post
hwconfig step, but now this step is done too late as migration
initialization uses exec_queue that would crash due to a unset
exec_queue_ops. We can easily fix that by small function reorder.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com >
Cc: Matthew Brost <matthew.brost@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240510203810.1952-2-michal.wajdeczko@intel.com
2024-05-14 23:19:25 +02:00
Matthew Brost
04f4a70a18
drm/xe: Only use reserved BCS instances for usm migrate exec queue
...
The GuC context scheduling queue is 2 entires deep, thus it is possible
for a migration job to be stuck behind a fault if migration exec queue
shares engines with user jobs. This can deadlock as the migrate exec
queue is required to service page faults. Avoid deadlock by only using
reserved BCS instances for usm migrate exec queue.
Fixes: a043fbab7a ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC")
Cc: Matt Roper <matthew.d.roper@intel.com >
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com >
Signed-off-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240415190453.696553-2-matthew.brost@intel.com
Reviewed-by: Brian Welty <brian.welty@intel.com >
2024-05-14 13:12:27 -07:00
Himal Prasad Ghimiray
4c0be90e68
drm/xe: Fix the warning conditions
...
The maximum timeout display uses in xe_pcode_request is 3 msec, add the
warning in cases the function is misused with higher timeouts.
Add a warning if pcode_try_request is not passed the timeout parameter
greater than 0.
Cc: Lucas De Marchi <lucas.demarchi@intel.com >
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240508152216.3263109-3-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-05-14 11:26:40 -04:00
Himal Prasad Ghimiray
c81858eb52
drm/xe: Change pcode timeout to 50msec while polling again
...
Polling is initially attempted with timeout_base_ms enabled for
preemption, and if it exceeds this timeframe, another attempt is made
without preemption, allowing an additional 50 ms before timing out.
v2
- Rebase
v3
- Move warnings to separate patch (Lucas)
Cc: Lucas De Marchi <lucas.demarchi@intel.com >
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com >
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com >
Fixes: 7dc9b92dcf ("drm/xe: Remove i915_utils dependency from xe_pcode.")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240508152216.3263109-2-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
2024-05-14 11:26:40 -04:00
Lucas De Marchi
d1855d284e
drm/xe: Move sw-only pcode initialization
...
Move it to xe_gt_init_early() that initializes the sw-only part for each
gt.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-13 21:21:13 -07:00
Lucas De Marchi
45b9066ec3
drm/xe: Move xe_force_wake_init_gt() inside gt initialization
...
xe_force_wake_init_gt() is a software-only initialization and doesn't
need to be called from xe_device_probe(). Move it to initialize
together with the gt.
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-13 21:21:13 -07:00
Lucas De Marchi
65c4de2a91
drm/xe: Move xe_gt_init_early() where it belongs
...
Early shall be early enough, stop doing other things with gt before it.
Now that xe_gt_init_early() doesn't need forcewake and doesn't depend on
the fake engine_mask initialization, move it where it belongs: it
doesn't need to be after hwconfig config anymore.
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-13 21:21:13 -07:00
Lucas De Marchi
402c014cbc
drm/xe: Drop useless forcewake get/put
...
Forcewake used to be needed in xe_gt_init_early() since it was calling
xe_gt_topology_init(). That call was dropped in commit 4c47049d93
("drm/xe/guc: Fix missing topology init"), but the forcewake calls were
left behind. Remove them.
Cc: Zhanjun Dong <zhanjun.dong@intel.com >
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com >
Reviewed-by: Matthew Brost <matthew.brost@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-13 21:21:13 -07:00
Lucas De Marchi
61549a2ee5
drm/xe: Drop __engine_mask
...
Not really used, it's just a copy of engine_mask, which already reads
the fuses to mark engines as available/not-available.
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com >
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com >
Link: https://patchwork.freedesktop.org/patch/msgid/20240513213751.1017791-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com >
2024-05-13 21:21:13 -07:00