When validating VF config on the media GT, we may wrongly report
that VF is already partially configured on it, as we consider GGTT
and LMEM provisioning done on the primary GT (since both GGTT and
LMEM are tile-level resources, not a GT-level).
This will cause skipping a VF auto-provisioning on the media-GT and
in result will block a VF from successfully initialize that GT.
Fix that by considering GGTT and LMEM configurations only when
checking if a VF provisioning is complete, and omit GGTT and LMEM
when reporting empty/partial provisioning.
Fixes: 234670cea9 ("drm/xe/pf: Skip fair VFs provisioning if already provisioned")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806180516.618-1-michal.wajdeczko@intel.com
The VF API version for this release is 1.13.4.
Bumping the minimum required GuC version just before force-probe
removal allows us to set a baseline for what API features are expected
to be available. I.e., at this point there is no need for any version
checking in the code before using a feature. Of course, if/when the
API is extended in future GuC releases, those new features will need
API version checks in the code.
Bump the recommended GuC versions to match.
Also add numerical comparison helpers to simplify the version number
checks.
v2: Reword commit message and make comparison helpers GuC specific -
review feedback from Daniele, done by JohnH
Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240802222129.3976212-3-John.C.Harrison@Intel.com
Enable feature to allow memory reads to take a priority memory path.
This will reduce latency on the read path, but may introduce read after
write (RAW) hazards as read and writes will no longer be ordered.
To avoid RAW hazards, SW can use the MI_MEM_FENCE command or any other
MI command that generates non posted memory writes. This will ensure
data is coherent in memory prior to execution of commands which read
data from memory. RCS,BCS and CCS support this feature.
No pattern identified in KMD that could lead to a hazard.
v2: Modify commit message, enable priority mem read feature for media,
modify version range, modify bspec detail (Matt Roper)
v3: Rebase, fix cramped line-wrapping (jcavitt)
v4: Rebase
v5: Media does not support Priority Mem Read. Modify commit
to reflect the same.
v6: Rebase
Bspec: 60298, 60237, 60187, 60188
Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Carl Zhang <carl.zhang@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240731195622.1868401-1-pallavi.mishra@intel.com
When building with gcc-5:
In function ‘decode_oa_format.isra.26’,
inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6:
././include/linux/compiler_types.h:510:38: error: call to ‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is not constant
[...]
./include/linux/bitfield.h:155:3: note: in expansion of macro ‘__BF_FIELD_CHECK’
__BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
^
drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’
u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt);
^
Fixes: b6fd51c621 ("drm/xe/oa/uapi: Define and parse OA stream properties")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240729092634.2227611-1-geert+renesas@glider.be
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Now that rtp has OR rules, it's not needed to extend it to process OOB
WAs. Previously if an entry had no name, it was considered as "a set of
rules OR'ed with the last named entry".
Instead of generating new entries, add OR rules. The syntax for
xe_wa_oob.rules remains the same, with xe_gen_wa_oob generating the
slightly different table. Object sizes delta are negligible, but having
just one logic makes it easier to maintain:
add/remove: 0/0 grow/shrink: 1/2 up/down: 160/-269 (-109)
Function old new delta
__compound_literal 6104 6264 +160
xe_wa_dump 1839 1810 -29
oob_was 816 576 -240
Total: Before=17257, After=17148, chg -0.63%
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727015907.899192-9-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Like commit 512660cd1f ("drm/xe/rtp: Expand max rules/actions per
entry") did, expand the maximum number of actions/rules. That commit was
too conservative, just incrementing 2. Other than the ugliness of these
macros and additional preprocessor steps when they are used, there are
no downsides on increasing the maximum: the tables in which they are
used use a sentinel to mark the last element.
With rtp processing now supporting OR rules, it's possible to migrate
the extension made for OOB WAs that "entries with name are OR'ed in
previous entry". For that the maximum number of rules needs to be
increased.
Just double it. Hopefully 12 is sufficient for longer than 6 was.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727015907.899192-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The OOB WAs use xe_rtp_process(), without passing an sr to save result
of the actions since there are none. They are also executed in a gt-only
context, making it harder to share the implementation. Thus, introduce a
new set of tests to check these RTP entries. The only check that can be
done is if the entry was marked as active.
Before commit fd6797ec50 ("drm/xe/rtp: Fix off-by-one when processing
rules") several of these tests were failing: the processing of OR'ed
entries would make the subsequent entry to be inadvertently enabled.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240727015907.899192-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Although all current Xe2 platforms support FlatCCS, we probably
shouldn't assume that will be universally true forever. In the past
we've had platforms like PVC that didn't support compression, and the
same could show up again at some point in the future. Future-proof the
migration code by adding an explicit check for FlatCCS support to the
condition that decides whether to use a compressed PAT index for
migration.
While we're at it, we can drop the IS_DGFX check since it's redundant
with the src_is_vram check (only dGPUs have VRAM).
Cc: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240726171757.2728819-2-matthew.d.roper@intel.com
Gustavo noticed an odd "+ 2" in rtp_mark_active() while processing
rtp rules and pointed that it should be "+ 1". In fact, while processing
entries without actions (OOB workarounds), if the WA is activated and
has OR rules, it will also inadvertently activate the very next
workaround.
Test in a LNL B0 platform by moving 18024947630 on top of 16020292621,
makes the latter become active:
$ cat /sys/kernel/debug/dri/0/gt0/workarounds
...
OOB Workarounds
18024947630
16020292621
14018094691
16022287689
13011645652
22019338487_display
In future a kunit test will be added to cover the rtp checks for entries
without actions.
Fixes: fe19328b90 ("drm/xe/rtp: Add support for entries with no action")
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240726064337.797576-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
An xe file can outlive the associated process as the GPU cleanup is just
triggered upon file close (process kill) and completes sometime later.
If the file close triggers error conditions (GPU hangs) the process
cannot be safely referenced to retrieve the name and pid for debug
information. Store the process name and pid directly in the xe file to
be safe.
v2:
- Access file->pid via rcu_access_pointer (Matthew Auld)
Fixes: b10d0c5e9d ("drm/xe: Add process name to devcoredump")
Fixes: f6ca930d97 ("drm/xe: Add process name and PID to job timedout message")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240723151045.1725417-1-matthew.brost@intel.com
eu_type_to_str() relies on -Wswitch to warn (and -Werror) to make sure
it handles all enum values. However it's perfectly legal to pass an int
to that function so in the end that function may happen to return
nothing. There's too much implicit knowledge about the initialization
of eu_type for a compiler to notice eu_type is never assigned to
anything other than those values.
Trying to reproduce this issue, none of gcc-9, gcc-10 and gcc-13
triggered for me, but this was reported in a different system with
gcc-10:
drivers/gpu/drm/xe/xe.o: warning: objtool: xe_gt_topology_dump() falls through to next function xe_gt_topology_init()
Also it was reported these warnings when building with clang:
drivers/gpu/drm/xe/xe.o: warning: objtool: xe_gt_topology_dump+0x77: sibling call from callable instruction with modified stack frame
drivers/gpu/drm/xe/xe.o: warning: objtool: xe_gt_topology_dump() falls through to next function xe_dss_mask_group_ffs()
drivers/gpu/drm/xe/xe.o: warning: objtool: xe_gt_topology_dump+0x77: can't find jump dest instruction at .text.xe_gt_topology_dump+0xc0
Since that value is not really possible in real world, just take the
simple approach and return NULL.
Fixes: 7108b4a589 ("drm/xe/uapi: Expose SIMD16 EU mask in topology query")
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719191534.3845469-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>