Commit Graph

1413020 Commits

Author SHA1 Message Date
Shekhar Chauhan
8fcb7dfb8b drm/xe/xe3p_lpg: Add support for graphics IP 35.10
Add Xe3p_LPG graphics IP version 35.10. Xe3p_LPG supports all features
described by XE2_GFX_FEATURES and also multi-queue feature on BCS and
CCS engines.  As such, create a new struct xe_graphics_desc named
graphics_xe3p_lpg that inherits from XE2_GFX_FEATURES and also includes
the necessary .multi_queue_engine_class_mask.

Here is a list of fields and associated Bspec references for the members
of the IP descriptor:

 .hw_engine_mask (Bspec 60149)
 .multi_queue_engine_class_mask (Bspec 74110)
 .has_asid (Bspec 71132)
 .has_atomic_enable_pte_bit (Bspec 59510, 74675)
 .has_indirect_ring_state (Bspec 67296)
 .has_range_tlb_inval (Bspec 71126)
 .has_usm (Bspec 59651)
 .has_64bit_timestamp (Bspec 60318)
 .num_geometry_xecore_fuse_regs (Bspec 62566, 67401, 67536)
 .num_compute_xecore_fuse_regs (Bspec 62565, 62561, 67537)

v2:
  - Drop non-existing fields from the list in the commit message. (Matt)
  - Squash patch adding .multi_queue_engine_class_mask here. (Matt)
  - Rename graphics_xe3p to graphics_xe3p_lpg. (Matt)
  - Add fields .num_geometry_xecore_fuse_regs and
    .num_compute_xecore_fuse_regs after rebasing and inheriting
    commit 6acf3d3ed6 ("drm/xe: Move number of XeCore fuse registers to
    graphics descriptor"). (Gustavo)

Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-1-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
2026-02-10 10:05:12 -03:00
Shuicheng Lin
a30f999681 drm/xe/mmio: Avoid double-adjust in 64-bit reads
xe_mmio_read64_2x32() was adjusting register addresses and then
calling xe_mmio_read32(), which applies the adjustment again.
This may shift accesses twice if adj_offset < adj_limit. There is
no issue currently: for the media GT, adj_offset > adj_limit, so
the second adjustment is a no-op. But that may not hold in the future.

To fix it, replace the adjusted-address comparison with a direct
sanity check that ensures the MMIO address adjustment cutoff never
falls within the 8-byte range of a 64-bit register. And let
xe_mmio_read32() handle address translation.

v2: rewrite the sanity check in a more natural way. (Matt)
v3: Add Fixes tag. (Jani)

Fixes: 07431945d8 ("drm/xe: Avoid 64-bit register reads")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260130165621.471408-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-09 15:10:46 -08:00
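
The reasoning in a30f999681 can be illustrated with a small userspace sketch. This is not the driver code: struct mmio_adj, adjusted_addr() and cutoff_splits_reg64() are hypothetical names, and the limit and offset values in the note below are illustrative only. The sketch shows why a second adjustment is harmless only when adj_offset is at least adj_limit, and what it means for the cutoff to fall inside the 8-byte range of a 64-bit register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-GT MMIO adjustment parameters. */
struct mmio_adj {
	uint32_t adj_limit;	/* addresses below this value are shifted */
	uint32_t adj_offset;	/* amount added to shifted addresses */
};

/* Apply the adjustment once, as a 32-bit read helper would. */
static uint32_t adjusted_addr(const struct mmio_adj *m, uint32_t addr)
{
	return addr < m->adj_limit ? addr + m->adj_offset : addr;
}

/*
 * True if the cutoff falls inside the 8-byte register at addr, i.e. the
 * low dword (addr) would be adjusted but the high dword (addr + 4) would
 * not. Assumes addr + 4 does not overflow.
 */
static bool cutoff_splits_reg64(const struct mmio_adj *m, uint32_t addr)
{
	return addr < m->adj_limit && m->adj_limit <= addr + 4;
}
```

With adj_limit = 0x40000 and adj_offset = 0x1000, address 0x100 becomes 0x1100 after one adjustment and 0x2100 after two. With adj_offset = 0x380000 the second call returns its input unchanged, which matches the "no issue currently" remark in the commit message.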
Michal Wajdeczko
4e2796c828 drm/xe/vf: Allow VF to initialize MCR tables
While VFs can't access MCR registers, it's still safe to initialize
our per-platform MCR tables, as we might need them later in the LRC
programming, as the engines themselves may access MCR steer registers.
Thanks to all our past fixes to the VF probe initialization order,
VFs are able to use values of the fuse registers needed here.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260207214428.5205-1-michal.wajdeczko@intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-09 12:30:34 -08:00
Sk Anirban
c57db41b8d drm/xe/guc: Add Wa_14025883347 for GuC DMA failure on reset
Prevent GuC firmware DMA failures during GuC-only reset by disabling
idle flow and verifying SRAM handling completion. Without this, reset
can be issued while SRAM handler is copying WOPCM to SRAM,
causing GuC HW to get stuck.

v2: Modify error message (Badal)
    Rename reg bit name (Daniele)
    Update WA skip condition (Daniele)
    Update SRAM handling logic (Daniele)
v3: Reorder WA call (Badal)
    Wait for GuC ready status (Daniele)
v4: Update reg name (Badal)
    Add comment (Daniele)
    Add extended graphics version (Daniele)
    Modify rules

Signed-off-by: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260202105313.3338094-4-sk.anirban@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-09 12:00:16 -08:00
Matthew Auld
dc90ead440 drm/xe/uapi: update used tracking kernel-doc
In commit 4d0b035fd6 ("drm/xe/uapi: loosen used tracking restriction")
we dropped the CAP_PERFMON restriction but missed updating the
corresponding kernel-doc. Fix that.

v2 (Sanjay):
  - Don't drop the note around the extra cpu_visible_used expectations.

Reported-by: Ulisses Furquim <ulisses.furquim@intel.com>
Fixes: 4d0b035fd6 ("drm/xe/uapi: loosen used tracking restriction")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Link: https://patch.msgid.link/20260130125105.451229-2-matthew.auld@intel.com
2026-02-09 10:09:15 +00:00
Jia Yao
944a3329b0 drm/xe: Add bounds check on pat_index to prevent OOB kernel read in madvise
When user provides a bogus pat_index value through the madvise IOCTL, the
xe_pat_index_get_coh_mode() function performs an array access without
validating bounds. This allows a malicious user to trigger an out-of-bounds
kernel read from the xe->pat.table array.

The vulnerability exists because the validation in madvise_args_are_sane()
directly calls xe_pat_index_get_coh_mode(xe, args->pat_index.val) without
first checking if pat_index is within [0, xe->pat.n_entries).

Although xe_pat_index_get_coh_mode() has a WARN_ON to catch this in debug
builds, it still performs the unsafe array access in production kernels.

v2(Matthew Auld)
- Using array_index_nospec() to mitigate spectre attacks when the value
is used

v3(Matthew Auld)
- Put the declarations at the start of the block

Fixes: ada7486c56 ("drm/xe: Implement madvise ioctl for xe")
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v6.18+
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Jia Yao <jia.yao@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260205161529.1819276-1-jia.yao@intel.com
2026-02-09 10:06:40 +00:00
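
The kind of check described in 944a3329b0 can be sketched in plain C. This is a userspace illustration under stated assumptions, not the patch itself: struct pat_table and pat_coh_mode_checked() are hypothetical names, and the speculation barrier the commit mentions (array_index_nospec()) is a kernel-only helper, so it appears only in a comment.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's PAT table. */
struct pat_table {
	const uint16_t *coh_mode;	/* one coherency mode per PAT index */
	size_t n_entries;
};

/*
 * Validate a user-supplied index before it is used to index the table.
 * Returns 0 and stores the coherency mode in *out on success, or -EINVAL
 * if the index is outside [0, n_entries).
 */
static int pat_coh_mode_checked(const struct pat_table *pat,
				uint32_t pat_index, uint16_t *out)
{
	if (pat_index >= pat->n_entries)
		return -EINVAL;

	/*
	 * In the kernel, the index would also be clamped with
	 * array_index_nospec() here; that helper has no userspace
	 * equivalent, so this sketch omits it.
	 */
	*out = pat->coh_mode[pat_index];
	return 0;
}
```

The point of the ordering is that the table is never touched until the range check has passed, so a bogus index is rejected instead of triggering an out-of-bounds read.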
Michał Winiarski
6fa45759cf drm/xe/pf: Fix the address range assert in ggtt_get_pte helper
The ggtt_get_pte helper used for saving VF GGTT incorrectly assumes that
ggtt_size == ggtt_end.
Fix it to avoid triggering spurious asserts if VF GGTT object lands in
high GGTT range.

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260130215624.556099-1-michal.winiarski@intel.com
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
2026-02-09 09:22:32 +01:00
Matt Roper
e8100643ff drm/xe/xe3p_xpc: XeCore mask spans four registers
On Xe3p_XPC, there are now four registers reserved to express the XeCore
mask rather than just three. Define the new registers and update the IP
descriptor accordingly.

Note that this only applies to Xe3p_XPC for now; Xe3p_LPG still only
uses three registers to express the mask.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20260205214139.48515-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-06 09:49:20 -08:00
Matt Roper
6acf3d3ed6 drm/xe: Move number of XeCore fuse registers to graphics descriptor
The number of registers used to express the XeCore mask has some
"special cases" that don't always get inherited by later IP versions so
it's cleaner and simpler to record the numbers in the IP descriptor
rather than adding extra conditions to the standalone get_num_dss_regs()
function.

Note that a minor change here is that we now always treat the number of
registers as 0 for the media GT.  Technically a copy of these fuse
registers does exist in the media GT as well (at the usual
0x380000+$offset location), but the value of those is always supposed to
read back as 0 because media GTs never have any XeCores or EUs.

v2:
 - Add a kunit assertion to catch descriptors that forget to initialize
   either count.  (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20260205214139.48515-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-06 09:49:20 -08:00
Vinay Belgaumkar
91be6115e4 drm/xe: Add forcewake status to powergate_info
Dump forcewake status and ref counts for all domains as part
of this debugfs file. Sample output from gt1:

$ cat /sys/kernel/debug/dri//0/gt1/powergate_info
Media Power Gating Enabled: yes
Media Slice0 Power Gate Status: down
GSC Power Gate Status: down
GT.ref_count=0, GT.forcewake=0x10000
VDBox0.ref_count=0, VDBox0.forcewake=0x10000
VEBox0.ref_count=0, VEBox0.forcewake=0x10000
GSC.ref_count=0, GSC.forcewake=0x10000

v2: Fix checkpatch issues

Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204190314.2904009-3-vinay.belgaumkar@intel.com
2026-02-05 14:33:44 -08:00
Vinay Belgaumkar
2ea05b4b02 drm/xe: Add GSC to powergate_info
Add GSC powergate status to the existing debugfs.

Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204190314.2904009-2-vinay.belgaumkar@intel.com
2026-02-05 14:33:43 -08:00
Vinay Belgaumkar
fabedb758f drm/xe: Add a wrapper for SLPC set/unset params
Also, extract out the GuC RC related set/unset param functions
into xe_guc_rc file. GuC still allows us to override GuC RC mode
using an SLPC H2G interface. Continue to use that interface, but
move the related code to the newly created xe_guc_rc file.

Cc: Riana Tauro <riana.tauro@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204014234.2867763-4-vinay.belgaumkar@intel.com
2026-02-05 14:17:37 -08:00
Vinay Belgaumkar
a3f949cd61 drm/xe: Use FORCEWAKE_GT in xe_guc_pc_fini_hw()
No need to use FORCEWAKE_ALL since the registers being written are in
GT domain.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260204014234.2867763-3-vinay.belgaumkar@intel.com
2026-02-05 14:17:36 -08:00
Vinay Belgaumkar
40a684f91d drm/xe: Decouple GuC RC code from xe_guc_pc
Move enable/disable GuC RC logic into the new file. This will
allow us to independently enable/disable GuC RC and not rely
on SLPC related functions. GuC already provides separate H2G
interfaces to setup GuC RC and SLPC.

Cc: Riana Tauro <riana.tauro@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260204014234.2867763-2-vinay.belgaumkar@intel.com
2026-02-05 14:17:30 -08:00
Pallavi Mishra
106340775a drm/xe/tests: Fix g2g_test_array indexing
The G2G KUnit test allocates a compact N×N
matrix sized by gt_count and verifies entries
using dense indices: idx = (j * gt_count) + i

The producer path currently computes idx using
gt->info.id. However, gt->info.id values
are not guaranteed to be contiguous.
For example, with gt_count=2 and IDs {0,3},
this formula produces indices beyond the
allocated range, causing mismatches and
potential out-of-bounds access.

Update the producer to map each GT to a dense
index in [0..gt_count-1] and compute:
    idx = (tx_dense * gt_count) + rx_dense

Additionally, introduce an event-based delay
in g2g_test_in_order() to ensure ordering
between sends.

v2: Add single helper function (Daniele)

v3: Modify comment (Daniele)

Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patch.msgid.link/20260129054722.2150674-1-pallavi.mishra@intel.com
2026-02-05 14:10:22 -08:00
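
The dense indexing described in 106340775a can be sketched as follows. The helper names dense_index() and g2g_slot() are hypothetical and the code is a userspace illustration of the formula idx = (tx_dense * gt_count) + rx_dense, not the KUnit test itself.

```c
#include <stddef.h>

/*
 * Map a (possibly sparse) GT id to its position in the list of present
 * GTs. Returns -1 if the id is not in the list.
 */
static int dense_index(const unsigned int *gt_ids, size_t gt_count,
		       unsigned int id)
{
	for (size_t i = 0; i < gt_count; i++)
		if (gt_ids[i] == id)
			return (int)i;
	return -1;
}

/*
 * Compute the slot in a compact gt_count x gt_count matrix for a
 * (sender, receiver) pair given by GT id. Returns -1 for unknown ids.
 */
static int g2g_slot(const unsigned int *gt_ids, size_t gt_count,
		    unsigned int tx_id, unsigned int rx_id)
{
	int tx = dense_index(gt_ids, gt_count, tx_id);
	int rx = dense_index(gt_ids, gt_count, rx_id);

	if (tx < 0 || rx < 0)
		return -1;
	return tx * (int)gt_count + rx;
}
```

For the example from the commit message (gt_count = 2, IDs {0, 3}), the pair (3, 3) maps to slot 3, the last entry of the 4-entry matrix. Using the raw IDs would instead give 3 * 2 + 3 = 9, which is outside the allocation.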
Nareshkumar Gollakoti
9b5e995e61 drm/xe: Mutual exclusivity between CCS-mode and PF
Due to the SLA between the PF and VFs, we currently block CCS mode
changes if the driver is running as a PF, even if no VFs are enabled
yet. Use the lockdown mechanism provided by the PF to relax that
limitation while still enforcing the VF-related requirements above.

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Nareshkumar Gollakoti <naresh.kumar.g@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260202170810.1393147-6-naresh.kumar.g@intel.com
2026-02-05 22:03:59 +01:00
Nareshkumar Gollakoti
4e8f602ac3 drm/xe: Prevent VFs from exposing the CCS mode sysfs file
Skip creating CCS sysfs files in VF mode to ensure VFs do not
try to change CCS mode, as it is predefined and immutable in
the SR-IOV mode.

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Nareshkumar Gollakoti <naresh.kumar.g@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260202170810.1393147-5-naresh.kumar.g@intel.com
2026-02-05 21:51:58 +01:00
Michal Wajdeczko
f59cde8a24 drm/xe/configfs: Fix 'parameter name omitted' errors
On some configs and old compilers we can get the following build errors:

  ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_mid_bb':
  ../drivers/gpu/drm/xe/xe_configfs.h:40:76: error: parameter name omitted
   static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class,
                                                                            ^~~~~~~~~~~~~~~~~~~~
  ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_post_bb':
  ../drivers/gpu/drm/xe/xe_configfs.h:42:77: error: parameter name omitted
   static inline u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev, enum xe_engine_class,
                                                                             ^~~~~~~~~~~~~~~~~~~~
when trying to define our configfs stub functions. Fix that.

Fixes: 7a4756b2fd ("drm/xe/lrc: Allow to add user commands mid context switch")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260203193745.576-1-michal.wajdeczko@intel.com
2026-02-05 21:21:46 +01:00
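
The commit message for f59cde8a24 does not show the fix itself; presumably it gives the stub parameters names. A minimal sketch of that pattern, with a hypothetical enum and a hypothetical two-parameter stub (get_bb_stub() is not the driver's function and does not reproduce its full signature):

```c
#include <stdint.h>

struct pci_dev;	/* opaque; only a pointer to it is passed around */

/* Hypothetical enum standing in for the driver's engine class type. */
enum engine_class {
	ENGINE_CLASS_RENDER,
	ENGINE_CLASS_COPY,
};

/*
 * Older compilers reject a function definition such as
 *
 *   static inline uint32_t get_bb_stub(struct pci_dev *pdev,
 *                                      enum engine_class)
 *
 * with "parameter name omitted". Naming every parameter of the
 * definition avoids the error.
 */
static inline uint32_t get_bb_stub(struct pci_dev *pdev,
				   enum engine_class cls)
{
	(void)pdev;	/* unused in the stub */
	(void)cls;	/* unused in the stub */
	return 0;	/* stub: nothing configured */
}
```

Unnamed parameters are fine in a declaration (prototype), which is why the problem only shows up for static inline stubs that carry a body.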
Michal Wajdeczko
18443ff225 drm/xe: Drop unnecessary include from xe_tile.h
We don't need to include xe_device_types.h there.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patch.msgid.link/20260203211240.745-5-michal.wajdeczko@intel.com
2026-02-05 21:16:20 +01:00
Michal Wajdeczko
e7002e0eb4 drm/xe: Promote struct xe_tile definition to own file
We already have separate .c and .h files for xe_tile functions,
time to introduce _types.h to follow what other components do.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260203211240.745-4-michal.wajdeczko@intel.com
2026-02-05 21:15:34 +01:00
Michal Wajdeczko
ed61c18617 drm/xe: Promote struct xe_mmio definition to own file
We already have separate .c and .h files for xe_mmio functions,
time to introduce _types.h to follow what other components do.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com> #v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260203211240.745-3-michal.wajdeczko@intel.com
2026-02-05 21:14:23 +01:00
Michal Wajdeczko
8965e00883 drm/xe: Move xe_root_tile_mmio() to xe_device.h
It seems to be a better place for this helper function, where
we already have other 'root' oriented helpers.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260203211240.745-2-michal.wajdeczko@intel.com
2026-02-05 21:08:28 +01:00
Michal Wajdeczko
98b16727f0 drm/xe/pf: Fix sysfs initialization
In case of devm_add_action_or_reset() failure the provided cleanup
action will be run immediately on the not yet initialized kobject.
This may lead to errors like:

 [ ] kobject: '(null)' (ff110001393608e0): is not initialized, yet kobject_put() is being called.
 [ ] WARNING: lib/kobject.c:734 at kobject_put+0xd9/0x250, CPU#0: kworker/0:0/9
 [ ] RIP: 0010:kobject_put+0xdf/0x250
 [ ] Call Trace:
 [ ]  xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
 [ ]  xe_sriov_pf_init_late+0x87/0x2b0 [xe]
 [ ]  xe_sriov_init_late+0x5f/0x2c0 [xe]
 [ ]  xe_device_probe+0x5f2/0xc20 [xe]
 [ ]  xe_pci_probe+0x396/0x610 [xe]
 [ ]  local_pci_probe+0x47/0xb0

 [ ] refcount_t: underflow; use-after-free.
 [ ] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x68/0xb0, CPU#0: kworker/0:0/9
 [ ] RIP: 0010:refcount_warn_saturate+0x68/0xb0
 [ ] Call Trace:
 [ ]  kobject_put+0x174/0x250
 [ ]  xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
 [ ]  xe_sriov_pf_init_late+0x87/0x2b0 [xe]
 [ ]  xe_sriov_init_late+0x5f/0x2c0 [xe]
 [ ]  xe_device_probe+0x5f2/0xc20 [xe]
 [ ]  xe_pci_probe+0x396/0x610 [xe]
 [ ]  local_pci_probe+0x47/0xb0

Fix that by calling kobject_init() and kobject_add() separately
and register cleanup action after the kobject is initialized.

Also make this cleanup registration a part of the create helper to
fix another mistake, as in the loop we were wrongly passing parent
kobject while registering cleanup action, and this resulted in some
undetected leaks.

Fixes: 5c170a4d9c ("drm/xe/pf: Prepare sysfs for SR-IOV admin attributes")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260203235332.1350-1-michal.wajdeczko@intel.com
2026-02-05 20:57:13 +01:00
Matt Roper
f27e644220 drm/xe: Drop unnecessary goto in xe_device_create
The error label in this function just does an immediate return without
any further cleanup or processing.  Replace the goto statements with
returns.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260204191025.3957211-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-05 08:59:24 -08:00
Matthew Brost
ec49857ad1 drm/gpusvm: Allow device pages to be mapped in mixed mappings after system pages
The current code rejects device mappings whenever system pages have
already been encountered. This is not the intended behavior when
allow_mixed is set.

Relax the restriction by permitting a single pagemap to be selected when
allow_mixed is enabled, even if system pages were found earlier.

Fixes: bce13d6ecd ("drm/gpusvm, drm/xe: Allow mixed mappings for userptr")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patch.msgid.link/20260130194928.3255613-3-matthew.brost@intel.com
2026-02-04 11:22:51 -08:00
Matthew Brost
556dba9547 drm/gpusvm: Force unmapping on error in drm_gpusvm_get_pages
drm_gpusvm_get_pages() only sets the local flags prior to committing the
pages. If an error occurs mid-mapping, has_dma_mapping will be clear,
causing the unmap function to skip unmapping pages that were
successfully mapped before the error. Fix this by forcibly setting
has_dma_mapping in the error path to ensure all previously mapped pages
are properly unmapped.

Fixes: 99624bdff8 ("drm/gpusvm: Add support for GPU Shared Virtual Memory")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patch.msgid.link/20260130194928.3255613-2-matthew.brost@intel.com
2026-02-04 11:22:49 -08:00
Karthik Poosa
39125eaf88 drm/xe/pm: Disable D3Cold for BMG only on specific platforms
Restrict D3Cold disablement for BMG to unsupported NUC platforms,
instead of disabling it on all platforms.

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: 3e331a6715 ("drm/xe/pm: Temporarily disable D3Cold on BMG")
Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-03 15:01:39 -05:00
Michal Wajdeczko
34ef561a0d drm/xe/configfs: Add sriov.admin_only_pf attribute
Instead of relying on a fixed relation to the display probe flag,
add a configfs attribute to allow an administrator to configure the
desired PF operation mode in a more flexible way.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-6-michal.wajdeczko@intel.com
2026-02-03 12:03:50 +01:00
Michal Wajdeczko
10f817c256 drm/xe/pf: Define admin_only as real flag
Instead of guessing each time at runtime, set the admin_only flag
once during the PF's initialization.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260127210501.794-1-michal.wajdeczko@intel.com
2026-02-03 12:02:44 +01:00
Michal Wajdeczko
0dfc7306b9 drm/xe/configfs: Always return consistent max_vfs value
The max_vfs parameter used by the Xe driver has its default value
definition, but it could be altered by the module parameter or by
the device specific configfs attribute.

To avoid mistakes or code duplication, always rely on the configfs
helper (or stub), which will provide necessary fallback if needed.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-4-michal.wajdeczko@intel.com
2026-02-03 12:02:05 +01:00
Michal Wajdeczko
56dfa9fc39 drm/xe/configfs: Use proper notation for local include
For local includes we should use "" notation, not <>.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-3-michal.wajdeczko@intel.com
2026-02-03 12:01:18 +01:00
Michal Wajdeczko
44f44d43f9 drm/xe: Keep all defaults in single header
We already have most of Xe defaults defined in xe_module.c,
where we use them for the modparam initializations, but some
were defined elsewhere, which breaks the consistency.

Introduce the xe_defaults.h file, which will act as a placeholder
for all our default values and can be used from other places.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260121214218.2817-2-michal.wajdeczko@intel.com
2026-02-03 11:58:26 +01:00
Marco Crivellari
0bc2c2e1a3 drm/xe: add WQ_PERCPU to alloc_workqueue users
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

   commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
   commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

The refactoring is going to alter the default behavior of
alloc_workqueue() to be unbound by default.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU. For more details see the Link tag below.

In order to keep alloc_workqueue() behavior identical, explicitly request
WQ_PERCPU.

Link: https://lore.kernel.org/all/20250221112
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260202103756.62138-3-marco.crivellari@suse.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-02 19:18:09 -05:00
Marco Crivellari
fa171b805f drm/xe: replace use of system_unbound_wq with system_dfl_wq
This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:

   commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
   commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.

Before that can happen, workqueue users must be converted to the better named
new workqueues with no intended behaviour changes:

   system_wq -> system_percpu_wq
   system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.

Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260202103756.62138-2-marco.crivellari@suse.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-02 19:18:09 -05:00
Michal Wajdeczko
65b9886062 drm/xe/guc: Allow second H2G retry on FLR
During VF FLR the scratch registers could be cleared both by the
GuC and by the PF driver. Allow more retries once we find out that
the HXG header was cleared, and wait at least 256ms before resending
the same message to the GuC.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-7-michal.wajdeczko@intel.com
2026-02-02 22:35:46 +01:00
Michal Wajdeczko
e116fd5c60 drm/xe/guc: Wait before retrying sending H2G
We shall resend the H2G message after receiving a NO_RESPONSE_RETRY reply,
but since GuC dropped that H2G due to some interim state, we should
give it a little time to stabilize. Wait before sending the same H2G
again: start with a 1ms delay, then increase exponentially up to 256ms.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-6-michal.wajdeczko@intel.com
2026-02-02 22:35:45 +01:00
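
The retry delay described in e116fd5c60 (start at 1ms, double each time, stop growing at 256ms) can be sketched as a small helper. next_delay_ms() is a hypothetical name, and this is only an illustration of capped exponential backoff, not the helper the driver uses.

```c
/*
 * Return the delay to use after cur_ms: double it, but never exceed
 * max_ms. A current delay of 0 is treated as "not started yet" and
 * yields 1ms.
 */
static unsigned int next_delay_ms(unsigned int cur_ms, unsigned int max_ms)
{
	if (cur_ms == 0)
		return 1;
	if (cur_ms >= max_ms / 2)
		return max_ms;	/* doubling would reach or pass the cap */
	return cur_ms * 2;
}
```

Starting from 1ms with a 256ms cap, the sequence is 1, 2, 4, 8, 16, 32, 64, 128, 256, and every later retry stays at 256ms. Comparing against max_ms / 2 before doubling keeps the multiplication from overflowing for large caps.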
Michal Wajdeczko
09b45fd9d3 drm/xe/guc: Drop redundant register read
xe_mmio_wait32() already returns the last value of the register
for which we were waiting, so there is no need to read it again.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-5-michal.wajdeczko@intel.com
2026-02-02 22:35:44 +01:00
Michal Wajdeczko
943c4d0637 drm/xe/guc: Limit sleep while waiting for H2G credits
Instead of endlessly increasing the sleep timeout while waiting
for the H2G credits, use exponential increase only up to the given
limit, like it was initially done in the GuC submission code.

While here, fix the actual timeout to be 1s, as documented.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-4-michal.wajdeczko@intel.com
2026-02-02 22:35:43 +01:00
Michal Wajdeczko
eec43f3684 drm/xe: Move exponential sleep logic to helper
We want to reuse the same increased sleep logic in other places.
To avoid code duplication, move it to the helper.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-3-michal.wajdeczko@intel.com
2026-02-02 22:35:42 +01:00
Michal Wajdeczko
94a2ceb190 drm/xe: Promote relaxed_ms_sleep
We want to have a single place with sleep related helpers for better
code reuse. Create xe_sleep.h and move relaxed_ms_sleep() there.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260127193727.601-2-michal.wajdeczko@intel.com
2026-02-02 22:35:41 +01:00
Michal Wajdeczko
316b05ae7e drm/xe/pf: Simplify IS_SRIOV_PF macro
Instead of having two variants of the IS_SRIOV_PF macro, move the
CONFIG_PCI_IOV check to the xe_device_is_sriov_pf() function and
let the compiler optimize that. This will help us drop poor man's
type check of the macro parameter that fails on const xe pointer.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260128222714.3056-1-michal.wajdeczko@intel.com
2026-02-02 22:22:57 +01:00
Shuicheng Lin
9f9c117ac5 drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
Correct the function name in the kerneldoc.
This fixes the following warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for
xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep()
instead"

Fixes: 15366239e2 ("drm/xe: Decouple TLB invalidations from GT")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com
2026-02-02 22:05:00 +01:00
Shuicheng Lin
0651dbb9d6 drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
Correct the function name in the kerneldoc.
This fixes the following warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for
xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early()
instead"

v2: add () for the function. (Michal)

Fixes: db16f9d90c ("drm/xe: Split TLB invalidation code in frontend and backend")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com
2026-02-02 22:04:58 +01:00
Shuicheng Lin
9fd8da7179 drm/xe: Fix kerneldoc for xe_migrate_exec_queue
Correct the function name in the kerneldoc.
This fixes the following warning:
"Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for
xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue()
instead"

Fixes: 916ee4704a ("drm/xe/vf: Register CCS read/write contexts with Guc")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com
2026-02-02 22:04:55 +01:00
Shuicheng Lin
c2a6859138 drm/xe/query: Fix topology query pointer advance
The topology query helper advanced the user pointer by the size
of the pointer, not the size of the structure. This can misalign
the output blob and corrupt the following mask. Fix the increment
to use sizeof(*topo).
There is no issue currently, as sizeof(*topo) happens to be equal
to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes).
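
The difference between the two increments can be sketched in isolation. This is a minimal illustration, not the driver code: struct topo_hdr and both helper names are hypothetical, with the layout chosen only so that the structure is 8 bytes, as the message states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the topology record: 8-byte header, then mask bytes. */
struct topo_hdr {
	uint16_t gt_id;
	uint16_t type;
	uint32_t num_bytes;
	uint8_t mask[];
};

/* Buggy form: advances by the size of the pointer variable itself. */
static size_t advance_by_pointer_size(size_t offset, const struct topo_hdr *topo)
{
	return offset + sizeof(topo);
}

/* Fixed form: advances by the size of the structure the pointer refers to. */
static size_t advance_by_struct_size(size_t offset, const struct topo_hdr *topo)
{
	return offset + sizeof(*topo);
}
```

On a 64-bit build both helpers return the same offset, which is why the bug was latent. On a build with 4-byte pointers the first helper would advance by only 4 bytes.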

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-02 10:09:48 -08:00
Xin Wang
568d9d0d83 drm/xe: use entry_dump callbacks for xe2+ PAT dumps
Move xe2+ PAT entry printing into the entry_dump op so platform
specific logic stays localized, simplifying future maintenance.
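
The general shape of a per-platform entry_dump callback can be sketched as follows. This is an assumption-laden illustration, not the driver's code: struct pat_ops, the callback signature, the output format, and the -1 error value are all hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical ops table: each platform supplies its own entry formatter. */
struct pat_ops {
	/* Formats one PAT entry into buf and returns the number of characters written. */
	int (*entry_dump)(char *buf, size_t len, unsigned int index, uint32_t value);
};

/* Hypothetical formatter for one platform family. */
static int xe2_style_entry_dump(char *buf, size_t len, unsigned int index, uint32_t value)
{
	return snprintf(buf, len, "PAT[%2u] = 0x%08x", index, (unsigned int)value);
}

static const struct pat_ops xe2_style_ops = {
	.entry_dump = xe2_style_entry_dump,
};

/* Common dump path: platform differences stay behind the callback. */
static int dump_entry(const struct pat_ops *ops, char *buf, size_t len,
		      unsigned int index, uint32_t value)
{
	if (!ops || !ops->entry_dump)
		return -1; /* hypothetical error value for a missing callback */
	return ops->entry_dump(buf, len, index, value);
}
```

With this layout, a shared dump loop only needs a different ops table for a new platform, not a new dump function.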

v2:
 - Do not null xe->pat.ops for VFs.
 - Skip PAT init and dump on VFs (-EOPNOTSUPP), avoiding NULL ops use.

v3:
 - fixed typo

v4: (Matt)
 - Switch xe2_dump() to use the new ops->entry_dump() vfunc.
 - Remove xe3p_xpc_dump() and reuse the common xe2_dump() for Xe3p XPC.
 - This also fixes Xe3p_HPM media PAT dumping by using the proper
   non-MCR access for the PAT register range (bspec 76445).

Cc: Matt Roper <matthew.d.roper@intel.com>
Suggested-by: Brian Nguyen <brian3.nguyen@intel.com>
Signed-off-by: Xin Wang <x.wang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260130175349.2249033-1-x.wang@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-02-02 10:03:43 -08:00
Chaitanya Kumar Borah
f89dbe14a0 drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header
The GuC scheduler ABI header contains a file-level comment that is not
intended to document a kernel-doc symbol. Using kernel-doc comment
syntax (/** */) triggers kernel-doc warnings.

With "-Werror", this causes the build to fail. Convert the comment to a
regular block comment.

HDRTEST drivers/gpu/drm/xe/abi/guc_scheduler_abi.h
Warning: drivers/gpu/drm/xe/abi/guc_scheduler_abi.h:11 This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst
 * Generic defines required for registration with and submissions to the GuC
1 warnings as errors
make[6]: *** [drivers/gpu/drm/xe/Makefile:377: drivers/gpu/drm/xe/abi/guc_scheduler_abi.hdrtest] Error 3
make[5]: *** [scripts/Makefile.build:544: drivers/gpu/drm/xe] Error 2
make[4]: *** [scripts/Makefile.build:544: drivers/gpu/drm] Error 2
make[3]: *** [scripts/Makefile.build:544: drivers/gpu] Error 2
make[2]: *** [scripts/Makefile.build:544: drivers] Error 2
make[1]: *** [/home/kbuild2/kernel/Makefile:2088: .] Error 2
make: *** [Makefile:248: __sub-make] Error 2
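
The distinction between the two comment styles can be sketched as follows. The macro and function here are hypothetical and exist only to give the kernel-doc comment a symbol to document.

```c
/*
 * Regular block comment: file-level notes that do not document a symbol.
 * It opens with a single asterisk, so kernel-doc does not try to parse it.
 */

#define EXAMPLE_SCHED_PRIORITY_MAX 3 /* hypothetical constant */

/**
 * example_clamp_priority() - Clamp a priority to the supported range.
 * @prio: requested priority
 *
 * Return: @prio limited to EXAMPLE_SCHED_PRIORITY_MAX.
 */
static unsigned int example_clamp_priority(unsigned int prio)
{
	return prio > EXAMPLE_SCHED_PRIORITY_MAX ? EXAMPLE_SCHED_PRIORITY_MAX : prio;
}
```

Only the second comment opens with the kernel-doc marker, and it names the symbol it documents on its first line.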

v2:
 - Add Fixes tag (Daniele)

Fixes: b0c5cf4f59 ("drm/gt/guc: extract scheduler-related defines from guc_fwif.h")
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260130135210.2659200-1-chaitanya.kumar.borah@intel.com
2026-02-02 09:13:31 -08:00
Daniele Ceraolo Spurio
dd8ea2f2ab drm/xe/guc: Fix CFI violation in debugfs access.
xe_guc_print_info is void-returning, but the function pointer it is
assigned to expects an int-returning function, leading to the following
CFI error:

[  206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe]
(target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a)

Fix this by updating xe_guc_print_info to return an integer.
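
A minimal sketch of the pattern, assuming hypothetical names throughout (struct printer, print_fn, print_info, and show are not the driver's identifiers):

```c
/* Hypothetical output sink that counts how many lines were printed. */
struct printer {
	unsigned int lines;
};

/* The show path calls through a pointer to an int-returning function. */
typedef int (*print_fn)(struct printer *p);

/*
 * The callee's type matches print_fn exactly. A void-returning function
 * reached through this pointer type would have a different type signature,
 * which is what a type-based CFI check rejects at the indirect call.
 */
static int print_info(struct printer *p)
{
	p->lines++;
	return 0;
}

/* Indirect call site: the callee type must match the pointer type. */
static int show(print_fn fn, struct printer *p)
{
	return fn(p);
}
```

Returning 0 from the print function keeps the callee and the pointer type identical, so no cast is needed at the assignment.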

Fixes: e15826bb3c ("drm/xe/guc: Refactor GuC debugfs initialization")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: George D Sworo <george.d.sworo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com
2026-02-02 09:12:56 -08:00
Balasubramani Vivekanandan
de96c43a69 drm/xe: Apply WA_16028005424 to Media
Apply WA_16028005424 to the following IPs:
Xe2_LPM, Xe2_HPM, Xe3_LPM, Xe3p_LPM

While doing this, move the same WA defined for Xe3_LPG under the comment
for Xe3_LPG. It was wrongly placed under Xe3_LPM.

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260128062911.1456539-2-balasubramani.vivekanandan@intel.com
2026-02-02 15:03:35 +05:30
Michal Wajdeczko
b47239bc30 drm/xe/pf: Fix typo in function kernel-doc
The function name is missing an underscore, which results in:

  Warning: ../drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c:1261
  This comment starts with '/**', but isn't a kernel-doc comment.
  Refer to Documentation/doc-guide/kernel-doc.rst
  * xe_gt_sriov_pf_control_trigger restore_vf() - Start ...

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20251217150702.2669-1-michal.wajdeczko@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-01-30 16:08:16 -05:00