Commit Graph

120460 Commits

Author SHA1 Message Date
Dave Airlie
f66ac60dee Merge tag 'drm-xe-fixes-2025-12-19' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Limit num_syncs to prevent oversized kernel allocations (Shuicheng)
- Disallow 0 OA property values (Ashutosh)
- Disallow 0 EU stall property values (Ashutosh)

Driver Changes:
- Fix kobject leak (Shuicheng)
- Workaround (Vinay)
- Loop variable reference fix (Matt Brost)
- Fix a CONFIG corner-case incorrect number of arguments (Arnd Bergmann)
- Skip reason prefix while emitting array (Raag)
- VF migration fix (Tomasz)
- Fix context in mei interrupt top half (Junxiao)
- Don't include the CCS metadata in the dma-buf sg-table (Thomas)
- VF queueing recovery work fix (Satyanarayana)
- Increase TDF timeout (Jagmeet)
- GT reset registers vs scheduler ordering fix (Jan)
- Adjust long-running workload timeslices (Matt Brost)
- Always set OA_OAGLBCTXCTRL_COUNTER_RESUME (Ashutosh)
- Fix a return value (Dan Carpenter)
- Drop preempt-fences when destroying imported dma-bufs (Thomas)
- Use usleep_range for accurate long-running workload timeslicing (Matthew)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/aUSMlQ4iruzm0NQR@fedora
2025-12-19 10:56:13 +10:00
Dave Airlie
77de4a273d Merge tag 'drm-misc-fixes-2025-12-18' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc2:
- Add -EDEADLK handling in drm unit tests.
- Plug DRM_IOCTL_GEM_CHANGE_HANDLE leak.
- Fix regression in sony-td4353-jdi.
- Kconfig fix for visionox-rm69299.
- Do not load amdxdna when running virtualized.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/21861d1b-54bf-4853-9c35-97abe3c5deba@linux.intel.com
2025-12-19 07:38:18 +10:00
Matthew Brost
80f9c601d9 drm/xe: Use usleep_range for accurate long-running workload timeslicing
msleep is not very accurate in terms of how long it actually sleeps,
whereas usleep_range is precise. Replace the timeslice sleep for
long-running workloads with the more accurate usleep_range to avoid
jitter if the sleep period is less than 20ms.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251212182847.1683222-3-matthew.brost@intel.com
(cherry picked from commit ca415c4d4c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:25:42 +01:00
Thomas Hellström
fe3ccd2413 drm/xe: Drop preempt-fences when destroying imported dma-bufs.
When imported dma-bufs are destroyed, TTM is not fully
individualizing the dma-resv, but it *is* copying the fences that
need to be waited for before declaring idle. So in the case where
the bo->resv != bo->_resv we can still drop the preempt-fences, but
make sure we do that on bo->_resv which contains the fence-pointer
copy.

In the case where the copying fails, bo->_resv will typically not
contain any fences pointers at all, so there will be nothing to
drop. In that case, TTM would have ensured all fences that would
have been copied are signaled, including any remaining preempt
fences.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Fixes: fa0af721bd ("drm/ttm: test private resv obj on release/destroy")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.16+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251217093441.5073-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 425fe550fb)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:12:16 +01:00
Ashutosh Dixit
3767ca4166 drm/xe/eustall: Disallow 0 EU stall property values
An EU stall property value of 0 is invalid and will cause a NPD.

Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6453
Fixes: 1537ec85eb ("drm/xe/uapi: Introduce API for EU stall sampling")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patch.msgid.link/20251212061850.1565459-4-ashutosh.dixit@intel.com
(cherry picked from commit 5bf763e908)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:12:09 +01:00
Ashutosh Dixit
3595114bc3 drm/xe/oa: Disallow 0 OA property values
An OA property value of 0 is invalid and will cause a NPD.

Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6452
Fixes: cc4e6994d5 ("drm/xe/oa: Move functions up so they can be reused for config ioctl")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patch.msgid.link/20251212061850.1565459-3-ashutosh.dixit@intel.com
(cherry picked from commit 7a100e6ddc)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:11:45 +01:00
Dan Carpenter
eb192bedf5 drm/xe/xe_sriov_vfio: Fix return value in xe_sriov_vfio_migration_supported()
The xe_sriov_vfio_migration_supported() function is type bool so
returning -EPERM means returning true.  Return false instead.

Fixes: bd45d46ffc ("drm/xe/pf: Export helpers for VFIO")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/aTLEZ4g-FD-iMQ2V@stanley.mountain
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
(cherry picked from commit 0a2404c8f6)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:11:32 +01:00
Ashutosh Dixit
256edb267a drm/xe/oa: Always set OAG_OAGLBCTXCTRL_COUNTER_RESUME
Reports can be written out to the OA buffer using ways other than periodic
sampling. These include mmio trigger and context switches. To support these
use cases, when periodic sampling is not enabled,
OAG_OAGLBCTXCTRL_COUNTER_RESUME must be set.

Fixes: 1db9a9dc90 ("drm/xe/oa: OA stream initialization (OAG)")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patch.msgid.link/20251205212613.826224-4-ashutosh.dixit@intel.com
(cherry picked from commit 88d98e74ad)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:11:22 +01:00
Matthew Brost
6f0f404bd2 drm/xe: Adjust long-running workload timeslices to reasonable values
A 10ms timeslice for long-running workloads is far too long and causes
significant jitter in benchmarks when the system is shared. Adjust the
value to 5ms for preempt-fencing VMs, as the resume step there is quite
costly as memory is moved around, and set it to zero for pagefault VMs,
since switching back to pagefault mode after dma-fence mode is
relatively fast.

Also change min_run_period_ms to 'unsiged int' type rather than 's64' as
only positive values make sense.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251212182847.1683222-2-matthew.brost@intel.com
(cherry picked from commit 33a5abd9a6)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:11:04 +01:00
Shuicheng Lin
f8dd66bfb4 drm/xe/oa: Limit num_syncs to prevent oversized allocations
The OA open parameters did not validate num_syncs, allowing
userspace to pass arbitrarily large values, potentially
leading to excessive allocations.

Add check to ensure that num_syncs does not exceed DRM_XE_MAX_SYNCS,
returning -EINVAL when the limit is violated.

v2: use XE_IOCTL_DBG() and drop duplicated check. (Ashutosh)

Fixes: c8507a25ce ("drm/xe/oa/uapi: Define and parse OA sync properties")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251205234715.2476561-6-shuicheng.lin@intel.com
(cherry picked from commit e057b2d2b8)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:10:50 +01:00
Shuicheng Lin
8e46130400 drm/xe: Limit num_syncs to prevent oversized allocations
The exec and vm_bind ioctl allow userspace to specify an arbitrary
num_syncs value. Without bounds checking, a very large num_syncs
can force an excessively large allocation, leading to kernel warnings
from the page allocator as below.

Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request
exceeding this limit.

"
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1217 at mm/page_alloc.c:5124 __alloc_frozen_pages_noprof+0x2f8/0x2180 mm/page_alloc.c:5124
...
Call Trace:
 <TASK>
 alloc_pages_mpol+0xe4/0x330 mm/mempolicy.c:2416
 ___kmalloc_large_node+0xd8/0x110 mm/slub.c:4317
 __kmalloc_large_node_noprof+0x18/0xe0 mm/slub.c:4348
 __do_kmalloc_node mm/slub.c:4364 [inline]
 __kmalloc_noprof+0x3d4/0x4b0 mm/slub.c:4388
 kmalloc_noprof include/linux/slab.h:909 [inline]
 kmalloc_array_noprof include/linux/slab.h:948 [inline]
 xe_exec_ioctl+0xa47/0x1e70 drivers/gpu/drm/xe/xe_exec.c:158
 drm_ioctl_kernel+0x1f1/0x3e0 drivers/gpu/drm/drm_ioctl.c:797
 drm_ioctl+0x5e7/0xc50 drivers/gpu/drm/drm_ioctl.c:894
 xe_drm_ioctl+0x10b/0x170 drivers/gpu/drm/xe/xe_device.c:224
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:598 [inline]
 __se_sys_ioctl fs/ioctl.c:584 [inline]
 __x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:584
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xbb/0x380 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
"

v2: Add "Reported-by" and Cc stable kernels.
v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh)
v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt)
v5: Do the check at the top of the exec func. (Matt)

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Reported-by: Koen Koning <koen.koning@intel.com>
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450
Cc: <stable@vger.kernel.org> # v6.12+
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Ivan Briano <ivan.briano@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com
(cherry picked from commit b07bac9bd7)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-18 18:10:34 +01:00
Brian Kocoloski
969faea4e9 drm/amdkfd: Fix improper NULL termination of queue restore SMI event string
Pass character "0" rather than NULL terminator to properly format
queue restoration SMI events. Currently, the NULL terminator precedes
the newline character that is intended to delineate separate events
in the SMI event buffer, which can break userspace parsers.

Signed-off-by: Brian Kocoloski <brian.kocoloski@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6e7143e5e6)
2025-12-16 14:17:32 -05:00
mythilam
7a372e214f drm/amd/pm: restore SCLK settings after S0ix resume
User-configured SCLK(GPU core clock)frequencies were not persisting
across S0ix suspend/resume cycles on smu v14 hardware.
The issue occurred because of the code resetting clock frequency
to zero during resume.

This patch addresses the problem by:
- Preserving user-configured values in driver and sets the
  clock frequency across resume
- Preserved settings are sent to the hardware during resume

Signed-off-by: mythilam <mythilam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 20ba98326f)
2025-12-16 14:16:48 -05:00
Alex Deucher
77f7325301 drm/amdgpu: fix a job->pasid access race in gpu recovery
Avoid a possible UAF in GPU recovery due to a race between
the sched timeout callback and the tdr work queue.

The gpu recovery function calls drm_sched_stop() and
later drm_sched_start().  drm_sched_start() restarts
the tdr queue which will eventually free the job.  If
the tdr queue frees the job before time out callback
completes, the job will be freed and we'll get a UAF
when accessing the pasid.  Cache it early to avoid the
UAF.

Example KASAN trace:
[  493.058141] BUG: KASAN: slab-use-after-free in amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[  493.067530] Read of size 4 at addr ffff88b0ce3f794c by task kworker/u128:1/323
[  493.074892]
[  493.076485] CPU: 9 UID: 0 PID: 323 Comm: kworker/u128:1 Tainted: G            E       6.16.0-1289896.2.zuul.bf4f11df81c1410bbe901c4373305a31 #1 PREEMPT(voluntary)
[  493.076493] Tainted: [E]=UNSIGNED_MODULE
[  493.076495] Hardware name: TYAN B8021G88V2HR-2T/S8021GM2NR-2T, BIOS V1.03.B10 04/01/2019
[  493.076500] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[  493.076512] Call Trace:
[  493.076515]  <TASK>
[  493.076518]  dump_stack_lvl+0x64/0x80
[  493.076529]  print_report+0xce/0x630
[  493.076536]  ? _raw_spin_lock_irqsave+0x86/0xd0
[  493.076541]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[  493.076545]  ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[  493.077253]  kasan_report+0xb8/0xf0
[  493.077258]  ? amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[  493.077965]  amdgpu_device_gpu_recover+0x968/0x990 [amdgpu]
[  493.078672]  ? __pfx_amdgpu_device_gpu_recover+0x10/0x10 [amdgpu]
[  493.079378]  ? amdgpu_coredump+0x1fd/0x4c0 [amdgpu]
[  493.080111]  amdgpu_job_timedout+0x642/0x1400 [amdgpu]
[  493.080903]  ? pick_task_fair+0x24e/0x330
[  493.080910]  ? __pfx_amdgpu_job_timedout+0x10/0x10 [amdgpu]
[  493.081702]  ? _raw_spin_lock+0x75/0xc0
[  493.081708]  ? __pfx__raw_spin_lock+0x10/0x10
[  493.081712]  drm_sched_job_timedout+0x1b0/0x4b0 [gpu_sched]
[  493.081721]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[  493.081725]  process_one_work+0x679/0xff0
[  493.081732]  worker_thread+0x6ce/0xfd0
[  493.081736]  ? __pfx_worker_thread+0x10/0x10
[  493.081739]  kthread+0x376/0x730
[  493.081744]  ? __pfx_kthread+0x10/0x10
[  493.081748]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[  493.081751]  ? __pfx_kthread+0x10/0x10
[  493.081755]  ret_from_fork+0x247/0x330
[  493.081761]  ? __pfx_kthread+0x10/0x10
[  493.081764]  ret_from_fork_asm+0x1a/0x30
[  493.081771]  </TASK>

Fixes: a72002cb18 ("drm/amdgpu: Make use of drm_wedge_task_info")
Link: https://github.com/HansKristian-Work/vkd3d-proton/pull/2670
Cc: SRINIVASAN.SHANMUGAM@amd.com
Cc: vitaly.prosyak@amd.com
Cc: christian.koenig@amd.com
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 20880a3fd5)
2025-12-16 14:16:08 -05:00
Charlene Liu
3886b198bd drm/amd/display: Fix DP no audio issue
[why]
need to enable APG_CLOCK_ENABLE enable first
also need to wake up az from D3 before access az block

Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bf5e396957)
2025-12-16 14:14:34 -05:00
Ray Wu
fd62aa13d3 drm/amd/display: Fix scratch registers offsets for DCN351
[Why]
Different platforms use different NBIO header files,
causing display code to use differnt offset and read
wrong accelerated status.

[How]
- Unified NBIO offset header file across platform.
- Correct scratch registers offsets to proper locations.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 576e032e90)
Cc: stable@vger.kernel.org
2025-12-16 14:12:16 -05:00
Ray Wu
69741d9ccc drm/amd/display: Fix scratch registers offsets for DCN35
[Why]
Different platforms use differnet NBIO header files,
causing display code to use differnt offset and read
wrong accelerated status.

[How]
- Unified NBIO offset header file across platform.
- Correct scratch registers offsets to proper locations.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4667
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 49a63bc8ed)
Cc: stable@vger.kernel.org
2025-12-16 14:11:41 -05:00
Mario Limonciello (AMD)
244a07c486 drm/amd: Resume the device in thaw() callback when console suspend is disabled
If console suspend has been disabled using `no_console_suspend` also
wake up during thaw() so that some messages can be seen for debugging.

Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4191
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 63387cbbb7)
2025-12-16 14:08:30 -05:00
Guido Günther
2bfca4fe1f drm/panel: visionox-rm69299: Depend on BACKLIGHT_CLASS_DEVICE
We handle backlight so need that dependency.

Fixes: 7911d8cab5 ("drm/panel: visionox-rm69299: Add backlight support")
Reported-by: kernelci.org bot <bot@kernelci.org>
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20251017-visionox-rm69299-bl-v2-1-9dfa06606754@sigxcpu.org
2025-12-16 11:28:52 +01:00
Marijn Suijten
2b973ca48f drm/panel: sony-td4353-jdi: Enable prepare_prev_first
The DSI host must be enabled before our prepare function can run, which
has to send its init sequence over DSI.  Without enabling the host first
the panel will not probe.

Fixes: 9e15123eca ("drm/msm/dsi: Stop unconditionally powering up DSI hosts at modeset")
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/20251130-sony-akari-fix-panel-v1-1-1d27c60a55f5@somainline.org
2025-12-15 08:14:20 -08:00
Jan Maslak
eed5b815fa drm/xe: Restore engine registers before restarting schedulers after GT reset
During GT reset recovery in do_gt_restart(), xe_uc_start() was called
before xe_reg_sr_apply_mmio() restored engine-specific registers. This
created a race window where the scheduler could run jobs before hardware
state was fully restored.

This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-
waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE)
wasn't restored before jobs started executing. Breakpoints would fail to
trigger SIP entry because the debug enable bit wasn't set yet.

Fix by moving xe_uc_start() after all MMIO register restoration,
including engine registers and CCS mode configuration, ensuring all
hardware state is fully restored before any jobs can be scheduled.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Jan Maslak <jan.maslak@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
(cherry picked from commit 825aed0328)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:17:04 +01:00
Jagmeet Randhawa
eafb6f6209 drm/xe: Increase TDF timeout
There are some corner cases where flushing transient
data may take slightly longer than the 150us timeout
we currently allow.  Update the driver to use a 300us
timeout instead based on the latest guidance from
the hardware team. An update to the bspec to formally
document this is expected to arrive soon.

Fixes: c01c6066e6 ("drm/xe/device: implement transient flush")
Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit d69d3636f5)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:16:57 +01:00
Satyanarayana K V P
c770467d28 drm/xe/vf: Fix queuing of recovery work
Ensure VF migration recovery work is only queued when no recovery is
already queued and teardown is not in progress.

Fixes: b47c0c07c3 ("drm/xe/vf: Teardown VF post migration worker on driver unload")
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251210052546.622809-5-satyanarayana.k.v.p@intel.com
(cherry picked from commit 8d8cf42b03)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:16:48 +01:00
Thomas Hellström
449bcd5d45 drm/xe/bo: Don't include the CCS metadata in the dma-buf sg-table
Some Xe bos are allocated with extra backing-store for the CCS
metadata. It's never been the intention to share the CCS metadata
when exporting such bos as dma-buf. Don't include it in the
dma-buf sg-table.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251209204920.224374-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit a4ebfb9d95)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:16:39 +01:00
Junxiao Chang
17445af7dc drm/me/gsc: mei interrupt top half should be in irq disabled context
MEI GSC interrupt comes from i915 or xe driver. It has top half and
bottom half. Top half is called from i915/xe interrupt handler. It
should be in irq disabled context.

With RT kernel(PREEMPT_RT enabled), by default IRQ handler is in
threaded IRQ. MEI GSC top half might be in threaded IRQ context.
generic_handle_irq_safe API could be called from either IRQ or
process context, it disables local IRQ then calls MEI GSC interrupt
top half.

This change fixes B580 GPU boot issue with RT enabled.

Fixes: e02cea83d3 ("drm/xe/gsc: add Battlemage support")
Tested-by: Baoli Zhang <baoli.zhang@intel.com>
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251107033152.834960-1-junxiao.chang@intel.com
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
(cherry picked from commit 3efadf0287)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:47 +01:00
Tomasz Lis
61e6b711c3 drm/xe/vf: Stop waiting for ring space on VF post migration recovery
If wait for ring space started just before migration, it can delay
the recovery process, by waiting without bailout path for up to 2
seconds.

Two second wait for recovery is not acceptable, and if the ring was
completely filled even without the migration temporarily stopping
execution, then such a wait will result in up to a thousand new jobs
(assuming constant flow) being added while the wait is happening.

While this will not cause data corruption, it will lead to warning
messages getting logged due to reset being scheduled on a GT under
recovery. Also several seconds of unresponsiveness, as the backlog
of jobs gets progressively executed.

Add a bailout condition, to make sure the recovery starts without
much delay. The recovery is expected to finish in about 100 ms when
under moderate stress, so the condition verification period needs to be
below that - settling at 64 ms.

The theoretical max time which the recovery can take depends on how
many requests can be emitted to engine rings and be pending execution.
While stress testing, it was possible to reach 10k pending requests
on rings when a platform with two GTs was used. This resulted in max
recovery time of 5 seconds. But in real life situations, it is very
unlikely that the amount of pending requests will ever exceed 100,
and for that the recovery time will be around 50 ms - well within our
claimed limit of 100ms.

Fixes: a4dae94aad ("drm/xe/vf: Wakeup in GuC backend on VF post migration recovery")
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251204200820.2206168-1-tomasz.lis@intel.com
(cherry picked from commit a00e305fba)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:37 +01:00
Raag Jadav
17d52ab2a6 drm/xe/throttle: Skip reason prefix while emitting array
The newly introduced "reasons" attribute already signifies possible
reasons for throttling and makes the prefix in individual attribute
names redundant while emitting them as an array. Skip the prefix.

Fixes: 83ccde67a3 ("drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Sk Anirban <sk.anirban@intel.com>
Link: https://patch.msgid.link/20251203123355.571606-1-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit b64a14334e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:27 +01:00
Arnd Bergmann
9acc329581 drm/xe: fix drm_gpusvm_init() arguments
The Xe driver fails to build when CONFIG_DRM_XE_GPUSVM is disabled
but CONFIG_DRM_GPUSVM is turned on, due to the clash of two commits:

In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:8:
drivers/gpu/drm/xe/xe_svm.h: In function 'xe_svm_init':
include/linux/stddef.h:8:14: error: passing argument 5 of 'drm_gpusvm_init' makes integer from pointer without a cast [-Wint-conversion]
drivers/gpu/drm/xe/xe_svm.h:217:38: note: in expansion of macro 'NULL'
  217 |                                NULL, NULL, 0, 0, 0, NULL, NULL, 0);
      |                                      ^~~~
In file included from drivers/gpu/drm/xe/xe_bo_types.h:11,
                 from drivers/gpu/drm/xe/xe_bo.h:11,
                 from drivers/gpu/drm/xe/xe_vm_madvise.c:11:
include/drm/drm_gpusvm.h:254:35: note: expected 'long unsigned int' but argument is of type 'void *'
  254 |                     unsigned long mm_start, unsigned long mm_range,
      |                     ~~~~~~~~~~~~~~^~~~~~~~
In file included from drivers/gpu/drm/xe/xe_vm_madvise.c:14:
drivers/gpu/drm/xe/xe_svm.h:216:16: error: too many arguments to function 'drm_gpusvm_init'; expected 10, have 11
  216 |         return drm_gpusvm_init(&vm->svm.gpusvm, "Xe SVM (simple)", &vm->xe->drm,
      |                ^~~~~~~~~~~~~~~
  217 |                                NULL, NULL, 0, 0, 0, NULL, NULL, 0);
      |                                                                 ~
include/drm/drm_gpusvm.h:251:5: note: declared here

Adapt the caller to the new argument list by removing the extraneous
NULL argument.

Fixes: 9e97874148 ("drm/xe/userptr: replace xe_hmm with gpusvm")
Fixes: 10aa5c8060 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251204094704.1030933-1-arnd@kernel.org
(cherry picked from commit 29bce9c8b4)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:14:08 +01:00
Matthew Brost
224a6ac080 drm/xe: Do not reference loop variable directly
Do not reference the loop variable job after the loop has exited.
Instead, save the job from the last iteration of the loop.

Fixes: 3d98a7164d ("drm/xe/vf: Start re-emission from first unsignaled job during VF migration")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202511291102.jnnKP6IB-lkp@intel.com/
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20251203011809.968893-1-matthew.brost@intel.com
(cherry picked from commit 76ce231370)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:13:58 +01:00
Vinay Belgaumkar
c88a0731ed drm/xe: Apply Wa_14020316580 in xe_gt_idle_enable_pg()
Wa_14020316580 was getting clobbered by power gating init code
later in the driver load sequence. Move the Wa so that
it applies correctly.

Fixes: 7cd05ef89c ("drm/xe/xe2hpm: Add initial set of workarounds")
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20251129052548.70766-1-vinay.belgaumkar@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 8b55021453)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:13:48 +01:00
Shuicheng Lin
b32045d73b drm/xe: Fix freq kobject leak on sysfs_create_files failure
Ensure gt->freq is released when sysfs_create_files() fails
in xe_gt_freq_init(). Without this, the kobject would leak.
Add kobject_put() before returning the error.

Fixes: fdc81c43f0 ("drm/xe: use devm_add_action_or_reset() helper")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Alex Zuo <alex.zuo@intel.com>
Reviewed-by: Xin Wang <x.wang@intel.com>
Link: https://patch.msgid.link/20251114205638.2184529-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 251be5fb49)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-12-15 14:13:41 +01:00
Maarten Lankhorst
84318277d6 Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes
Pull in rc1 to include all changes since the merge window closed,
and grab all fixes and changes from drm/drm-next.

Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-12-15 12:53:27 +01:00
Linus Torvalds
a859eca0e4 Merge tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm fixes from Dave Airlie:
 "These are the enqueued fixes that ended up in our fixes branch,
  nouveau mostly, along with some small fixes in other places.

  plane:
   - Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties()

  ttm:
   - fix devcoredump for evicted bos

  panel:
   - Fix stack usage warning in novatek-nt35560

  nouveau:
   - alloc fwsec sb at boot to avoid s/r problems
   - fix strcpy usage
   - fix i2c encoder crash

  bridge:
   - Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83

  mgag200:
   - Fix bigendian handling in mgag200

  tilcdc:
   - Fix probe failure in tilcdc"

* tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
  drm/mgag200: Fix big-endian support
  drm/tilcdc: Fix removal actions in case of failed probe
  drm/ttm: Avoid NULL pointer deref for evicted BOs
  drm: nouveau: Replace sprintf() with sysfs_emit()
  drm/nouveau: fix circular dep oops from vendored i2c encoder
  drm/nouveau: refactor deprecated strcpy
  drm/plane: Fix IS_ERR() vs NULL check in drm_plane_create_hotspot_properties()
  drm/bridge: ti-sn65dsi83: ignore PLL_UNLOCK errors
  drm/nouveau/gsp: Allocate fwsec-sb at boot
  drm/panel: novatek-nt35560: avoid on-stack device structure
2025-12-13 17:39:28 +12:00
Linus Torvalds
237f1bbfe3 Merge tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "This is the weekly fixes for what is in next tree, mostly amdgpu and
  some i915, panthor and a core revert.

  core:
   - revert dumb bo 8 byte alignment

  amdgpu:
   - SI fix
   - DC reduce stack usage
   - HDMI fixes
   - VCN 4.0.5 fix
   - DP MST fix
   - DC memory allocation fix

  amdkfd:
   - SVM fix
   - Trap handler fix
   - VGPR fixes for GC 11.5

  i915:
   - Fix format string truncation warning
   - FIx runtime PM reference during fbdev BO creation

  panthor:
   - fix UAF

  renesas:
   - fix sync flag handling"

* tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
  Revert "drm/amd/display: Fix pbn to kbps Conversion"
  drm/amd: Fix unbind/rebind for VCN 4.0.5
  drm/i915: Fix format string truncation warning
  drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation
  drm/amd/display: Improve HDMI info retrieval
  drm/amdkfd: bump minimum vgpr size for gfx1151
  drm/amd/display: shrink struct members
  drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace
  drm/amd/display: Refactor dml_core_mode_support to reduce stack frame
  drm/amdgpu: don't attach the tlb fence for SI
  drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
  drm/amdkfd: Trap handler support for expert scheduling mode
  drm/amdkfd: Use huge page size to check split svm range alignment
  drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC
  drm/gem-shmem: revert the 8-byte alignment constraint
  drm/gem-dma: revert the 8-byte alignment constraint
  drm/panthor: Prevent potential UAF in group creation
2025-12-13 17:25:26 +12:00
Dave Airlie
5300831555 Merge tag 'drm-misc-fixes-2025-12-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc1:
- Fix stack usage warning in novatek-nt35560.
- Fix s/r, i2c issues in nouveau and update string handling.
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83.
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties().
- Fix devcoredump crash on reading evicted bo's.
- Fix bigendian handling in mgag200.
- Fix probe failure in tilcdc.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/6c371dc1-08bf-4a34-895c-9ef348b6061b@linux.intel.com
2025-12-13 10:54:29 +10:00
Karol Wachowski
630efee949 drm: Fix object leak in DRM_IOCTL_GEM_CHANGE_HANDLE
Add missing drm_gem_object_put() call when drm_gem_object_lookup()
successfully returns an object. This fixes a GEM object reference
leak that can prevent driver modules from unloading when using
prime buffers.

Fixes: 53096728b8 ("drm: Add DRM prime interface to reassign GEM handle")
Cc: <stable@vger.kernel.org> # v6.18+
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20251212134133.475218-1-karol.wachowski@linux.intel.com
2025-12-12 14:52:37 +01:00
José Expósito
526aafabd7 drm/tests: Handle EDEADLK in set_up_atomic_state()
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the drm_validate_modeset test [1]:

    # drm_test_check_connector_changed_modeset: EXPECTATION FAILED at
    # drivers/gpu/drm/tests/drm_atomic_state_test.c:162
    Expected ret == 0, but
        ret == -35 (0xffffffffffffffdd)

Change the set_up_atomic_state() helper function to return on error and
restart the atomic sequence when the returned error is EDEADLK.

[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744096/test_x86_64/11762450343/artifacts/jobwatch/logs/recipes/19797909/tasks/204139142/results/945095586/logs/dmesg.log

Fixes: 73d934d7b6 ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()")
Closes: https://datawarehouse.cki-project.org/issue/4004
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patch.msgid.link/20251104102535.12212-2-jose.exposito89@gmail.com
2025-12-12 10:12:46 +01:00
José Expósito
141d95e428 drm/tests: Handle EDEADLK in drm_test_check_valid_clones()
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the drm_test_check_valid_clones() KUnit test.

The error log can be either [1]:

    # drm_test_check_valid_clones: ASSERTION FAILED at
    # drivers/gpu/drm/tests/drm_atomic_state_test.c:295
    Expected ret == param->expected_result, but
        ret == -35 (0xffffffffffffffdd)
        param->expected_result == 0 (0x0)

Or [2] depending on the test case:

    # drm_test_check_valid_clones: ASSERTION FAILED at
    # drivers/gpu/drm/tests/drm_atomic_state_test.c:295
    Expected ret == param->expected_result, but
        ret == -35 (0xffffffffffffffdd)
        param->expected_result == -22 (0xffffffffffffffea)

Restart the atomic sequence when EDEADLK is returned.

[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2113057246/test_x86_64/11802139999/artifacts/jobwatch/logs/recipes/19824965/tasks/204347800/results/946112713/logs/dmesg.log
[2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744297/test_aarch64/11762450907/artifacts/jobwatch/logs/recipes/19797942/tasks/204139727/results/945094561/logs/dmesg.log

Fixes: 88849f24e2 ("drm/tests: Add test for drm_atomic_helper_check_modeset()")
Closes: https://datawarehouse.cki-project.org/issue/4004
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patch.msgid.link/20251104102535.12212-1-jose.exposito89@gmail.com
2025-12-12 10:12:45 +01:00
José Expósito
fe27e709d9 drm/tests: hdmi: Handle drm_kunit_helper_enable_crtc_connector() returning EDEADLK
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the KUnit tests present in drm_hdmi_state_helper_test.c [1].

While the specific test causing the failure change between runs, all of
them are caused by drm_kunit_helper_enable_crtc_connector() returning
-EDEADLK. The error trace always follow this structure:

    # <test name>: ASSERTION FAILED at
    # drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line>
    Expected ret == 0, but
        ret == -35 (0xffffffffffffffdd)

As documented, if the drm_kunit_helper_enable_crtc_connector() function
returns -EDEADLK (-35), the entire atomic sequence must be restarted.

Handle this error code for all function calls.

Closes: https://datawarehouse.cki-project.org/issue/4039  [1]
Fixes: 6a5c0ad7e0 ("drm/tests: hdmi_state_helpers: Switch to new helper")
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com
2025-12-12 10:07:43 +01:00
Dave Airlie
37a1cefd4d Merge tag 'drm-intel-next-fixes-2025-12-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 fixes for v6.19-rc1:
- Fix format string truncation warning
- FIx runtime PM reference during fbdev BO creation

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/281309f78560bcceebac8d5c0511efe66baf641c@intel.com
2025-12-12 18:57:44 +10:00
Dave Airlie
6ae7ec86de Merge tag 'amd-drm-fixes-6.19-2025-12-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-6.19-2025-12-11:

amdgpu:
- SI fix
- DC reduce stack usage
- HDMI fixes
- VCN 4.0.5 fix
- DP MST fix
- DC memory allocation fix

amdkfd:
- SVM fix
- Trap handler fix
- VGPR fixes for GC 11.5

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20251211195600.1641924-1-alexander.deucher@amd.com
2025-12-12 10:13:20 +10:00
Dave Airlie
685f27c1c5 Merge tag 'drm-misc-next-fixes-2025-12-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next-fixes for v6.19-rc1:
- Fix uaf in panthor.
- Revert 8 byte alignment constraint for pitch in dumb bo's.
- Fix DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC handling renasas.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/a82c2a2a-314f-403b-85bf-9b3ee09b903c@linux.intel.com
2025-12-12 09:20:24 +10:00
Mario Limonciello
72e24456a5 Revert "drm/amd/display: Fix pbn to kbps Conversion"
Deeply daisy chained DP/MST displays are no longer able to light
up. This reverts commit e0dec00f3d ("drm/amd/display: Fix pbn
to kbps Conversion")

Cc: Jerry Zuo <jerry.zuo@amd.com>
Reported-by: nat@nullable.se
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4756
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e1c94109c7)
Cc: stable@vger.kernel.org # 6.17+
2025-12-10 18:06:16 -05:00
Mario Limonciello (AMD)
93a01629c8 drm/amd: Fix unbind/rebind for VCN 4.0.5
Unbinding amdgpu has no problems, but binding it again leads to an
error of sysfs file already existing.  This is because it wasn't
actually cleaned up on unbind.  Add the missing cleanup step.

Fixes: 547aad32ed ("drm/amdgpu: add VCN4 ip block support")
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d717e62e9b)
Cc: stable@vger.kernel.org
2025-12-10 18:05:49 -05:00
René Rebe
6cb31fba13 drm/mgag200: Fix big-endian support
Unlike the original, deleted Matrox mga driver, the new mgag200 driver
has the XRGB frame-buffer byte swapped on big-endian "RISC"
systems. Fix by enabling byte swapping "PowerPC" OPMODE for any
__BIG_ENDIAN config.

Fixes: 414c453106 ("mgag200: initial g200se driver (v2)")
Signed-off-by: René Rebe <rene@exactco.de>
Cc: stable@kernel.org
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251208.141827.965103015954471168.rene@exactco.de
2025-12-10 09:33:53 +01:00
Ard Biesheuvel
1c7f9e528f drm/i915: Fix format string truncation warning
GCC notices that the 16-byte uabi_name field could theoretically be too
small for the formatted string if the instance number exceeds 100.

So grow the field to 20 bytes.

drivers/gpu/drm/i915/intel_memory_region.c: In function ‘intel_memory_region_create’:
drivers/gpu/drm/i915/intel_memory_region.c:273:61: error: ‘%u’ directive output may be truncated writing between 1 and 5 bytes into a region of size between 3 and 11 [-Werror=format-truncation=]
  273 |         snprintf(mem->uabi_name, sizeof(mem->uabi_name), "%s%u",
      |                                                             ^~
drivers/gpu/drm/i915/intel_memory_region.c:273:58: note: directive argument in the range [0, 65535]
  273 |         snprintf(mem->uabi_name, sizeof(mem->uabi_name), "%s%u",
      |                                                          ^~~~~~
drivers/gpu/drm/i915/intel_memory_region.c:273:9: note: ‘snprintf’ output between 7 and 19 bytes into a destination of size 16
  273 |         snprintf(mem->uabi_name, sizeof(mem->uabi_name), "%s%u",
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  274 |                  intel_memory_type_str(type), instance);
      |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: 3b38d35157 ("drm/i915: Add stable memory region names")
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://lore.kernel.org/r/20251205113500.684286-2-ardb@kernel.org
(cherry picked from commit 18476087f1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-09 15:53:12 +02:00
Dibin Moolakadan Subrahmanian
460b317203 drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation
During fbdev probe, the xe driver allocates and pins a framebuffer
BO (via xe_bo_create_pin_map_novm() → xe_ggtt_insert_bo()).

Without a runtime PM reference, xe_pm_runtime_get_noresume() warns about
missing outer PM protection as below:

	xe 0000:03:00.0: [drm] Missing outer runtime PM protection

Acquire a runtime PM reference before framebuffer allocation to ensure
xe_ggtt_insert_bo()  executes  under active runtime PM context.

Changes in v2:
 - Update commit message to add Fixes tag (Jani Nikula)

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6350
Fixes: 44e694958b ("drm/xe/display: Implement display support")
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Dibin Moolakadan Subrahmanian <dibin.moolakadan.subrahmanian@intel.com>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Link: https://patch.msgid.link/20251111135403.3415947-1-dibin.moolakadan.subrahmanian@intel.com
(cherry picked from commit 37fc7b7b3a)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-09 15:25:34 +02:00
Ivan Lipski
2e1da46091 drm/amd/display: Improve HDMI info retrieval
[WHY & HOW]
Make a dedicated function to read HDMI-related monitor info, including
monitor's SCDC support.

Fixes: 3471b9a31c ("drm/amd/display: Rework HDMI data channel reads")
Suggested-by: Fangzhi Zuo <jerry.zuo@amd.com>
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c78e31bcf5)
2025-12-08 17:58:49 -05:00
Kory Maincent (TI.com)
a585c7ef9c drm/tilcdc: Fix removal actions in case of failed probe
The drm_kms_helper_poll_fini() and drm_atomic_helper_shutdown() helpers
should only be called when the device has been successfully registered.
Currently, these functions are called unconditionally in tilcdc_fini(),
which causes warnings during probe deferral scenarios.

[    7.972317] WARNING: CPU: 0 PID: 23 at drivers/gpu/drm/drm_atomic_state_helper.c:175 drm_atomic_helper_crtc_duplicate_state+0x60/0x68
...
[    8.005820]  drm_atomic_helper_crtc_duplicate_state from drm_atomic_get_crtc_state+0x68/0x108
[    8.005858]  drm_atomic_get_crtc_state from drm_atomic_helper_disable_all+0x90/0x1c8
[    8.005885]  drm_atomic_helper_disable_all from drm_atomic_helper_shutdown+0x90/0x144
[    8.005911]  drm_atomic_helper_shutdown from tilcdc_fini+0x68/0xf8 [tilcdc]
[    8.005957]  tilcdc_fini [tilcdc] from tilcdc_pdev_probe+0xb0/0x6d4 [tilcdc]

Fix this by rewriting the failed probe cleanup path using the standard
goto error handling pattern, which ensures that cleanup functions are
only called on successfully initialized resources. Additionally, remove
the now-unnecessary is_registered flag.

Cc: stable@vger.kernel.org
Fixes: 3c4babae3c ("drm: Call drm_atomic_helper_shutdown() at shutdown/remove time for misc drivers")
Signed-off-by: Kory Maincent (TI.com) <kory.maincent@bootlin.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/20251125090546.137193-1-kory.maincent@bootlin.com
2025-12-08 13:54:25 -08:00
Jonathan Kim
cf32644963 drm/amdkfd: bump minimum vgpr size for gfx1151
GFX1151 has 1.5x the number of available physical VGPRs per SIMD.
Bump total memory availability for acquire checks on queue creation.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b42f3bf953)
Cc: stable@vger.kernel.org
2025-12-08 15:31:10 -05:00