Commit Graph

117820 Commits

Author SHA1 Message Date
Shuicheng Lin
d52dea485c drm/xe: Prevent BIT() overflow when handling invalid prefetch region
If the user provides a large value (such as 0x80) for the parameter
prefetch_mem_region_instance in the vm_bind ioctl, it will cause
BIT(prefetch_region) to overflow as below:
"
 ------------[ cut here ]------------
 UBSAN: shift-out-of-bounds in drivers/gpu/drm/xe/xe_vm.c:3414:7
 shift exponent 128 is too large for 64-bit type 'long unsigned int'
 CPU: 8 UID: 0 PID: 53120 Comm: xe_exec_system_ Tainted: G        W           6.18.0-rc1-lgci-xe-kernel+ #200 PREEMPT(voluntary)
 Tainted: [W]=WARN
 Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
 Call Trace:
  <TASK>
  dump_stack_lvl+0xa0/0xc0
  dump_stack+0x10/0x20
  ubsan_epilogue+0x9/0x40
  __ubsan_handle_shift_out_of_bounds+0x10e/0x170
  ? mutex_unlock+0x12/0x20
  xe_vm_bind_ioctl.cold+0x20/0x3c [xe]
 ...
"
Fix it by validating prefetch_region before the BIT() usage.

v2: Add Closes and Cc stable kernels. (Matt)

Reported-by: Koen Koning <koen.koning@intel.com>
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6478
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20251112181005.2120521-2-shuicheng.lin@intel.com
(cherry picked from commit 8f565bdd14)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-18 10:04:41 -08:00
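
Note on d52dea485c: the commit message says the fix validates prefetch_region before BIT() is used. The following is a minimal userspace sketch of that ordering, not the actual xe_vm.c change. The helper name prefetch_region_is_valid(), the allowed_mask parameter and the local BIT_ULL() stand-in are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() on a 64-bit type. */
#define BIT_ULL(n) (1ULL << (n))

/*
 * Hypothetical helper: accept `region` only if it is a legal shift
 * exponent and its bit is set in `allowed_mask`. The range check has to
 * come first, because shifting a 64-bit value by 64 or more is undefined
 * behaviour in C (this is what UBSAN reported for exponent 128).
 */
bool prefetch_region_is_valid(uint32_t region, uint64_t allowed_mask)
{
	if (region >= 64)
		return false;	/* e.g. 0x80 is rejected before any shift */

	return (allowed_mask & BIT_ULL(region)) != 0;
}
```

With this ordering a user-supplied 0x80 never reaches the shift, so the undefined behaviour cannot occur regardless of the mask contents.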
Dave Airlie
362a7d4fd5 Merge tag 'drm-xe-fixes-2025-11-13' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
 - New HW workarounds affecting PTL and WCL platforms
   (Nitin Gote, Tangudu Tilak Tirumalesh)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/ay2qztgonodwson6tuzcv5napjmqbgwzv27so4ybfola34guux@xgufrrmbzyws
2025-11-14 17:51:17 +10:00
Dave Airlie
538e0110fe Merge tag 'drm-intel-fixes-2025-11-13' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Fix PSR's pipe to vblank conversion (Jani)
- Disable Panel Replay on MST links (Imre)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/aRXdQnitzyFcokhF@intel.com
2025-11-14 17:50:52 +10:00
Dave Airlie
15ebea1bdf Merge tag 'drm-misc-fixes-2025-11-13' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:

client:
- Fix description of module parameter

panthor:
- Flush writes before mapping buffers

vmwgfx:
- Improve command validation
- Improve ref counting
- Fix cursor-plane support

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251113132317.GA451885@linux.fritz.box
2025-11-14 17:24:57 +10:00
Randy Dunlap
0a4a18e888 drm/client: fix MODULE_PARM_DESC string for "active"
The MODULE_PARM_DESC string for the "active" parameter is missing a
space and has an extraneous trailing ']' character. Correct these.

Before patch:
$ modinfo -p ./drm_client_lib.ko
active:Choose which drm client to start, default isfbdev] (string)

After patch:
$ modinfo -p ./drm_client_lib.ko
active:Choose which drm client to start, default is fbdev (string)

Fixes: f7b42442c4 ("drm/log: Introduce a new boot logger to draw the kmsg on the screen")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251112010920.2355712-1-rdunlap@infradead.org
2025-11-13 14:15:24 +01:00
Imre Deak
f2687d3cc9 drm/i915/dp_mst: Disable Panel Replay
Disable Panel Replay on MST links until it's properly implemented. For
instance the required VSC SDP is not programmed on MST and FEC is not
enabled if Panel Replay is enabled.

Fixes: 3257e55d3e ("drm/i915/panelreplay: enable/disable panel replay")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15174
Cc: Jouni Högander <jouni.hogander@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patch.msgid.link/20251107124141.911895-1-imre.deak@intel.com
(cherry picked from commit e109f644b8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-12 09:44:54 -05:00
Harish Kasiviswanathan
eac32ff423 drm/amdkfd: Fix GPU mappings for APU after prefetch
Fix the following corner case:
Consider a 2M huge page SVM allocation, followed by a prefetch call for
the first 4K page. The whole range is initially mapped with a single PTE.
After the prefetch, this range gets split into the first page + the rest
of the pages. Currently, the first page mapping is not updated on MI300A
(APU) since the page hasn't migrated. However, after the range split the
PTE mapping is not valid.

Fix this by forcing a page table update for the whole range when prefetch
is called. Calling prefetch on an APU doesn't improve performance. If
anything, it deteriorates. However, the functionality has to be supported.

v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout
VRAM

v3: Simplify by setting the flag for all ASICs as it doesn't affect dGPU

v4: Remove v2 and v3 changes. Force update_mapping when range is split
at a size that is not aligned to prange granularity

Suggested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 076470b9f6)
2025-11-11 22:52:51 -05:00
Jonathan Kim
d15deafab5 drm/amdkfd: relax checks for over allocation of save area
Over allocation of save area is not fatal, only under allocation is.
ROCm has various components that independently claim authority over save
area size.

Unless KFD decides to claim single authority, relax size checks.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 15bd4958fe)
Cc: stable@vger.kernel.org
2025-11-11 22:52:27 -05:00
Sathishkumar S
bbe3c11503 drm/amdgpu/jpeg: Add parse_cs for JPEG5_0_1
Enable the parse_cs callback for JPEG5_0_1.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5479855799)
Cc: stable@vger.kernel.org
2025-11-11 22:51:49 -05:00
Sultan Alsawaf
7132f7e025 drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO
When the BO pointer provided to amdgpu_bo_create_kernel() points to
non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address
rather than allocate a new BO.

This functionality is never desired for allocating ISP buffers. A new BO
should always be created when isp_kernel_buffer_alloc() is called, per the
description for isp_kernel_buffer_alloc().

Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call.

Fixes: 55d42f6169 ("drm/amd/amdgpu: Add helper functions for isp buffers")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 73c8c29baa)
2025-11-11 22:51:27 -05:00
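
Note on 7132f7e025: the message describes a create function that treats a non-NULL output pointer as "reuse this object" and a wrapper that clears the pointer first so a new object is always allocated. The sketch below models only that pattern in userspace. struct buffer, buffer_create() and buffer_alloc_new() are hypothetical names, and the real amdgpu_bo_create_kernel() does considerably more than this.

```c
#include <stdlib.h>

struct buffer {
	size_t size;
};

/*
 * Hypothetical allocator with "reuse if already set" semantics: a
 * non-NULL *out is taken to be an existing buffer and is left in place.
 */
int buffer_create(size_t size, struct buffer **out)
{
	if (*out)
		return 0;	/* existing buffer reused, nothing allocated */

	*out = calloc(1, sizeof(**out));
	if (!*out)
		return -1;

	(*out)->size = size;
	return 0;
}

/* Wrapper that must always allocate: clear the pointer before the call. */
int buffer_alloc_new(size_t size, struct buffer **out)
{
	*out = NULL;
	return buffer_create(size, out);
}
```

A caller that passes a stale or uninitialised pointer to buffer_alloc_new() still gets a fresh allocation, which is the guarantee the commit message asks for.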
Ivan Lipski
33c9957091 drm/amd/display: Allow VRR params change if unsynced with the stream
[Why]
When changing resolution (e.g., 4K → FHD) in mirror/clone mode with
certain monitors, the monitor blanks and loses connection due to an early
exit in vrr_settings_require_update(). The function only checks if VRR
state, fixed refresh target, or min/max refresh rate range has changed.

During mode changes, if the calculated min/max refresh values remain the
same even though the stream's v_total changed, the function returns early
without updating vrr_params.adjust.v_total_min/max, leaving the monitor's
VRR timing parameters unsynced with the new mode, causing it to blank out.

[How]
Explicitly adjust VRR parameters to the stream's nominal v_total when VRR
is supported, but inactive.

Fixes: 6d31602a9f ("drm/amd/display: more liberal vmin/vmax update for freesync")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 607df8248a)
2025-11-11 22:50:53 -05:00
Jesse.Zhang
6623c5f9fd drm/amdgpu: fix lock warning in amdgpu_userq_fence_driver_process
Fix a potential deadlock caused by inconsistent spinlock usage
between interrupt and process contexts in the userq fence driver.

The issue occurs when amdgpu_userq_fence_driver_process() is called
from both:
- Interrupt context: gfx_v11_0_eop_irq() -> amdgpu_userq_fence_driver_process()
- Process context: amdgpu_eviction_fence_suspend_worker() ->
  amdgpu_userq_fence_driver_force_completion() -> amdgpu_userq_fence_driver_process()

In interrupt context, the spinlock was acquired without disabling
interrupts, leaving it in {IN-HARDIRQ-W} state. When the same lock
is acquired in process context, the kernel detects inconsistent
locking since the process context acquisition would enable interrupts
while holding a lock previously acquired in interrupt context.

Kernel log shows:
[ 4039.310790] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 4039.310804] kworker/7:2/409 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 4039.310818] ffff9284e1bed000 (&fence_drv->fence_list_lock){?...}-{3:3},
[ 4039.310993] {IN-HARDIRQ-W} state was registered at:
[ 4039.311004]   lock_acquire+0xc6/0x300
[ 4039.311018]   _raw_spin_lock+0x39/0x80
[ 4039.311031]   amdgpu_userq_fence_driver_process.part.0+0x30/0x180 [amdgpu]
[ 4039.311146]   amdgpu_userq_fence_driver_process+0x17/0x30 [amdgpu]
[ 4039.311257]   gfx_v11_0_eop_irq+0x132/0x170 [amdgpu]

Fix by using spin_lock_irqsave()/spin_unlock_irqrestore() to properly
manage interrupt state regardless of calling context.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ded3ad780c)
Cc: stable@vger.kernel.org
2025-11-11 22:50:22 -05:00
Pierre-Eric Pelloux-Prayer
9f8fd538e2 drm/amdgpu: jump to the correct label on failure
drm_sched_entity_init wasn't called yet, so the only thing to
do is to release allocated memory.
This doesn't fix any bug since entity is zero allocated and
drm_sched_entity_fini does nothing in this case.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ec49374ccb)
2025-11-11 22:49:46 -05:00
Vitaly Prosyak
22a36e660d drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces
Certain multi-GPU configurations (especially GFX12) may hit
data corruption when a DCC-compressed VRAM surface is shared across GPUs
using peer-to-peer (P2P) DMA transfers.

Such surfaces rely on device-local metadata and cannot be safely accessed
through a remote GPU’s page tables. Attempting to import a DCC-enabled
surface through P2P leads to incorrect rendering or GPU faults.

This change disables P2P for DCC-enabled VRAM buffers that are contiguous
and allocated on GFX12+ hardware.  In these cases, the importer falls back
to the standard system-memory path, avoiding invalid access to compressed
surfaces.

Future work could consider optional migration (VRAM→System→VRAM) if a
performance regression is observed when `attach->peer2peer = false`.

Tested on:
 - Dual RX 9700 XT (Navi4x) setup
 - GNOME and Wayland compositor scenarios
 - Confirmed no corruption after disabling P2P under these conditions
v2: Remove check TTM_PL_VRAM & TTM_PL_FLAG_CONTIGUOUS.
v3: simplify for upstream and fix ip version check (Alex)

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9dff2bb709)
Cc: stable@vger.kernel.org
2025-11-11 22:49:19 -05:00
Nitin Gote
240372edaf drm/xe/xe3lpg: Extend Wa_15016589081 for xe3lpg
Wa_15016589081 applies to Xe3_LPG renderCS

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Link: https://patch.msgid.link/20251106100516.318863-2-nitin.r.gote@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 715974499a)
Cc: stable@vger.kernel.org # v6.16+
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10 09:41:44 -08:00
Tangudu Tilak Tirumalesh
fa3376319b drm/xe/xe3: Extend wa_14023061436
Extend wa_14023061436 to Graphics Versions 30.03, 30.04
and 30.05.

Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20251030154626.3124565-1-tilak.tirumalesh.tangudu@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 0dd656d06f)
Cc: stable@vger.kernel.org # v6.17+
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10 09:41:29 -08:00
Nitin Gote
0b2f7be548 drm/xe/xe3: Add WA_14024681466 for Xe3_LPG
Apply WA_14024681466 to Xe3_LPG graphics IP versions from 30.00 to 30.05.

v2: (Matthew Roper)
   - Remove stepping filter as workaround applies to all steppings.
   - Add an engine class filter so it only applies to the RENDER engine.

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Link: https://patch.msgid.link/20251027092643.335904-1-nitin.r.gote@intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 071089a69e)
Cc: stable@vger.kernel.org # v6.16+
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-10 09:41:09 -08:00
Jani Nikula
994dec1099 drm/i915/psr: fix pipe to vblank conversion
First, we can't assume pipe == crtc index. If a pipe is fused off in
between, it no longer holds. intel_crtc_for_pipe() is the only proper
way to get from a pipe to the corresponding crtc.

Second, drivers aren't supposed to access or index drm->vblank[]
directly. There's drm_crtc_vblank_crtc() for this.

Use both functions to fix the pipe to vblank conversion.

Fixes: f02658c46c ("drm/i915/psr: Add mechanism to notify PSR of pipe enable/disable")
Cc: Jouni Högander <jouni.hogander@intel.com>
Cc: stable@vger.kernel.org # v6.16+
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Link: https://patch.msgid.link/20251106200000.1455164-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 2750f6765d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-11-10 10:12:31 -05:00
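
Note on 994dec1099: the first point of the message (pipe != crtc index once a pipe is fused off) can be illustrated with a small lookup table. This is a userspace sketch under assumed names. The crtcs[] table, struct crtc and crtc_for_pipe() are stand-ins and do not reproduce intel_crtc_for_pipe().

```c
#include <stddef.h>

enum pipe { PIPE_A, PIPE_B, PIPE_C, PIPE_D };

struct crtc {
	enum pipe pipe;
};

/* Hypothetical device with pipe B fused off: CRTCs exist for A, C, D only. */
const struct crtc crtcs[] = { { PIPE_A }, { PIPE_C }, { PIPE_D } };

/* Look the CRTC up by its pipe instead of using the pipe as an index. */
const struct crtc *crtc_for_pipe(enum pipe pipe)
{
	for (size_t i = 0; i < sizeof(crtcs) / sizeof(crtcs[0]); i++) {
		if (crtcs[i].pipe == pipe)
			return &crtcs[i];
	}

	return NULL;	/* pipe is fused off or does not exist */
}
```

In this table crtcs[PIPE_C] is the CRTC for pipe D, so indexing by pipe silently picks the wrong entry, while the lookup returns the entry whose pipe field matches.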
Boris Brezillon
576c930e5e drm/panthor: Flush shmem writes before mapping buffers CPU-uncached
The shmem layer zeroes out the new pages using cached mappings, and if
we don't CPU-flush we might leave dirty cachelines behind, leading to
potential data leaks and/or asynchronous buffer corruption when dirty
cachelines are evicted.

Fixes: 8a1cc07578 ("drm/panthor: Add GEM logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patch.msgid.link/20251107171214.1186299-1-boris.brezillon@collabora.com
2025-11-10 14:56:06 +00:00
Linus Torvalds
3461e958c1 Merge tag 'kbuild-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:

 - Strip trailing padding bytes from modules.builtin.modinfo to fix
   error during modules_install with certain versions of kmod

 - Drop unused static inline function warning in .c files with clang
   from W=1 to W=2

 - Ensure kernel-doc.py invocations use the PYTHON3 make variable to
   ensure user's choice of Python interpreter is always respected

* tag 'kbuild-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
  kbuild: Let kernel-doc.py use PYTHON3 override
  compiler_types: Move unused static inline functions warning to W=2
  kbuild: Strip trailing padding bytes from modules.builtin.modinfo
2025-11-09 09:22:08 -08:00
Jean Delvare
002621a4df kbuild: Let kernel-doc.py use PYTHON3 override
It is possible to force a specific version of python to be used when
building the kernel by passing PYTHON3= on the make command line.
However kernel-doc.py is currently called with python3 hard-coded and
thus ignores this setting.

Use $(PYTHON3) to run $(KERNELDOC) so that the desired version of
python is used.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://patch.msgid.link/20251107192933.2bfe9e57@endymion
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-11-08 19:42:22 -07:00
Dave Airlie
4113361590 Revert "drm/nouveau: set DMA mask before creating the flush page"
This reverts commit ebe7556050.

Tested the latest kernel on my GB203 and this seems to break it somehow.

Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: gsp: GSP-FMC boot failed (mbox: 0x0000000b)
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: gsp: init failed, -5
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: init failed with -5
Nov 09 04:16:14 bighp kernel: nouveau: drm:00000000:00000080: init failed with -5
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: drm: Device allocation failed: -5
Nov 09 04:16:14 bighp kernel: nouveau 0000:02:00.0: probe with driver nouveau failed with error -5

Not sure why, I went over the patch and thought it should have worked, but there must be some
32-bit problem maybe in the FMC boot path.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-11-08 19:41:09 +10:00
Dave Airlie
d439acbbfb Merge tag 'drm-xe-fixes-2025-11-07' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
 - Fix missing synchronization on unbind (Balasubramani Vivekanandan)
 - Fix device shutdown when doing FLR (Jouni Högander)
 - Fix user fence signaling order (Matthew Brost)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/mvfyflloncy76a7nmkatpj6f2afddavwsibz3y4u4wo6gznro5@rdulkuh5wvje
2025-11-08 07:39:54 +10:00
Matthew Brost
0995c2fc39 drm/xe: Enforce correct user fence signaling order using
Prevent application hangs caused by out-of-order fence signaling when
user fences are attached. Use drm_syncobj (via dma-fence-chain) to
guarantee that each user fence signals in order, regardless of the
signaling order of the attached fences. Ensure user fence writebacks to
user space occur in the correct sequence.

v7:
 - Skip drm_syncobj create on error (CI)

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251031234050.3043507-2-matthew.brost@intel.com
(cherry picked from commit adda4e855a)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07 03:55:19 -08:00
Jouni Högander
b11a020d91 drm/xe: Do clean shutdown also when using flr
Currently the Xe driver triggers flr on shutdown without any clean-up.
This causes random warnings from pending work items, as the underlying
hardware is reset in the middle of their execution.

Fix this by performing clean shutdown also when using flr.

Fixes: 501d799a47 ("drm/xe: Wire up device shutdown handler")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Maarten Lankhorst <dev@lankhorst.se>
Link: https://patch.msgid.link/20251031122312.1836534-1-jouni.hogander@intel.com
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
(cherry picked from commit a4ff26b7c8)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07 03:05:32 -08:00
Tejas Upadhyay
9cd27eec87 drm/xe: Move declarations under conditional branch
The xe_device_shutdown() function had a few declarations that were
only needed under a specific condition. Move those declarations
into that conditional branch to avoid unnecessary declarations.

Reviewed-by: Nitin Gote <nitin.r.gote@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20251007100208.1407021-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit 15b3036045)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07 03:05:20 -08:00
Balasubramani Vivekanandan
95af8f4fdc drm/xe/guc: Synchronize Dead CT worker with unbind
Cancel and wait for any Dead CT worker to complete before continuing
with device unbinding. Else the worker will end up using resources freed
by the unbind operation.

Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Fixes: d2c5a5a926 ("drm/xe/guc: Dead CT helper")
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20251103123144.3231829-6-balasubramani.vivekanandan@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4926713391)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-11-07 03:01:54 -08:00
Ian Forbes
eef295a850 drm/vmwgfx: Restore Guest-Backed only cursor plane support
The referenced fixes commit broke the cursor plane for configurations
which have Guest-Backed surfaces but no cursor MOB support.

Fixes: 965544150d ("drm/vmwgfx: Refactor cursor handling")
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patch.msgid.link/20251103201920.381503-1-ian.forbes@broadcom.com
2025-11-07 00:01:15 -05:00
Ian Forbes
c1962742ff drm/vmwgfx: Use kref in vmw_bo_dirty
Rather than using an ad hoc reference count use kref which is atomic
and has underflow warnings.

Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patch.msgid.link/20251030193640.153697-1-ian.forbes@broadcom.com
2025-11-07 00:00:53 -05:00
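
Note on c1962742ff: the message contrasts an ad hoc counter with a reference count that is atomic and detects underflow. The sketch below shows those two properties with C11 atomics in userspace. It is not the kernel's kref implementation, and struct refcount, ref_get() and ref_put() are hypothetical names.

```c
#include <stdatomic.h>

struct refcount {
	atomic_int count;
};

enum put_result { REF_HELD, REF_RELEASED, REF_UNDERFLOW };

void ref_init(struct refcount *r)
{
	atomic_init(&r->count, 1);	/* the creator holds the first reference */
}

void ref_get(struct refcount *r)
{
	atomic_fetch_add(&r->count, 1);
}

/*
 * Drop one reference. Reports REF_RELEASED when the last reference went
 * away and REF_UNDERFLOW when there was nothing left to drop.
 */
enum put_result ref_put(struct refcount *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	if (old <= 0) {
		atomic_fetch_add(&r->count, 1);	/* undo the bogus decrement */
		return REF_UNDERFLOW;
	}

	return old == 1 ? REF_RELEASED : REF_HELD;
}
```

A plain int counter offers neither guarantee: concurrent updates can be lost, and an extra put goes unnoticed until the object is used after free.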
Ian Forbes
32b415a9dc drm/vmwgfx: Validate command header size against SVGA_CMD_MAX_DATASIZE
This data originates from userspace and is used in buffer offset
calculations which could potentially overflow causing an out-of-bounds
access.

Fixes: 8ce75f8ab9 ("drm/vmwgfx: Update device includes for DX device functionality")
Reported-by: Rohit Keshri <rkeshri@redhat.com>
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patch.msgid.link/20251021190128.13014-1-ian.forbes@broadcom.com
2025-11-06 23:59:40 -05:00
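
Note on 32b415a9dc: the message says a userspace-supplied header size feeds buffer offset calculations that could overflow. Below is a minimal userspace sketch of validating such a size against a fixed maximum and against the remaining buffer. CMD_MAX_DATASIZE, struct cmd_header and cmd_size_is_valid() are illustrative assumptions, and the limit shown is not the real SVGA_CMD_MAX_DATASIZE value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative limit only; the real SVGA_CMD_MAX_DATASIZE is not assumed. */
#define CMD_MAX_DATASIZE (256u * 1024u)

struct cmd_header {
	uint32_t id;
	uint32_t size;	/* payload bytes that follow the header */
};

/* Hypothetical check run before hdr->size is used in any offset math. */
bool cmd_size_is_valid(const struct cmd_header *hdr, size_t bytes_left)
{
	if (bytes_left < sizeof(*hdr))
		return false;

	if (hdr->size > CMD_MAX_DATASIZE)
		return false;

	/* Subtract from the trusted side so the comparison cannot wrap. */
	return hdr->size <= bytes_left - sizeof(*hdr);
}
```

The comparison is written as a subtraction from bytes_left rather than an addition to hdr->size, so a value close to UINT32_MAX cannot wrap around and pass the check.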
Dave Airlie
a18033f130 Merge tag 'mediatek-drm-fixes-20251105' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes - 20251105

1. Disable AFBC support on Mediatek DRM driver
2. Add pm_runtime support for GCE power control

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20251105151443.3909-1-chunkuang.hu@kernel.org
2025-11-07 12:41:42 +10:00
Dave Airlie
b57b47741e Merge tag 'amd-drm-fixes-6.18-2025-11-06' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.18-2025-11-06:

amdgpu:
- Reset fixes
- Misc fixes
- Panel scaling fixes
- HDMI fix
- S0ix fixes
- Hibernation fix
- Secure display fix
- Suspend fix
- MST fix

amdkfd:
- Process cleanup fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20251106201326.807230-1-alexander.deucher@amd.com
2025-11-07 09:20:48 +10:00
Dave Airlie
6ec8a47c55 Merge tag 'drm-intel-fixes-2025-11-06' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Avoid lock inversion when pinning to GGTT on CHV/BXT+VTD (Janusz)
- Fix conversion between clock ticks and nanoseconds (Umesh)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/aQyxT1D8IW-xcDbM@intel.com
2025-11-07 09:15:31 +10:00
Wayne Lin
3c6a743c69 drm/amd/display: Enable mst when it's detected but yet to be initialized
[Why]
drm_dp_mst_topology_queue_probe() is used under the assumption that
mst is already initialized. If we connect system with SST first
then switch to the mst branch during suspend, we will fail probing
topology by calling the wrong API since the mst manager is yet to
be initialized.

[How]
At dm_resume(), once an mst branch is detected as connected, check if
the mst is initialized already. If not, call
dm_helpers_dp_mst_start_top_mgr() instead to initialize mst.

V2: Adjust the commit msg a bit

Fixes: bc068194f5 ("drm/amd/display: Don't write DP_MSTM_CTRL after LT")
Cc: Fangzhi Zuo <jerry.zuo@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 62320fb8d9)
Cc: stable@vger.kernel.org
2025-11-06 11:58:55 -05:00
Lijo Lazar
570a66b48c drm/amdgpu: Fix wait after reset sequence in S3
For a mode-1 reset done at the end of S3 on PSPv11 dGPUs, only check if
TOS is unloaded.

Fixes: 32f73741d6 ("drm/amdgpu: Wait for bootloader after PSPv11 reset")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4649
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1ad25fd272)
2025-11-06 11:58:32 -05:00
Mario Limonciello
b09cb2996c drm/amd: Fix suspend failure with secure display TA
commit c760bcda83 ("drm/amd: Check whether secure display TA loaded
successfully") attempted to fix extra messages, but failed to port the
cleanup that was in commit 5c6d52ff4b ("drm/amd: Don't try to enable
secure display TA multiple times") to prevent multiple tries.

Add that to the failure handling path even on a quick failure.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4679
Fixes: c760bcda83 ("drm/amd: Check whether secure display TA loaded successfully")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4104c0a454)
2025-11-06 11:58:10 -05:00
Samuel Zhang
eb6e7f520d drm/amdgpu: fix gpu page fault after hibernation on PF passthrough
In a PF passthrough environment, after hibernate and then resume, coralgemm
will cause a gpu page fault.

Mode1 reset happens during hibernate, but the partition mode is not restored
on resume, so registers mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL are not right
after resume. When CP accesses the MQD BO, the wrong stride size is used.
This causes an out-of-bounds access on the MQD BO, resulting in a page fault.

The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called
when resuming from hibernation.
KFD resume is called separately during a reset recovery or resume from
suspend sequence. Hence it's not required to be called as part of
partition switch.

Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5d1b32cfe4)
2025-11-06 11:57:08 -05:00
LiangCheng Wang
b750f5a9d6 drm/tiny: pixpaper: add explicit dependency on MMU
The DRM_GEM_SHMEM_HELPER helper requires MMU enabled because it uses
vmf_insert_pfn() in its mmap implementation. On NOMMU configurations
(e.g. some RISC-V randconfig builds), this symbol is unavailable and
selecting DRM_GEM_SHMEM_HELPER causes a modpost undefined reference:

    ERROR: modpost: "vmf_insert_pfn" [drivers/gpu/drm/drm_shmem_helper.ko] undefined!

Normally, Kconfig prevents this helper from being selected when
CONFIG_MMU=n. However, in some randconfig builds (such as those used by
0day CI), select statements can override unmet dependencies, triggering
the issue.

Add an explicit dependency on MMU to DRM_PIXPAPER to prevent this.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202510280213.0rlYA4T3-lkp@intel.com/
Fixes: 0c4932f6dd ("drm/tiny: pixpaper: Fix missing dependency on DRM_GEM_SHMEM_HELPER")
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: LiangCheng Wang <zaq14760@gmail.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251028-bar-v1-1-edfbd13fafff@gmail.com
2025-11-06 13:47:29 +01:00
James Jones
664ce10246 drm/nouveau: Advertise correct modifiers on GB20x
8 and 16 bit formats use a different layout on
GB20x than they did on prior chips. Add the
corresponding DRM format modifiers to the list of
modifiers supported by the display engine on such
chips, and filter the supported modifiers for each
format based on its bytes per pixel in
nv50_plane_format_mod_supported().

Note this logic will need to be updated when GB10
support is added, since it is a GB20x chip that
uses the pre-GB20x sector layout for all formats.

Fixes: 6cc6e08d45 ("drm/nouveau/kms: add support for GB20x")
Signed-off-by: James Jones <jajones@nvidia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251030181153.1208-3-jajones@nvidia.com
2025-11-06 11:02:08 +10:00
Timur Tabi
ebe7556050 drm/nouveau: set DMA mask before creating the flush page
Set the DMA mask before calling nvkm_device_ctor(), so that when the
flush page is created in nvkm_fb_ctor(), the allocation will not fail
if the page is outside of DMA address space, which can easily happen if
IOMMU is disabled.  In such situations, you will get an error like this:

nouveau 0000:65:00.0: DMA addr 0x0000000107c56000+4096 overflow (mask ffffffff, bus limit 0).

Commit 38f5359354 ("drm/nouveau/pci: set streaming DMA mask early")
set the mask after calling nvkm_device_ctor(), but back then there was
no flush page being created, which might explain why the mask wasn't
set earlier.

Flush page allocation was added in commit 5728d06419 ("drm/nouveau/fb:
handle sysmem flush page from common code").  nvkm_fb_ctor() calls
alloc_page(), which can allocate a page anywhere in system memory, but
then calls dma_map_page() on that page.  But since the DMA mask is still
set to 32, the map can fail if the page is allocated above 4GB.  This is
easy to reproduce on systems with a lot of memory and IOMMU disabled.

An alternative approach would be to force the allocation of the flush
page to low memory, by specifying __GFP_DMA32.  However, this would
always allocate the page in low memory, even though the hardware can
access high memory.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20251014174512.3172102-1-ttabi@nvidia.com
2025-11-06 10:26:51 +10:00
Pierre-Eric Pelloux-Prayer
487df8b698 drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb
The Mesa issue referenced below pointed out a possible deadlock:

[ 1231.611031]  Possible interrupt unsafe locking scenario:

[ 1231.611033]        CPU0                    CPU1
[ 1231.611034]        ----                    ----
[ 1231.611035]   lock(&xa->xa_lock#17);
[ 1231.611038]                                local_irq_disable();
[ 1231.611039]                                lock(&fence->lock);
[ 1231.611041]                                lock(&xa->xa_lock#17);
[ 1231.611044]   <Interrupt>
[ 1231.611045]     lock(&fence->lock);
[ 1231.611047]
                *** DEADLOCK ***

In this example, CPU0 would be any function accessing job->dependencies
through the xa_* functions that don't disable interrupts (eg:
drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()).

CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling
callback, so it runs in interrupt context. It will deadlock when trying to
grab the xa_lock, which is already held by CPU0.

Replacing all xa_* usage by their xa_*_irq counterparts would fix
this issue, but Christian pointed out another issue: dma_fence_signal
takes fence.lock and so does dma_fence_add_callback.

  dma_fence_signal() // locks f1.lock
  -> drm_sched_entity_kill_jobs_cb()
  -> foreach dependencies
     -> dma_fence_add_callback() // locks f2.lock

This will deadlock if f1 and f2 share the same spinlock.

To fix both issues, the code iterating on dependencies and re-arming them
is moved out to drm_sched_entity_kill_jobs_work().
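
The second issue and the deferral can be modelled in a few lines. This is a
single-threaded userspace sketch under loud assumptions: struct lock, struct
fence, struct job and all functions below are simplified, hypothetical
stand-ins rather than the dma_fence or drm_sched API, and a failed assert
stands in for a spinlock that would spin forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock: re-acquiring it while held models the deadlock. */
struct lock { bool held; };

static void lock_acquire(struct lock *l)
{
	assert(!l->held);	/* a real spinlock would spin forever here */
	l->held = true;
}

static void lock_release(struct lock *l)
{
	l->held = false;
}

struct fence {
	struct lock *lock;	/* may be shared between several fences */
	void (*cb)(struct fence *f, void *data);
	void *cb_data;
};

/* Takes the fence lock, like the add-callback path described above. */
static void fence_add_callback(struct fence *f,
			       void (*cb)(struct fence *, void *), void *data)
{
	lock_acquire(f->lock);
	f->cb = cb;
	f->cb_data = data;
	lock_release(f->lock);
}

/* Runs the callback with the fence lock held, like the signal path. */
static void fence_signal(struct fence *f)
{
	void (*cb)(struct fence *, void *);

	lock_acquire(f->lock);
	cb = f->cb;
	f->cb = NULL;
	if (cb)
		cb(f, f->cb_data);
	lock_release(f->lock);
}

struct job {
	struct fence *next_dep;	/* next dependency to wait on, or NULL */
	bool work_pending;
};

/* Callback: takes no locks, only marks the work item as queued. */
static void kill_jobs_cb(struct fence *f, void *data)
{
	struct job *job = data;

	(void)f;
	job->work_pending = true;
}

/* Work item: runs later, outside any fence lock, and re-arms safely. */
static void kill_jobs_work(struct job *job)
{
	job->work_pending = false;
	if (job->next_dep) {
		struct fence *dep = job->next_dep;

		job->next_dep = NULL;
		fence_add_callback(dep, kill_jobs_cb, job);
	}
}
```

If kill_jobs_cb() called fence_add_callback() directly on a fence sharing the
same lock, the assert in lock_acquire() would fire. Deferring that step to
kill_jobs_work() means the lock has already been released when it is taken
again.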

Cc: stable@vger.kernel.org # v6.2+
Fixes: 2fdb8a8f07 ("drm/scheduler: rework entity flush, kill and fini")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
[phasta: commit message nits]
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patch.msgid.link/20251104095358.15092-1-pierre-eric.pelloux-prayer@amd.com
2025-11-05 12:29:52 +01:00
Rong Zhang
6dd97ceb64 drm/amd/display: Fix NULL deref in debugfs odm_combine_segments
When a connector is connected but inactive (e.g., disabled by desktop
environments), pipe_ctx->stream_res.tg will be destroyed. Then, reading
odm_combine_segments causes kernel NULL pointer dereference.

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 16 UID: 0 PID: 26474 Comm: cat Not tainted 6.17.0+ #2 PREEMPT(lazy)  e6a17af9ee6db7c63e9d90dbe5b28ccab67520c6
 Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN25WW 03/27/2025
 RIP: 0010:odm_combine_segments_show+0x93/0xf0 [amdgpu]
 Code: 41 83 b8 b0 00 00 00 01 75 6e 48 98 ba a1 ff ff ff 48 c1 e0 0c 48 8d 8c 07 d8 02 00 00 48 85 c9 74 2d 48 8b bc 07 f0 08 00 00 <48> 8b 07 48 8b 80 08 02 00>
 RSP: 0018:ffffd1bf4b953c58 EFLAGS: 00010286
 RAX: 0000000000005000 RBX: ffff8e35976b02d0 RCX: ffff8e3aeed052d8
 RDX: 00000000ffffffa1 RSI: ffff8e35a3120800 RDI: 0000000000000000
 RBP: 0000000000000000 R08: ffff8e3580eb0000 R09: ffff8e35976b02d0
 R10: ffffd1bf4b953c78 R11: 0000000000000000 R12: ffffd1bf4b953d08
 R13: 0000000000040000 R14: 0000000000000001 R15: 0000000000000001
 FS:  00007f44d3f9f740(0000) GS:ffff8e3caa47f000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000006485c2000 CR4: 0000000000f50ef0
 PKRU: 55555554
 Call Trace:
  <TASK>
  seq_read_iter+0x125/0x490
  ? __alloc_frozen_pages_noprof+0x18f/0x350
  seq_read+0x12c/0x170
  full_proxy_read+0x51/0x80
  vfs_read+0xbc/0x390
  ? __handle_mm_fault+0xa46/0xef0
  ? do_syscall_64+0x71/0x900
  ksys_read+0x73/0xf0
  do_syscall_64+0x71/0x900
  ? count_memcg_events+0xc2/0x190
  ? handle_mm_fault+0x1d7/0x2d0
  ? do_user_addr_fault+0x21a/0x690
  ? exc_page_fault+0x7e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x6c/0x74
 RIP: 0033:0x7f44d4031687
 Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00>
 RSP: 002b:00007ffdb4b5f0b0 EFLAGS: 00000202 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 00007f44d3f9f740 RCX: 00007f44d4031687
 RDX: 0000000000040000 RSI: 00007f44d3f5e000 RDI: 0000000000000003
 RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000202 R12: 00007f44d3f5e000
 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000040000
  </TASK>
 Modules linked in: tls tcp_diag inet_diag xt_mark ccm snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device x>
  snd_hda_codec_atihdmi snd_hda_codec_realtek_lib lenovo_wmi_helpers think_lmi snd_hda_codec_generic snd_hda_codec_hdmi snd_soc_core kvm snd_compress uvcvideo sn>
  platform_profile joydev amd_pmc mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics loop nfnetlink ip_tables x_tables dm_cryp>
 CR2: 0000000000000000
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:odm_combine_segments_show+0x93/0xf0 [amdgpu]
 Code: 41 83 b8 b0 00 00 00 01 75 6e 48 98 ba a1 ff ff ff 48 c1 e0 0c 48 8d 8c 07 d8 02 00 00 48 85 c9 74 2d 48 8b bc 07 f0 08 00 00 <48> 8b 07 48 8b 80 08 02 00>
 RSP: 0018:ffffd1bf4b953c58 EFLAGS: 00010286
 RAX: 0000000000005000 RBX: ffff8e35976b02d0 RCX: ffff8e3aeed052d8
 RDX: 00000000ffffffa1 RSI: ffff8e35a3120800 RDI: 0000000000000000
 RBP: 0000000000000000 R08: ffff8e3580eb0000 R09: ffff8e35976b02d0
 R10: ffffd1bf4b953c78 R11: 0000000000000000 R12: ffffd1bf4b953d08
 R13: 0000000000040000 R14: 0000000000000001 R15: 0000000000000001
 FS:  00007f44d3f9f740(0000) GS:ffff8e3caa47f000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000006485c2000 CR4: 0000000000f50ef0
 PKRU: 55555554

Fix this by checking pipe_ctx->stream_res.tg before dereferencing.
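
The shape of such a guard, as a minimal sketch: the structures below are
heavily simplified, hypothetical stand-ins for the display core types, and
read_odm_segments() and its -1 return value are illustrative choices, not the
actual debugfs handler.

```c
#include <stddef.h>

/* Simplified stand-in: a timing generator with one optional hook. */
struct timing_generator {
	int (*get_odm_combine_segments)(const struct timing_generator *tg);
};

/* Simplified stand-in: only the field relevant to the guard is kept. */
struct pipe_ctx {
	struct {
		struct timing_generator *tg;
	} stream_res;
};

/* Returns -1 when the pipe has no timing generator or the hook is unset. */
static int read_odm_segments(const struct pipe_ctx *pipe_ctx)
{
	const struct timing_generator *tg = pipe_ctx->stream_res.tg;

	if (!tg || !tg->get_odm_combine_segments)
		return -1;

	return tg->get_odm_combine_segments(tg);
}

/* Example hook used only to exercise the non-NULL path. */
static int two_segments(const struct timing_generator *tg)
{
	(void)tg;
	return 2;
}
```

A pipe whose stream_res.tg is NULL (the connected-but-inactive case) now
yields -1 instead of dereferencing a NULL pointer.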

Fixes: 07926ba8a4 ("drm/amd/display: Add debugfs interface for ODM combine info")
Signed-off-by: Rong Zhang <i@rong.moe>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f19bbecd34)
Cc: stable@vger.kernel.org
2025-11-04 13:40:42 -05:00
Philip Yang
597eb70f7f drm/amdkfd: Don't clear PT after process killed
If the process is killed, the VM entity is stopped. Submitting a PT update
job will then trigger the error message "*ERROR* Trying to push to a killed
entity", and the job will not execute.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 10c382ec6c)
Cc: stable@vger.kernel.org
2025-11-04 13:40:42 -05:00
Alex Deucher
7c5609b72b drm/amdgpu/smu: Handle S0ix for vangogh
Fix the flows for S0ix.  There is no need to stop
the RLC or reinitialize PMFW in S0ix.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4659
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reported-by: Antheas Kapenekakis <lkml@antheas.dev>
Tested-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fd39b5a583)
Cc: <stable@vger.kernel.org> # c81f5cebe8: drm/amdgpu: Drop PMFW RLC notifier from amdgpu_device_suspend()
Cc: <stable@vger.kernel.org>
2025-11-04 13:39:27 -05:00
Alex Deucher
c81f5cebe8 drm/amdgpu: Drop PMFW RLC notifier from amdgpu_device_suspend()
For S3 on vangogh, PMFW needs to be notified before the
driver powers down RLC.  This already happens in smu_disable_dpms()
so drop the superfluous call in amdgpu_device_suspend().

Co-developed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 960e30a61e)
2025-11-04 13:28:20 -05:00
Alex Hung
fdc93beead drm/amd/display: Fix black screen with HDMI outputs
[Why & How]
This fixes the black screen issue on certain APUs with HDMI,
accompanied by the following messages:

amdgpu 0000:c4:00.0: amdgpu: [drm] Failed to setup vendor info
                     frame on connector DP-1: -22
amdgpu 0000:c4:00.0: [drm] Cannot find any crtc or sizes [drm]
                     Cannot find any crtc or sizes

Fixes: 489f0f600c ("drm/amd/display: Fix DVI-D/HDMI adapters")
Suggested-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 678c901443)
2025-11-04 13:24:40 -05:00
Mario Limonciello (AMD)
3362692fea drm/amd/display: Don't stretch non-native images by default in eDP
commit 978fa2f6d0 ("drm/amd/display: Use scaling for non-native
resolutions on eDP") started using the GPU scaler hardware to scale
when a non-native resolution was picked on eDP. This scaling was done
to fill the screen instead of maintaining the aspect ratio.

The idea was that if a different scaling behavior is preferred, the
compositor would request it.  Ignoring the aspect ratio isn't desirable
as a default, however, so adjust it to preserve the aspect ratio while
still filling as much of the screen as possible.

Note: This will lead to black bars in some cases for non-native
resolutions. Compositors can request the previous behavior if desired.
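
Where the black bars come from can be worked out with generic arithmetic. The
sketch below is not the amdgpu scaler code: struct rect and fit_keep_aspect()
are hypothetical names, and the function simply computes the largest centered
rectangle with the source's aspect ratio that fits the panel.

```c
#include <stdint.h>

struct rect {
	uint32_t x, y, w, h;
};

/* Largest centered destination rectangle that keeps src_w:src_h. */
static struct rect fit_keep_aspect(uint32_t src_w, uint32_t src_h,
				   uint32_t panel_w, uint32_t panel_h)
{
	struct rect dst = { 0, 0, panel_w, panel_h };

	if (!src_w || !src_h)
		return dst;

	/* Compare src_w/src_h with panel_w/panel_h by cross-multiplying. */
	if ((uint64_t)src_w * panel_h > (uint64_t)panel_w * src_h) {
		/* Source is wider: use full width, bars above and below. */
		dst.h = (uint32_t)((uint64_t)panel_w * src_h / src_w);
		dst.y = (panel_h - dst.h) / 2;
	} else {
		/* Source is taller or equal: full height, bars left and right. */
		dst.w = (uint32_t)((uint64_t)panel_h * src_w / src_h);
		dst.x = (panel_w - dst.w) / 2;
	}

	return dst;
}
```

For example, a 1024x768 mode on a 1920x1080 panel maps to a 1440x1080
rectangle at x = 240, leaving 240-pixel bars on each side.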

Fixes: 978fa2f6d0 ("drm/amd/display: Use scaling for non-native resolutions on eDP")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4538
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 825df7ff4b)
2025-11-04 13:23:24 -05:00
Yang Wang
37e3567dee drm/amd/pm: fix missing device_attr cleanup in amdgpu_pm_sysfs_init()
Use the correct label to complete all cleanup work.
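
The class of bug can be illustrated with a generic goto-unwind sketch. All
names here (create_group(), remove_group(), sysfs_init_sketch(), the live
counter) are hypothetical and unrelated to the real amdgpu_pm_sysfs_init()
body; the point is only that each failure must jump to the label that undoes
everything set up so far.

```c
#include <stdbool.h>

/* Number of currently registered (hypothetical) attribute groups. */
static int live;

static int create_group(bool fail)
{
	if (fail)
		return -1;
	live++;
	return 0;
}

static void remove_group(void)
{
	live--;
}

static int sysfs_init_sketch(bool fail_a, bool fail_b, bool fail_c)
{
	int ret;

	ret = create_group(fail_a);
	if (ret)
		return ret;

	ret = create_group(fail_b);
	if (ret)
		goto err_remove_a;

	ret = create_group(fail_c);
	if (ret)
		goto err_remove_b;	/* err_remove_a here would leak group B */

	return 0;

err_remove_b:
	remove_group();
err_remove_a:
	remove_group();
	return ret;
}
```

Labels are ordered so that execution falls through from the latest cleanup to
the earliest; picking a label too far down skips part of the unwinding.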

Fixes: 4d154b1ca5 ("drm/amd/pm: Add support for DPM policies")
Fixes: 25e82f2e2c ("drm/amd/pm: Add temperature metrics sysfs entry")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4c4c138a1c)
2025-11-04 13:18:05 -05:00
Alex Deucher
90b75e12a6 drm/amdgpu: set default gfx reset masks for gfx6-8
These were not set so soft recovery was inadvertently
disabled.

Fixes: 6ac55eab4f ("drm/amdgpu: move reset support type checks into the caller")
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1972763505)
2025-11-04 13:15:43 -05:00
Umesh Nerlige Ramappa
7d44ad6b43 drm/i915: Fix conversion between clock ticks and nanoseconds
When tick values are large, the multiplication by NSEC_PER_SEC overflows
64 bits and results in bad conversions.

The issue is seen in PMU busyness counters that look like they have
wrapped around due to bad conversion. The i915 PMU implementation returns
monotonically increasing counters. If a count is less than the previous
one, it will only return the larger value until the smaller value
catches up. The user will see this as zero delta between two
measurements even though the engines are busy.

Fix it by using mul_u64_u32_div().
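
The overflow is easy to reproduce in userspace. This is a minimal sketch, not
the i915 code: ticks_to_ns_naive() and ticks_to_ns_wide() are hypothetical
names, the 19.2 MHz rate used in the note below is an assumed example
frequency, and unsigned __int128 (a GCC/Clang extension) stands in for the
wide intermediate that mul_u64_u32_div() provides in the kernel.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u

/* Naive: the 64-bit product wraps once ticks > UINT64_MAX / NSEC_PER_SEC. */
static uint64_t ticks_to_ns_naive(uint64_t ticks, uint32_t freq_hz)
{
	return ticks * NSEC_PER_SEC / freq_hz;
}

/* Wide: multiply into 128 bits, then divide, so the product cannot wrap. */
static uint64_t ticks_to_ns_wide(uint64_t ticks, uint32_t freq_hz)
{
	return (uint64_t)(((unsigned __int128)ticks * NSEC_PER_SEC) / freq_hz);
}
```

At an assumed 19.2 MHz, the naive product wraps after roughly 1.8e10 ticks,
which is about 16 minutes of counting; the wide variant keeps returning the
expected value for as long as the result itself fits in 64 bits.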

Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14955
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://lore.kernel.org/r/20251016000350.1152382-2-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 2ada9cb1df)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo: Added the Fixes tag while cherry-picking to fixes]
2025-11-03 11:18:15 -05:00