Commit Graph

1339371 Commits

Author SHA1 Message Date
Huang Rui
793fa8ce4e drm/amdgpu: cleanup sriov function for psp v12
PSP v12 won't have SRIOV function.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-08 11:20:43 -04:00
Alex Deucher
4a89b7698e drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: f756dbac1c ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-08 11:20:19 -04:00
Alex Deucher
a5cb344033 drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: cf424020e0 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-08 11:18:30 -04:00
Huang Rui
518e22b42c drm/amdgpu: remove re-route ih in psp v12
APUs don't have a second IH ring, so the re-routing action here is a no-op.
Waiting for the timeout from PSP takes a lot of time during
initialization. So remove the function in psp v12.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-08 11:18:24 -04:00
Mario Limonciello
b54695dae9 drm/amd: Add per-ring reset for vcn v5.0.0 use
If there is a problem requiring a reset of the VCN engine, it is better to
reset the VCN engine rather than the entire GPU.

Add a reset callback for the ring which will stop and start VCN if an
issue happens.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250506204948.12048-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:48:24 -04:00
Mario Limonciello
b8b6e6f165 drm/amd: Add per-ring reset for vcn v4.0.0 use
If there is a problem requiring a reset of the VCN engine, it is better to
reset the VCN engine rather than the entire GPU.

Add a reset callback for the ring which will stop and start VCN if an
issue happens.

Link: https://lore.kernel.org/r/20250506204948.12048-3-mario.limonciello@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:48:19 -04:00
Mario Limonciello
d1a46cdd00 drm/amd: Add per-ring reset for vcn v4.0.5 use
There is a problem occurring on VCN 4.0.5 where in some situations a job
is timing out.  This triggers a job timeout which then causes a GPU
reset for recovery.  That has exposed a number of issues with GPU reset
that have since been fixed. But also a GPU reset isn't actually needed
for this circumstance. Just restarting the ring is enough.

Add a reset callback for the ring which will stop and start VCN if the
issue happens.

Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3909
Link: https://lore.kernel.org/r/20250506204948.12048-2-mario.limonciello@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:48:11 -04:00
Alex Deucher
5c937b4a60 drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: c9b8dcabb5 ("drm/amdgpu/hdp4.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:47:33 -04:00
Alex Deucher
e8614fc769 Revert "drm/amdgpu: Use generic hdp flush function"
This reverts commit 18a878fd8a.

Revert this temporarily to make it easier to fix a regression
in the HDP handling.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:47:02 -04:00
Dr. David Alan Gilbert
4c83d4538b drm/amd/pm/smu13: Remove unused smu_v3 functions
smu_v13_0_display_clock_voltage_request() and
smu_v13_0_set_min_deep_sleep_dcefclk() were added in 2020 by
commit c05d1c4015 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)")
but have remained unused.

Remove them.

smu_v13_0_display_clock_voltage_request() was the only user
of smu_v13_0_set_hard_freq_limited_range().  Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:46:42 -04:00
Dr. David Alan Gilbert
2c599d66b9 drm/amd/pm/smu11: Remove unused smu_v11_0_get_dpm_level_range
The last use of smu_v11_0_get_dpm_level_range() was removed in 2020 by
commit 46a301e14e ("drm/amd/powerplay: drop unnecessary Navi1x specific
APIs")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:46:21 -04:00
Dr. David Alan Gilbert
1d8d8b8d14 drm/amd/pm/smu7: Remove unused smu7_copy_bytes_from_smc
smu7_copy_bytes_from_smc() was added in 2016 by
commit 1ff55f4651 ("drm/amd/powerplay: implement smu7_smumgr for asics
with smu ip version 7.")

but never used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:46:00 -04:00
Sunil Khatri
c2a3bac7c8 drm/amdgpu: fix the indentation
fix the indentation
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:6992 gfx_v11_ip_dump

compiler: gcc-11 (Debian 11.3.0-12) 11.3.0

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202505071619.7sHTLpNg-lkp@intel.com/
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:45:27 -04:00
Huang Rui
8465f0a372 drm/amdgpu: remove mdelay in psp v12
Since the secure firmware is more stable than it was in the bring-up phase, I
believe we no longer need such mdelays before waiting for the PSP response on PSP v12.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Trigger Huang <Trigger.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:45:16 -04:00
Shane Xiao
2d274bf709 amd/amdkfd: Trigger segfault for early userptr unmmapping
If applications unmap the memory before destroying the userptr, the driver
needs to trigger a segfault to notify user space to correct the free sequence
in VM debug mode.

v2: Send gpu access fault to user space
v3: Report gpu address to user space, remove unnecessary params
v4: update pr_err into one line, remove userptr log info

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:45:09 -04:00
Shane Xiao
8e320f67d4 drm/amdgpu: Add debug bit for userptr usage
In VM debug mode, it is desirable to notify the application
to correct the freeing sequence by unmapping the memory before
destroying the userptr in the old userptr path. Add a bitmask
to decide whether to send a gpu vm fault to the application.

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:45:04 -04:00
Prike Liang
def41146b9 drm/amdgpu: unreserve the gem BO before returning from attach error
The reserved gem BO must be unlocked before returning from
the error path of attaching the eviction fence.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:44:59 -04:00
Prike Liang
926c79ad6e drm/amdgpu: promote the implicit sync to the dependent read fences
The driver doesn't want to implicitly sync on the DMA_RESV_USAGE_BOOKKEEP
usage fences, and the BOOKKEEP fences should be synced explicitly. So the
VM implicit syncing only needs to return and sync the dependent read
fences.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:44:51 -04:00
Alex Deucher
6edc89645c drm/amdgpu/psp: mark securedisplay TA as optional
This is an optional TA which is only available on
certain embedded systems.  Mark it as optional to avoid
user confusion.  This mirrors what we already do for
other optional TAs.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4181
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:44:45 -04:00
Alex Deucher
06f2dcc241 drm/amdgpu: fix pm notifier handling
Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
the resource evictions properly in pm prepare based on whether
we are suspending or hibernating.  Drop the eviction as processes
are not frozen at this time, so we can end up getting stuck trying
to evict VRAM while applications continue to submit work which
causes the buffers to get pulled back into VRAM.

v2: Move suspend flags out of pm notifier (Mario)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
Fixes: 2965e6355d ("drm/amd: Add Suspend/Hibernate notification callback support")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:43:18 -04:00
Ellen Pan
086809c82c drm/amdgpu: Implement unrecoverable error message handling for VFs
This notification may arrive in VF mailbox while polling for response from
another event.

This patch covers the following scenarios:

- If VF is already in RMA state, then do not attempt to contact the host.
  Host will ignore the VF after sending the notification.

- If the notification is detected during polling, then set the RMA status,
  and return error to caller.

- If the notification arrives by interrupt, then set the RMA status and
  queue a reset.  This reset will fail and VF will stop runtime services.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:43:13 -04:00
Ellen Pan
6be34e1d1f drm/amdgpu: Add unrecoverable error message definitions for VFs
Host may stop runtime services after reaching a bad page threshold.

This notification will indicate to the VF that it no longer has
access to the GPU.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:43:07 -04:00
Alex Deucher
ce8f7d9589 Revert "drm/amd: Stop evicting resources on APUs in suspend"
This reverts commit 3a9626c816.

This breaks S4 because we end up setting the s3/s0ix flags
even when we are entering s4 since prepare is used by both
flows.  This causes both the S3/s0ix and s4 flags to be set
which breaks several checks in the driver which assume they
are mutually exclusive.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3634
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:42:30 -04:00
Ruijing Dong
5c89ceda99 drm/amdgpu/vcn: using separate VCN1_AON_SOC offset
VCN1_AON_SOC_ADDRESS_3_0 offset varies on different
VCN generations, the issue in vcn4.0.5 is caused by
a different VCN1_AON_SOC_ADDRESS_3_0 offset.

This patch does the following:

    1. use the same offset for other VCN generations.
    2. use the vcn4.0.5 special offset
    3. update vcn_4_0 and vcn_5_0

Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:42:19 -04:00
Prike Liang
af7160c25c drm/amdgpu: fix the eviction fence dereference
dma_resv_add_fence() already takes a reference to the added fence.
So when attaching the eviction fence to the gem bo, there is no need
to take another reference to it.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:41:54 -04:00
Ellen Pan
5da3d8820d drm/amdgpu: Implement Runtime Bad Page query for VFs
Host will send a notification when new bad pages are available.

Upon guest request, the first 256 bad page addresses
will be placed into the PF2VF region.
Guest should pause the PF2VF worker thread while
the copy is in progress.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:41:49 -04:00
Ellen Pan
6615f1ad34 drm/amdgpu: Add Runtime Bad Page message definitions for VFs
Currently VFs rely on poison consumption interrupt from HW
to kick off the bad page retirement process. Part of this process
includes a VF reset.

This patch adds the following:

1) Host Bad Pages notification message.
2) Guest request bad pages message.

When combined, VFs are able to reserve the pages early, and potentially
avoid future poison consumption that will disrupt user services
from consequent FLR.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:41:43 -04:00
Rodrigo Siqueira
dd3d035a78 Documentation/gpu: Add new entries to amdgpu glossary
Add some additional entries.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:41:39 -04:00
Rodrigo Siqueira
c8305c6327 drm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb
Add some random documentation associated with the ring buffer
manipulations and writeback.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:41:35 -04:00
Eric Huang
f0be138691 drm/amdkfd: change error to warning message for SDMA queues creation
SDMA doesn't support oversubscription. Creating queues over the HW limit
is the user's responsibility and is not supposed to be a KFD error.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:41:27 -04:00
Harry Wentland
d01ca8708d drm/amd/display: Don't check for NULL divisor in fixpt code
[Why]
We check for a NULL divisor but don't act on it.
This check does nothing other than throw a warning.
It does confuse static checkers though:
See https://lkml.org/lkml/2025/4/26/371

[How]
Drop the ASSERTs in both DC and SPL variants.

Fixes: 4562236b3b ("drm/amd/dc: Add dc display driver (v2)")
Fixes: 6efc0ab3b0 ("drm/amd/display: add back quality EASF and ISHARP and dc dependency changes")
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:40:15 -04:00
Ivan Shamliev
e2255687c8 drm/amd/display: Use true/false for boolean variables in DML2 core files
Replace 0 and 1 with false and true for boolean variables in
dml2_core_dcn4_calcs.c and dml2_core_utils.c to align with the Linux
kernel coding style guidelines, which recommend using C99 bool type
with true/false values.

Signed-off-by: Ivan Shamliev <ivan.shamliev.dev@abv.bg>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:40:02 -04:00
James Flowers
3e71fc7c4c drm/amd/display: adds kernel-doc comment for dc_stream_remove_writeback()
Adds kernel-doc for externally linked dc_stream_remove_writeback function.

Signed-off-by: James Flowers <bold.zone2373@fastmail.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07 17:39:55 -04:00
Arvind Yadav
3e50b1d625 drm/amdgpu: only keep most recent fence for each context
Keep only the latest fences to reduce the number of values
given back to userspace

v2: - Export this code from dma-fence-unwrap.c(by Christian).
v3: - To split this in a dma_buf patch and amd userq patch(by Sunil).
    - No need to add a new function just re-use existing(by Christian).
v4: Export dma_fence_dedup_array function and use it(by Christian).

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:58 -04:00
Srinivasan Shanmugam
68071eb0ae drm/amdgpu: Add Support for enforcing isolation without Cleaner Shader
Adjusted the enforce isolation setting handling to include the ability
to disable the cleaner shader without affecting isolation between tasks.

v2: Updated enforce isolation documentation and parameters. (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:53 -04:00
Arvind Yadav
575ec9b0c2 dma-fence: Add helper to sort and deduplicate dma_fence arrays
Export a new helper function `dma_fence_dedup_array()` that sorts
an array of dma_fence pointers by context, then deduplicates the array
by retaining only the most recent fence per context.

This utility is useful when merging or optimizing sets of fences where
redundant entries from the same context can be pruned. The operation is
performed in-place and releases references to dropped fences using
dma_fence_put().

v2: - Export this code from dma-fence-unwrap.c(by Christian).
v3: - To split this in a dma_buf patch and amd userq patch(by Sunil).
    - No need to add a new function just re-use existing(by Christian).
v4: - Export dma_fence_dedup_array and use it(by Christian).

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:44 -04:00
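
The sort-and-deduplicate step described in commit 575ec9b0c2 above can be sketched in plain userspace C. This is only an illustration of the idea (sort by context, keep the newest fence per context, drop a reference on each discarded entry), not the kernel implementation. The `toy_fence` type and the `toy_*` helpers are hypothetical stand-ins for `struct dma_fence` and `dma_fence_put()`, and the sketch compares sequence numbers directly, which ignores any wraparound handling a real implementation may need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dma_fence: only the fields the sketch needs. */
struct toy_fence {
    uint64_t context;  /* timeline the fence belongs to */
    uint64_t seqno;    /* position on that timeline; larger means newer */
    int refcount;      /* simplified reference count */
};

/* Hypothetical stand-in for dma_fence_put(): drop one reference. */
static void toy_fence_put(struct toy_fence *f)
{
    f->refcount--;
}

/* qsort comparator: order by context first, then by seqno (oldest first). */
static int toy_fence_cmp(const void *a, const void *b)
{
    const struct toy_fence *fa = *(struct toy_fence *const *)a;
    const struct toy_fence *fb = *(struct toy_fence *const *)b;

    if (fa->context != fb->context)
        return fa->context < fb->context ? -1 : 1;
    if (fa->seqno != fb->seqno)
        return fa->seqno < fb->seqno ? -1 : 1;
    return 0;
}

/*
 * Sort the array in place and keep only the newest fence of each context.
 * Returns the number of entries that remain at the front of the array.
 */
static size_t toy_fence_dedup(struct toy_fence **fences, size_t count)
{
    size_t i, out = 0;

    if (count == 0)
        return 0;
    assert(fences != NULL);

    qsort(fences, count, sizeof(*fences), toy_fence_cmp);

    for (i = 1; i < count; i++) {
        if (fences[i]->context == fences[out]->context) {
            /* Same context: the later array entry is newer, drop the older one. */
            toy_fence_put(fences[out]);
            fences[out] = fences[i];
        } else {
            fences[++out] = fences[i];
        }
    }
    return out + 1;
}
```

Because the array is sorted by context and then by sequence number, the last entry seen for a context is always the newest, so a single linear pass after the sort is enough.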
Sunil Khatri
71353c1a4f drm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver
Update the functions in amdgpu_userqueues.c from
DRM_DBG_DRIVER to drm_dbg_driver so that the GPU instance
is logged on multi-GPU systems.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:38 -04:00
Sunil Khatri
c46a37628a drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userq.c
change the DRM_ERROR and drm_err to drm_file_err
to add process name and pid to the logging.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:33 -04:00
Sunil Khatri
8c97cdb1a6 drm/amdgpu: use drm_file_err in fence timeouts
use drm_file_err instead of DRM_ERROR which adds
process and pid information in the userqueue error
logging.

Sample log:

[   19.802315] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: ibus-x11 pid: 2055 client: Unset ... Couldn't unmap all the queues
[   19.802319] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: ibus-x11 pid: 2055 client: Unset ... Failed to evict userqueue
[   19.838432] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: systemd-logind pid: 1042 client: Unset ... Couldn't unmap all the queues
[   19.838436] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: systemd-logind pid: 1042 client: Unset ... Failed to evict userqueue

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:25 -04:00
Sunil Khatri
30ff75809d drm/amdgpu: add drm_file reference in userq_mgr
drm_file will be used in usermode queues code to
enable better process information in logging and hence
add drm_file part of the userq_mgr struct.

update the drm_file pointer in userq_mgr for each
amdgpu_driver_open_kms.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:18 -04:00
Sunil Khatri
fc3817fb49 drm: add drm_file_err function to add process info
Add a drm helper function which appends the process information for
the drm_file over drm_err formatted output.

v5: change to macro from function (Christian Koenig)
    add helper functions for lock/unlock (Christian Koenig)

v6: remove __maybe_unused and make function inline (Jani Nikula)
    remove drm_print.h

v7: Use va_format and %pV to concatenate fmt and vargs (Jani Nikula)

v8: Code formatting and typos (Tvrtko Ursulin)

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:09 -04:00
Taimur Hassan
c38de9db74 drm/amd/display: Promote DC to 3.2.331
Summary

* Remove redundant NULL check
* Fix invalid context error in dml helper
* Prepare for Fused I2C-over-AUX
* Allow DSCClock disable
* Vmax / Vmin update for Vsync
* Fix race condition in DPIA AUX transfer
* Fix wrong handling for AUX_DEFER case
* Only wait for required space in DMUB mailbox

Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:29:06 -04:00
Dillon Varone
dbc5b24fff drm/amd/display: Only wait for required free space in DMUB mailbox
[WHY&HOW]
When command submission is blocked by a full mailbox, only wait for
enough space to free to submit the command, instead of waiting for idle.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:28:59 -04:00
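
The difference between waiting for idle and waiting only for the required free space, as described in commit dbc5b24fff above, can be sketched with a toy ring buffer. Everything here is an assumption made for illustration: `toy_ring`, `toy_consume_one()` (which stands in for firmware retiring one command) and the slot-based accounting are hypothetical and do not mirror the DMUB mailbox layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mailbox ring, counted in command slots. */
struct toy_ring {
    uint32_t rptr;  /* next slot the consumer will read */
    uint32_t wptr;  /* next slot the producer will write */
    uint32_t size;  /* total slots; one is kept empty to tell full from empty */
};

/* Number of slots that can still be written. */
static uint32_t toy_ring_free(const struct toy_ring *r)
{
    uint32_t used = (r->wptr + r->size - r->rptr) % r->size;

    return r->size - 1 - used;
}

/* Stand-in for the consumer side retiring one command (assumption). */
static void toy_consume_one(struct toy_ring *r)
{
    if (r->rptr != r->wptr)
        r->rptr = (r->rptr + 1) % r->size;
}

/*
 * Poll until at least 'needed' slots are free, instead of waiting for the
 * ring to drain completely. Returns false if the poll budget runs out.
 */
static bool toy_ring_wait_for_space(struct toy_ring *r, uint32_t needed,
                                    unsigned int max_polls)
{
    assert(r != NULL && needed < r->size);

    while (toy_ring_free(r) < needed) {
        if (max_polls == 0)
            return false;
        max_polls--;
        toy_consume_one(r);  /* a real driver would re-read the hardware rptr here */
    }
    return true;
}
```

With a full ring and a two-slot command, the loop returns after two slots have been retired, while the rest of the queued commands are still pending.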
Meenakshikumar Somasundaram
59510792ba drm/amd/display: Assign preferred stream encoder instance to dpia
[Why]
For dpia, preferred engine instance availability is not checked
when assigning stream encoder instance.

[How]
Check for dpia preferred engine id and assign the same stream
encoder instance for the stream if available.

Reviewed-by: PeiChen Huang <peichen.huang@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:28:47 -04:00
Wayne Lin
3637e457eb drm/amd/display: Fix wrong handling for AUX_DEFER case
[Why]
We incorrectly ack that all bytes were written when the reply is actually
a defer. A defer means the sink is not ready for the request. We should
retry the request.

[How]
Only report that all data was written when I2C_ACK|AUX_ACK is received.
Otherwise, report the number of bytes actually written, as received from the sink.
Add some messages to facilitate debugging as well.

Fixes: ad6756b4d7 ("drm/amd/display: Shift dc link aux to aux_payload")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:26:43 -04:00
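
The reply handling described in commit 3637e457eb above can be sketched as a small decision function. This is a simplified model under stated assumptions: the `toy_aux_reply` enum collapses the combined I2C/AUX reply into three hypothetical values, and the function names are illustrative and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reply codes for this sketch only; not the driver's values. */
enum toy_aux_reply {
    TOY_AUX_ACK,    /* stands in for I2C_ACK|AUX_ACK */
    TOY_AUX_NACK,
    TOY_AUX_DEFER,  /* sink is not ready for the request */
};

/*
 * Number of bytes the caller may treat as written.
 * Only a full ACK means the whole request went through; for any other
 * reply, use the count the sink actually reported.
 */
static size_t toy_aux_bytes_written(enum toy_aux_reply reply,
                                    size_t requested, size_t sink_reported)
{
    assert(sink_reported <= requested);

    if (reply == TOY_AUX_ACK)
        return requested;
    return sink_reported;
}

/* A defer means the request should be retried later. */
static bool toy_aux_should_retry(enum toy_aux_reply reply)
{
    return reply == TOY_AUX_DEFER;
}
```

In this model a deferred 16-byte write for which the sink reports 4 bytes yields 4, and the caller retries, rather than treating all 16 bytes as acknowledged.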
Wayne Lin
9b540e3fe6 drm/amd/display: Copy AUX read reply data whenever length > 0
[Why]
amdgpu_dm_process_dmub_aux_transfer_sync() should return exactly the data
replied by the sink side. Don't analyze the reply in it.

[How]
Remove unnecessary check condition AUX_TRANSACTION_REPLY_AUX_ACK.

Fixes: ead08b95fa ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:25:31 -04:00
Wayne Lin
81b5c6fa62 drm/amd/display: Remove incorrect checking in dmub aux handler
[Why & How]
"Request length != reply length" is expected behavior defined in spec.
It's not an invalid reply. Besides, replied data handling logic is not
designed to be written in amdgpu_dm_process_dmub_aux_transfer_sync().
Remove the incorrect handling section.

Fixes: ead08b95fa ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:01:50 -04:00
Wayne Lin
1db6c9e9b6 drm/amd/display: Fix the checking condition in dmub aux handling
[Why & How]
Fix the checking condition for detecting AUX_RET_ERROR_PROTOCOL_ERROR.
It was wrongly checked with "not equal to".

Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 13:01:44 -04:00
Wayne Lin
d5c9ade755 drm/amd/display: Shift DMUB AUX reply command if necessary
[Why]
The defined value of the dmub AUX reply command field was updated, but the
dm receiving side was not adjusted accordingly.

[How]
Check the received reply command value to see whether it is the updated
version. Adjust it if necessary.

Fixes: ead08b95fa ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 12:58:55 -04:00
Dillon Varone
4465dd0e41 drm/amd/display: Refactor SubVP cursor limiting logic
[WHY]
There are several gaps that can result in SubVP being enabled with
incompatible HW cursor sizes, and unjust restrictions to cursor size due
to wrong predictions on future usage of SubVP.

[HOW]
- remove "prediction" logic in favor of tagging based on previous SubVP
  usage
- block SubVP if current HW cursor settings are incompatible
- provide interface for DM to determine if HW cursor should be disabled
  due to an attempt to enable SubVP

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-05 12:58:49 -04:00