Commit Graph

1323631 Commits

Author SHA1 Message Date
Michel Dänzer
695c2c745e drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update
Third time's the charm, I hope?

Fixes: d3116756a7 ("drm/ttm: rename bo->mem and make it a pointer")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3837
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:08 -05:00
Mirsad Todorovac
a21ab06b8c drm/admgpu: replace kmalloc() and memcpy() with kmemdup()
The static analyser tool gave the following advice:

./drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:1266:7-14: WARNING opportunity for kmemdup

 → 1266         tmp = kmalloc(used_size, GFP_KERNEL);
   1267         if (!tmp)
   1268                 return -ENOMEM;
   1269
 → 1270         memcpy(tmp, &host_telemetry->body.error_count, used_size);

Replacing kmalloc() + memcpy() with kmemdump() doesn't change semantics.
Original code works without fault, so this is not a bug fix but proposed improvement.

Link: https://lwn.net/Articles/198928/
Fixes: 84a2947ecc ("drm/amdgpu: Implement virt req_ras_err_count")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Zhigang Luo <Zhigang.Luo@amd.com>
Cc: Victor Skvortsov <victor.skvortsov@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Yunxiang Li <Yunxiang.Li@amd.com>
Cc: Jack Xiao <Jack.Xiao@amd.com>
Cc: Vignesh Chander <Vignesh.Chander@amd.com>
Cc: Danijel Slivka <danijel.slivka@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mirsad Todorovac <mtodorovac69@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:08 -05:00
Srinivasan Shanmugam
b64f2f3e87 drm/amd/display: Fix NULL pointer dereference in dmub_tracebuffer_show
It corrects the issue by checking if 'adev->dm.dmub_srv' is NULL before
accessing its 'meta_info' member. This ensures that we do not
dereference a NULL pointer.

Fixes the below:
	drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_debugfs.c:917 dmub_tracebuffer_show()
	warn: address of 'adev->dm.dmub_srv->meta_info' is non-NULL

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_debugfs.c
    901 static int dmub_tracebuffer_show(struct seq_file *m, void *data)
    902 {
    903         struct amdgpu_device *adev = m->private;
    904         struct dmub_srv_fb_info *fb_info = adev->dm.dmub_fb_info;
    905         struct dmub_fw_meta_info *fw_meta_info = &adev->dm.dmub_srv->meta_info;
                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Even if adev->dm.dmub_srv is NULL, the address of ->meta_info can't be NULL

    906         struct dmub_debugfs_trace_entry *entries;
    907         uint8_t *tbuf_base;
    908         uint32_t tbuf_size, max_entries, num_entries, first_entry, i;
    909
    910         if (!fb_info)
    911                 return 0;
    912
    913         tbuf_base = (uint8_t *)fb_info->fb[DMUB_WINDOW_5_TRACEBUFF].cpu_addr;
    914         if (!tbuf_base)
    915                 return 0;
    916
--> 917         tbuf_size = fw_meta_info ? fw_meta_info->trace_buffer_size :
                            ^^^^^^^^^^^^ Always non-NULL

    918                                    DMUB_TRACE_BUFFER_SIZE;
    919         max_entries = (tbuf_size - sizeof(struct dmub_debugfs_trace_header)) /
    920                       sizeof(struct dmub_debugfs_trace_entry);
    921
    922         num_entries =

v2: Initialize struct dmub_fw_meta_info *fw_meta_info to NULL (Dan Carpenter)

Fixes: 5a498172c8 ("drm/amd/display: Make DMCUB tracebuffer debugfs chronological")
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:08 -05:00
Philip Yang
e37ccf44ac drm/amdgpu: Show warning message if IH ring overflow
If IH primary ring and KFD ih fifo overflows, we may miss CP, SDMA
interrupts and cause application soft hang. Show warning message with
ring name if overflow happens.

Add function to get ih ring name to avoid duplicating it. To keep
warning message consistent between GPU generations, change all
*_ih.c except ASICs older than Vega which has only one ih ring.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Philip Yang
de844846f7 drm/amdkfd: Improve signal event slow path
If event slot is not signaled, kfd_signal_event_interrupt goes to slow
path to scan all event slots to find the signaled event, this is needed
for old ASICs that don't have the event ID or the event IDs are
incorrect in the IH payload.

There is case that GPU signal the same event twice, then driver process
the first event interrupt, set_event and event slot is auto-reset, then
for the second event interrupt, KFD goes to slow path as event is not
signaled, just drop the second event interrupt because the application
only need wakeup once.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Philip Yang
34db5a3261 drm/amdkfd: Queue interrupt work to different CPU
For CPX mode, each KFD node has interrupt worker to process ih_fifo to
send events to user space. Currently all interrupt workers of same adev
queue to same CPU, all workers execution are actually serialized and
this cause KFD ih_fifo overflow when CPU usage is high.

Use per-GPU unbounded highpri queue with number of workers equals to
number of partitions, let queue_work select the next CPU round robin
among the local CPUs of same NUMA.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Philip Yang
1b00143231 drm/amdgpu: Optimize gfx v9 GPU page fault handling
After GPU page fault, there are lots of page fault interrupts generated
at short period even with CAM filter enabled because the fault address
is different. Each page fault copy to KFD ih fifo to send event to user
space by KFD interrupt worker, this could cause KFD ih fifo overflow
while other processes generate events at same time.

KFD process is aborted after GPU page fault, we only need one GPU page
fault interrupt sent to KFD ih fifo to send memory exception event to
user space.

Incease KFD ih fifo size to 2 times of IH primary ring size, to handle
the burst events case.

This patch handle the gfx v9 path, cover retry on/off and CAM filter
on/off cases.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Philip Yang
f607b2b867 drm/amdkfd: KFD interrupt access ih_fifo data in-place
To handle 40000 to 80000 interrupts per second running CPX mode with 4
streams/queues per KFD node, KFD interrupt handler becomes the
performance bottleneck.

Remove the kfifo_out memcpy overhead by accessing ih_fifo data in-place
and updating rptr with kfifo_skip_count.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Christian König
11815bb0e3 drm/amdgpu: partially revert "reduce reset time"
This partially reverts commit 194eb174cb.

This commit introduced a new state variable into adev without even
remotely worrying about CPU barriers.

Since we already have the amdgpu_in_reset() function exactly for this
use case partially revert that.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Christian König
26c95e838e drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare
As soon as the prepare phase is completed the VM might be released,
better set it to NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Christian König
57f812d171 drm/amdgpu: fix amdgpu_coredump
The VM pointer might already be outdated when that function is called.
Use the PASID instead to gather the information instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Candice Li
d1ebe307b4 drm/amdgpu: Enable psp v14_0_3 RAS support for non-SRIOV configurations.
Enable psp v14_0_3 RAS support for non-SRIOV configurations.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Philip Yang
b4b7271e5c drm/amdgpu: Don't enable sdma 4.4.5 CTXEMPTY interrupt
The sdma context empty interrupt is dropped in amdgpu_irq_dispatch
as unregistered interrupt src_id 243, this interrupt accounts to 1/3 of
total interrupts and causes IH primary ring overflow when running
stressful benchmark application. Disable this interrupt has no side
effect found.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:39:07 -05:00
Karol Przybylski
34c4eb7d4e drm/amdgpu: Fix potential integer overflow in scheduler mask calculations
The use of 1 << i in scheduler mask calculations can result in an
unintentional integer overflow due to the expression being
evaluated as a 32-bit signed integer.

This patch replaces 1 << i with 1ULL << i to ensure the operation
is performed as a 64-bit unsigned integer, preventing overflow

Discovered in coverity scan, CID 1636393, 1636175, 1636007, 1635853

Fixes: c5c63d9cb5 ("drm/amdgpu: add amdgpu_gfx_sched_mask and amdgpu_compute_sched_mask debugfs")
Signed-off-by: Karol Przybylski <karprzy7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:38:42 -05:00
Alex Deucher
8f2cd1067a drm/amdgpu/smu14.0.2: fix IP version check
Use the helper function rather than reading it directly.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:22:27 -05:00
Alex Deucher
f1fd1d0f40 drm/amdgpu/gfx12: fix IP version check
Use the helper function rather than reading it directly.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:22:24 -05:00
Alex Deucher
63bfd24088 drm/amdgpu/mmhub4.1: fix IP version check
Use the helper function rather than reading it directly.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:22:22 -05:00
Alex Deucher
2c8eeaaa0f drm/amdgpu/nbio7.11: fix IP version check
Use the helper function rather than reading it directly.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:22:19 -05:00
Alex Deucher
0ec43fbece drm/amdgpu/nbio7.0: fix IP version check
Use the helper function rather than reading it directly.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:22:16 -05:00
Alex Deucher
22b9555bc9 drm/amdgpu/nbio7.7: fix IP version check
Use the helper function rather than reading it directly.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:22:03 -05:00
Aric Cyr
824ed4cb62 drm/amd/display: 3.2.314
DC 3.2.314 contains some improvements as summarized below:

* Update DML21 code.
* Fixes for FAMS2 interface.
* HDMI fixes.
* Compilation warning fixes.

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:21:56 -05:00
George Shen
83626efdce drm/amd/display: Disable MPC rate control on ODM pipe update
[Why]
Seamless boot skips MPC init for the active pipe, resulting in stale MPC
rate control state being retained. This will cause issues since other
logic assumes it is disabled (as DCN30 and newer does not need it).

[How]
Disable MPC rate control on ODM pipe update to cover the seamless boot
case.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:21:50 -05:00
Chris Park
95265e4b2b drm/amd/display: Block Invalid TMDS operation
[Why]
When sink type is TMDS, PHY programming does not block against pixel
clock greater than 600MHz.

[How]
Based on sink type, block greater than 600MHz phy programming.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:21:37 -05:00
Dillon Varone
f9dfa31ff7 drm/amd/display: Re-validate streams on commit_streams
To prevent invalid HW programming, streams should be revalidated first
before committing to HW.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:21:26 -05:00
Rodrigo Siqueira
04d6273fae Revert "drm/amd/display: Fix green screen issue after suspend"
This reverts commit 87b7ebc2e1.

A long time ago, we had an issue with the Raven system when it was
connected to two displays: one with DP and another with HDMI. After the
system woke up from suspension, we saw a solid green screen caused by an
underflow generated by bad DCC metadata. To workaround this issue, the
'commit 87b7ebc2e1 ("drm/amd/display: Fix green screen issue after
suspend")' was introduced to disable the DCC for a few frames after in
the resume phase. However, in hindsight, this solution was probably a
workaround at the kernel level for some issues from another part
(probably other driver components or user space). After applying this
patch and trying to reproduce the green issue in a similar hardware
system but using the latest kernel and userspace, we cannot see the
issue, which makes this workaround obsolete and creates extra
unnecessary complexity to the code; for all of this reason, this commit
reverts the original change.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:20:21 -05:00
Alex Hung
1b0cbcf888 drm/amd/display: Fix uninitialized variables in amdgpu_dm_debugfs
[WHAT]
Some fields in struct dc_link_settings and link_training_settings are
not initialized and using them can cause unexpected results.

[HOW]
Initialize struct dc_link_settings and link_training_settings to zero.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:20:14 -05:00
Nicholas Kazlauskas
57a793a74f drm/amd/display: Apply (some) policy for DML2 formulation on DCN35/DCN351
[Why]
Dropping the entirety of dml2_policy_build_synthetic_soc_states exposes
an issue for states that cannot be filled via bbox_overrides and rely on
the default parameters that may or may not be present depending on the
DM.

For amdgpu_dm this results in missing parameters for most of the struct
in higher states:

- sr_exit_time_us
- sr_enter_plus_exit_time_us
- sr_exit_z8_time_us
- sr_enter_plus_exit_z8_time_us
- urgent_latency_pixel_data_only_us
- urgent_latency_pixel_mixed_with_vm_data_us
- urgent_latency_vm_data_only_us
- dram_clock_change_latency_us
- fclk_change_latency_us
- usr_retraining_latency_us
- writeback_latency_us
- urgent_latency_adjustment_fabric_clock_component_us
- urgent_latency_adjustment_fabric_clock_reference_mhz
- dscclk_mhz
- phyclk_mhz
- phyclk_d18_mhz
- phyclk_d32_mhz
- use_ideal_dram_bw_strobe

[How]
Copy from the first state, applying a minimal policy to set max clocks
for SOC independent values.

Then copy the SOC dependent ones from the states modified by
bbox_overrides.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:20:05 -05:00
Shunlu Zhang
5b0766f2de drm/amd/display: delete legacy code
Delete unused code.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Shunlu Zhang <Shunlu.Zhang@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:19:57 -05:00
Dillon Varone
b486bc9e87 drm/amd/display: Add new message for DF throttling optimization on dcn401
[WHY]
When effective bandwidth from the SoC is enough to perform SubVP
prefetchs, then DF throttling is not required.

[HOW]
Provide SMU the required clocks for which DF throttling is not required.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:19:40 -05:00
Austin Zheng
be4e350931 drm/amd/display: DML21 Reintegration For Various Fixes
Reintegrate latest DML21 code.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:19:34 -05:00
Harry VanZyllDeJong
bb4090cda9 drm/amd/display: Fix brightness adjustment on MiniLED
[Why]
Older Asics were changed to target new DCN while still needing older
support causing brightness adjustments to fail.

[How]
Reverted the DCN targets on required DCNs

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Iswara Nagulendran <iswara.nagulendran@amd.com>
Signed-off-by: Harry VanZyllDeJong <hvanzyll@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:19:21 -05:00
Fangzhi Zuo
e56ad45e99 drm/amd/display: Fix Mode Cutoff in DSC Passthrough to DP2.1 Monitor
Source --> DP2.1 MST hub --> DP1.4/2.1 monitor

When change from DP1.4 to DP2.1 from monitor manual, modes higher than
4k120 are all cutoff by mode validation. Switch back to DP1.4 gets all
the modes up to 4k240 available to be enabled by dsc passthrough.

[why]
Compared to DP1.4 link from hub to monitor, DP2.1 link has larger
full_pbn value that causes overflow in the process of doing conversion
from pbn to kbps.

[how]
Change the data type accordingly to fit into the data limit during
conversion calculation.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:53 -05:00
Charlene Liu
e823421d6c drm/amd/display: init dc_power_state
Initialize the power state for dc use

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:46 -05:00
Meera Patel
12e4ec5d45 drm/amd/display: initialize uninitialized variable
[WHY]
There is one uninitialized variable in file
dc/hwss/dcn401/dcn401_hwseq.c, which trigger com compile warnings.

[HOW]
Initialize the unininitialized variable.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Ariel Bernstein <eric.bernstein@amd.com>
Signed-off-by: Meera Patel <meera.patel@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:40 -05:00
Dillon Varone
55eeaaec0d drm/amd/display: Add support for FAMS2+ interface versions
Current driver interface does not allow for flexibility in coexistence
of multiple interface versions, so add support for checking minor
interface revisions and providing appropriate programming.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:33 -05:00
Alvin Lee
3f238a6bd2 drm/amd/display: Update FAMS2 config cmd
The FAMS2 stream and sub-state have been separated into
2 different commands. Update the cmd function to send
one command each for the stream and sub-state.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:26 -05:00
Andrew Martin
357ef5b3b7 drm/amdgpu: Failed to check various return code
Clean up code to quiet the compiler on us failing to check the return
code.

Signed-off-by: Andrew Martin <Andrew.Martin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:20 -05:00
Andrew Martin
88a45aa608 drm/amdkfd: Failed to check various return code
This patch checks and warns if pdd is NULL.

Signed-off-by: Andrew Martin <Andrew.Martin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:11 -05:00
Lijo Lazar
635c659fce drm/amdgpu: Use dbg level for VBIOS check messages
Driver has different ways to fetch VBIOS. If one of the methods doesn't
find an authentic one, it will show misleading info messages eventhough
a subsequent method finds a valid VBIOS. Keep the message level at debug
and add device context.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:18:04 -05:00
Pierre-Eric Pelloux-Prayer
54a1b36d4b drm/amdgpu: remove useless init from amdgpu_job_alloc
This init is useless because base.sched will be cleared to 0 in drm_sched_job_init
because of commit 2320c9e6a7 ("drm/sched: memset() 'job' in drm_sched_job_init()").

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:17:46 -05:00
Pierre-Eric Pelloux-Prayer
0014952b17 drm/amdgpu: drop the amdgpu_device argument from amdgpu_ib_free
It's unused.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:17:32 -05:00
Pierre-Eric Pelloux-Prayer
2ae520cb12 drm/amdgpu: don't access invalid sched
Since 2320c9e6a7 ("drm/sched: memset() 'job' in drm_sched_job_init()")
accessing job->base.sched can produce unexpected results as the initialisation
of (*job)->base.sched done in amdgpu_job_alloc is overwritten by the
memset.

This commit fixes an issue when a CS would fail validation and would
be rejected after job->num_ibs is incremented. In this case,
amdgpu_ib_free(ring->adev, ...) will be called, which would crash the
machine because the ring value is bogus.

To fix this, pass a NULL pointer to amdgpu_ib_free(): we can do this
because the device is actually not used in this function.

The next commit will remove the ring argument completely.

Fixes: 2320c9e6a7 ("drm/sched: memset() 'job' in drm_sched_job_init()")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:16:35 -05:00
Jiapeng Chong
6f685a8134 drm/amd/display: use swap() in update_phy_id_mapping()
Use existing swap() function rather than duplicating its implementation.

./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c:185:47-48: WARNING opportunity for swap().
./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c:125:53-54: WARNING opportunity for swap().

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=12335
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:16:30 -05:00
Dheeraj Reddy Jonnalagadda
69b54d7c7c drm/amdgpu: simplify return statement in amdgpu_ras_eeprom_init
Remove the logically dead code in the last return statement of
amdgpu_ras_eeprom_init. The condition res < 0 is redundant since
res is already checked for a negative value earlier. Replace
return res < 0 ? res : 0; with return 0 to improve clarity.

Fixes: 63d4c081a5 ("drm/amdgpu: Optimize EEPROM RAS table I/O")
Closes: https://scan7.scan.coverity.com/#/project-view/52337/11354?selectedIssue=1602413
Signed-off-by: Dheeraj Reddy Jonnalagadda <dheeraj.linuxdev@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:16:05 -05:00
Alex Deucher
7e50642d41 drm/amd/display: add non-DC drm_panic support
Add support for the drm_panic module, which displays a pretty user
friendly message on the screen when a Linux kernel panic occurs.

Adapt Lu Yao's code to use common helpers derived from
Jocelyn's patch.  This extends the non-DC code to enable
access to non-CPU accessible VRAM and adds support for
other DCE versions.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>
2024-12-18 12:16:01 -05:00
Jocelyn Falempe
736692c3b7 drm/amd/display: add DC drm_panic support
Add support for the drm_panic module, which displays a pretty user
friendly message on the screen when a Linux kernel panic occurs.

It doesn't work yet on laptop panels, maybe due to PSR.

Adapted from Jocelyn's original patch to add DC drm_panic
support.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>
2024-12-18 12:15:56 -05:00
Mario Limonciello
1ad5bdc28b drm/amd: Require CONFIG_HOTPLUG_PCI_PCIE for BOCO
If the kernel hasn't been compiled with PCIe hotplug support this
can lead to problems with dGPUs that use BOCO because they effectively
drop off the bus.

To prevent issues, disable BOCO support when compiled without PCIe hotplug.

Reported-by: Gabriel Marcano <gabemarcano@yahoo.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707#note_2696862
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20241211155601.3585256-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:15:40 -05:00
Bokun Zhang
3676f37a88 drm/amdgpu/vcn: reset fw_shared under SRIOV
- The previous patch only considered the case for baremetal
  and is not applicable for SRIOV code path. We also need to
  init fw_share for SRIOV VF

Fixes: 928cd772e1 ("drm/amdgpu/vcn: reset fw_shared when VCPU buffers corrupted on vcn v4.0.3")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18 12:14:16 -05:00
Alex Deucher
b7a287fa0c drm/amd/display/dc: add helper for panic updates
Add a DC helper for panic updates.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>
2024-12-18 12:14:09 -05:00
Alex Deucher
98471006ae drm/amd/display: add clear_tiling mi callbacks
This adds clear_tiling callbacks to the mi structure that
will be used for drm panic support to clear the tiling on
a display.  Mem input (mi) is used on DCE based display
IPs.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Lu Yao <yaolu@kylinos.cn>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>
2024-12-18 12:14:04 -05:00