On ALDEBARAN, we need to traverse all column bits higher than
BIT11(C4C3C2) in a row, the shift of R14 bit should be also taken
into account. Retire all pages we find.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
One piece of umc normalizing address can be mapped to 16 pieces of
physical address in each umc channel on ALDEBARAN.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Create common amdgpu_umc_fill_error_record function for all versions
of UMC and clean up related codes.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A number of BIOS versions have a problem with the watermarks table not
being configured properly. This manifests as a very scary looking warning
during resume from s0i3. This should be harmless in most cases and is well
understood, so decrease the assertion to a clearer warning about the problem.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On GPUs with RAS, poison can propagate between processes if VRAM is not
cleared when it is freed or allocated. The reason is, that not all write
accesses clear RAS poison. 32-byte writes by the SDMA engine do clear RAS
poison. Clearing memory in the background when it is freed should avoid
major performance impact. KFD has been doing this already for a long time.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
In rmmod procedure, kfd sends cp a dequeue request, but the
request does not get response, then an error message "cp
queue pipe 4 queue 0 preemption failed" printed.
[how]
Performing kfd suspending after disabling gfxoff can fix it.
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The sub-routine(amdgpu_gfx_off_ctrl) tried to obtain the lock
adev->pm.mutex which was actually hold by amdgpu_dpm_force_performance_level.
A deadlock happened then.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The existing way cannot handle Beige Goby well as a different
PPTable data structure(PPTable_beige_goby_t instead of PPTable_t)
is used there.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
configure_dp_hpo_throttled_vcp_size() was missing promotion before, but it was covered by
not calling the missing function hook in the old interface hpo_dp_link_encoder->funcs.
Recent refactor replaces with new caller link_hwss->set_throttled_vcp_size
which needs that hook, and that causes null ptr hang.
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd_process_notifier_release flush svm_range_restore_work
which calls svm_range_list_lock_and_flush_work to flush deferred_list
work, but if deferred_list work mmput release the last user, it will
call exit_mmap -> notifier_release, it is deadlock with below backtrace.
Move flush svm_range_restore_work to kfd_process_wq_release to avoid
deadlock. Then svm_range_restore_work take task->mm ref to avoid mm is
gone while validating and mapping ranges to GPU.
Workqueue: events svm_range_deferred_list_work [amdgpu]
Call Trace:
wait_for_completion+0x94/0x100
__flush_work+0x12a/0x1e0
__cancel_work_timer+0x10e/0x190
cancel_delayed_work_sync+0x13/0x20
kfd_process_notifier_release+0x98/0x2a0 [amdgpu]
__mmu_notifier_release+0x74/0x1f0
exit_mmap+0x170/0x200
mmput+0x5d/0x130
svm_range_deferred_list_work+0x104/0x230 [amdgpu]
process_one_work+0x220/0x3c0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reported-by: Ruili Ji <ruili.ji@amd.com>
Tested-by: Ruili Ji <ruili.ji@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
svm_deferred_list work should continue to handle deferred_range_list
which maybe split to child range to avoid child range leak, and remove
ranges mmu interval notifier to avoid mm mm_count leak. So taking mm
reference when adding range to deferred list, to ensure mm is valid in
the scheduled deferred_list_work, and drop the mm referrence after range
is handled.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reported-by: Ruili Ji <ruili.ji@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SVM ioctls take proper svms->lock to handle race conditions, don't need
take process mutex to serialize ioctls. This also fixes circular locking
warning:
WARNING: possible circular locking dependency detected
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock((work_completion)(&svms->deferred_list_work));
lock(&process->mutex);
lock((work_completion)(&svms->deferred_list_work));
lock(&process->mutex);
*** DEADLOCK ***
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Unused. Convert the divisions into asserts on the divisor, to
debug why it is zero. The divide by zero is suspected of causing
kernel panics.
While I have no idea where the zero is coming from I think this
patch is a positive either way.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change a static function's comment from "/**" (indicating kernel-doc
notation) to "/*" (indicating a regular C language comment).
This prevents multiple kernel-doc warnings:
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:4343: warning: Function parameter or member 'max_supported_frl_bw_in_kbps' not described in 'intersect_frl_link_bw_support'
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:4343: warning: Function parameter or member 'hdmi_encoded_link_bw' not described in 'intersect_frl_link_bw_support'
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:4343: warning: expecting prototype for Return PCON's post FRL link training supported BW if its non(). Prototype was for intersect_frl_link_bw_support() instead
Fixes: c022375ae0 ("drm/amd/display: Add DP-HDMI FRL PCON Support in DC")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Fangzhi Zuo <Jerry.Zuo@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The pointer reg is being assigned a value that is not read, the
exit path via label 'out' never accesses it. The assignment is
redundant and can be removed.
Cleans up clang scan build warning:
drivers/gpu/drm/radeon/radeon_object.c:570:3: warning: Value
stored to 'reg' is never read [deadcode.DeadStores]
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
clang static analysis reports this represenative problem
amdgpu_smu.c:144:18: warning: The left operand of '*' is a garbage value
return clk_freq * 100;
~~~~~~~~ ^
If there is no get_dpm_ultimate_freq function,
smu_get_dpm_freq_range returns success without setting the
output min,max parameters. So return an -ENOTSUPP error.
Fixes: e5ef784b1e ("drm/amd/powerplay: revise calling chain on retrieving frequency range")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mirrors the logic for dcn30. Cue lots of WARNs and some
kernel panics without this fix.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It calls populate_dml_pipes which uses doubles to initialize the
scale_ratio_depth params. Mirrors the dcn20 logic.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In amdgpu_dm_connector_add_common_modes(), amdgpu_dm_create_common_mode()
is assigned to mode and is passed to drm_mode_probed_add() directly after
that. drm_mode_probed_add() passes &mode->head to list_add_tail(), and
there is a dereference of it in list_add_tail() without recoveries, which
could lead to NULL pointer dereference on failure of
amdgpu_dm_create_common_mode().
Fix this by adding a NULL check of mode.
This bug was found by a static analyzer.
Builds with 'make allyesconfig' show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: e7b07ceef2 ("drm/amd/display: Merge amdgpu_dm_types and amdgpu_dm")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In calculate_bandwidth(), the tag free_sclk and free_yclk are reversed,
which could lead to a memory leak of yclk.
Fix this bug by changing the location of free_sclk and free_yclk.
This bug was found by a static analyzer.
Builds with 'make allyesconfig' show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: 2be8989d0f ("drm/amd/display/dc/calcs/dce_calcs: Move some large variables from the stack to the heap")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Extend secondary function handling for runtime pm beyond audio
to USB and UCSI.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Seems more logical to enable runtime pm at the end of
the init sequence so we don't end up entering runtime
suspend before init is finished.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to set the APU flag from IP discovery before
we evaluate this code.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
A porting error resulted in the stream assignment for the link
being retained without being released - a memory leak.
[How]
Fix the porting error by adding back the dc_stream_release() intended
as part of the original patch.
Fixes: 0bb2455585 ("drm/amd/display: retain/release at proper places in link_enc assignment")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
To help triage issues and coordinate driver/bios release dependency
[How]
Only enable the new Z9 interface when debug option is set, otherwise
treat Z10 only support case as Zstate disallowed.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
psr_feature_enabled flag is dynamically updated, and sometimes when
zstate allow status is determined the flag has not been set to true yet
even on PSR enabled config, lid off/on is such a case, which will result
in zstate disabled even though PSR is supported.
[How]
Check the supported PSR version and the PSR disable status instead.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
There is a change previously to disallow DM to set dp drive setings when
stream is not present. The logic might not work well with DP PHY
complaince scenario with a PHY test fixture attachment. We need to make
the method allow DP link drive settings changes even without stream
attached to it.
[how]
revert back to previous code in set drive setting function then add an
empty link_resource structure, then assign link resource based on
current link resource if link resource is allocated to the current pipe.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
OnSetSourceContentAttribute it does not trigger an update for the VSC
with TF change.
[how]
In this call, create a new VSC infoPacket based on the new config, and
allow DisplayTarget decide if an update and pursuant passive flip is
necessary
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Reza Amini <Reza.Amini@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
Due to bad hardware, the PHY repeater count in LTTPR cap is read as 0xFF
in some monitors while the LTTPR is actually present.
[how]
Remove PHY repeater counter check when configuring LTTPR mode.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Sung Joon Kim <sungkim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
indirect register index/data pair may be used by multi-threads. when it
happens, it would cause register access issue that is hard to trace.
[How]
Using cgs service, which provide a sync indirect reg access api.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Roy Chan <roy.chan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>