what:
some os preemption path is messed up with world switch preemption
fix:
cleanup those logics so os preemption not mixed with world switch
this patch is a general fix for issues comes from SRIOV MCBP, but
there is still UMD side issues not resovlved yet, so this patch
cannot fix all world switch bug.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1)
we shouldn't load PSP kdb and sys/sos for VF, they are
supposed to be handled by hypervisor
2)
ih reroute doesn't work on VF thus we should avoid calling
it, besides VF should not use those PSP register sets for PF
3)
shouldn't load SMU ucode under SRIOV, otherwise PSP would report
error
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As the firmware versions of arcturus are different from other gfx9
ASICs. And the warning("CP firmware version too old, please update!")
caused by this check can be eliminated.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The rlc version of raven_kicer_rlc is different from the legacy rlc
version of raven_rlc. So it needs to add a judgement function for
raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When NBIO's RAS error happens, before trigging GPU reset, it's needed
to record error counter information, which can correct the error counter
value missed issue when reading from debugfs.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Once sync flood interrupt is triggered by RAS error, before
actual GPU recovery job, it's necessary to log on and print
non-zero error counter, this will help user knows where the
RAS error source is from quickly.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clang warns:
../drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:980:36:
warning: address of 'sink->edid_caps.panel_patch.skip_scdc_overwrite'
will always evaluate to 'true' [-Wpointer-bool-conversion]
if (&sink->edid_caps.panel_patch.skip_scdc_overwrite)
~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
1 warning generated.
This is probably not what was intended so remove the address of
operator, which matches how skip_scdc_overwrite is handled in the rest
of the driver.
While we're here, drop an extra newline after this if block.
Fixes: a760fc1bff ("drm/amd/display: add monitor patch to disable SCDC read/write")
Link: https://github.com/ClangBuiltLinux/linux/issues/879
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since 'smu_feature_is_enabled(smu, SMU_FEATURE_BACO_BIT)' will always return
false considering the 'smu_system_features_control(smu, false)' disabled
all SMU features.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The copy_to_user() function returns the number of bytes remaining to be
copied, but we want to return a negative error code to the user.
Fixes: 030d5b97a5 ("drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise we readback all ones. Fixes rlc counter
readback while gfxoff is active.
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise we readback all ones. Fixes rlc counter
readback while gfxoff is active.
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's 25 Mhz (refclk / 4). This fixes the interpretation
of the rlc clock counter.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In certain situations the message could be reported dozens-to-hundreds of
times, based on how often the function is called.
E.g. If MCLK DPM, any calls to get/set MCLK will result in a failure
message, potentially flooding dmesg. Ratelimit the warnings to avoid
this flood.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
there was a type in the terminate command.
We should be calling psp_dtm_unload() instead of psp_hdcp_unload()
Fixes: 143f230533 ("drm/amdgpu: psp DTM init")
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
HDCP DTM needs to be aware of the upto date display topology
information in order to validate hardware consistency.
[how]
update HDCP DTM on update_stream_config call.
Signed-off-by: Wenjing Liu <Wenjing.Liu@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
According to the specs when bksv or ksv list fails SRM check,
HDCP TX should abort hdcp immediately.
However with the current code HDCP will be reattampt upto 4 times.
[how]
Add the logic that stop HDCP retry if bksv or ksv list
is revoked.
Signed-off-by: Wenjing Liu <Wenjing.Liu@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
PSP added a new return code for revoked receivers (SRM). We need to
handle that so we don't retry hdcp
This is already being handled on windows
[How]
Add the enums to psp interface header and handle them.
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Seems to work reliably on VI+ except for a few so enable runpm barring
those where baco for runtime power management is not supported.
[rajneesh] Picked https://patchwork.freedesktop.org/patch/335402/ to
enable runtime pm with baco for kfd. Also fixed a checkpatch warning and
added extra checks for VEGA20 and ARCTURUS.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
So far the kfd driver implemented same routines for runtime and system
wide suspend and resume (s2idle or mem). During system wide suspend the
kfd aquires an atomic lock that prevents any more user processes to
create queues and interact with kfd driver and amd gpu. This mechanism
created problem when amdgpu device is runtime suspended with BACO
enabled. Any application that relies on kfd driver fails to load because
the driver reports a locked kfd device since gpu is runtime suspended.
However, in an ideal case, when gpu is runtime suspended the kfd driver
should be able to:
- auto resume amdgpu driver whenever a client requests compute service
- prevent runtime suspend for amdgpu while kfd is in use
This change refactors the amdgpu and amdkfd drivers to support BACO and
runtime power management.
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
During system suspend the kfd driver aquires a lock that prohibits
further kfd actions unless the gpu is resumed. This adds some info which
can be useful while debugging.
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hw_sequencer.c:
In function dcn10_post_unlock_program_front_end:
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hw_sequencer.c:2623:29:
warning: variable stream_status set but not used [-Wunused-but-set-variable]
commit bbf5f6c3f8 ("drm/amd/display: Split program front end part that occur outside lock")
involved this unused variable.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
Programming DPPCLK to the same value currently set may cause
underflow while playing video in certain conditions.
[HOW]
Only program DPPCLK if clock is not the same as the
previous value programmed.
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
In the case where we do have these passed in then we should be using
these instead of the registers - those are only programmed when booting
the adapter as primary.
[How]
Check if fb_base/fb_offset are non-zero. The value of 0 is invalid
for fb_base but is valid for fb_offset on dGPU.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
In DCN hardware sequencer we do actually call ATOM_INIT correctly per
pipe. The workaround is not necessary for command table offloading.
[How]
Drop the workaround since it's not needed.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
customer issue: found that for their specific panel, EDID register space
being overwritten during SCDC read write
[how]
customer accepted HDMI 2 features not working - disabled SCDC read/write
as well as HDMI 2 in general based on monitor patch
Signed-off-by: Martin Leung <martin.leung@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently odm scaling calculations are only done when adding initial
odm pipe. Any scaling re-calculations will mess up odm because of this.
This change resolves the problem by updating scaling split logic to
handle odm.
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Michael Strauss <Michael.Strauss@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>