Aldebaran is using jpeg2.6, and the main change is jpeg2.6 using
AMDGPU_MMHUB_0, and jpeg2.5 using AMDGPU_MMHUB_1.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aldebaran should be the same as Arcturus in the PTE SNOOPED bit handling.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aldebaran is using vcn2.6, and the main change is vcn2.6 using
AMDGPU_MMHUB_0, and vcn2.5 using AMDGPU_MMHUB_1
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently host-gpu io link is always reported as PCIe however, on some
A+A systems, there could be one xgmi link available. This change exposes
xgmi link via sysfs when it is present.
v2: fix includes (Alex)
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aldebaran uses registers defined in header gc_9_4_2 but much of the xgmi
related functionality can be obtained by reusing the exisitng definition
from gfxhub_v1_1_get_xgmi_info. While adding support for Aldebaran, also
refactored code to better handle the new scenario.
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This applies to AMD Accelerated Processing Platforms that support host
gpu interconnect throguh a special link (xgmi). Aldebaran systems will
support this special feature for utilizing the benefits of host-gpu
cache coherence. This change outlines the basic framework for mapping
the GPU VRAM (HBM) to system address space making it accesible to the
host but managed by the amdgpu driver since this region is marked as
reserved memory in host address space by the underlying system firmware.
v2: switch to smuio callback function to check the type
of host-gpu interface (Hawking)
v3: use hub callbacks rather than direct function calls (Alex)
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Like its predecessors Aldebran also supports advanced high bandwidth
GPU-GPU communication interface known as xgmi. This enables the basic
xgmi support while refactoring the code slightly.
Detection of xgmi link between host cpu and gpu will be introduced in a
different patch.
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PMFW should be loaded before any operation that
may toggling DF-Cstate. otherwsie, tOS has no
choice but to locally toggle DF Cstate (i.e.
disable DF-Cstate even it already enabled by VBIOS)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to arcturus, but ARCH/ACC VGPRs may now be split unevenly.
A new field in SQ_WAVE_GPR_ALLOC tracks the boundary between the two
sets of VGPRs.
Squash below patches:
drm/amdkfd: Use preprocessor for IP-specific trap handler code
drm/amdkfd: Fix VGPR restore race in gfx8/gfx9 trap handler
drm/amdkfd: Remove duplicated code in gfx9 trap handler
drm/amdkfd: Separate ARCH/ACC VGPR restore in trap handler
drm/amdkfd: Reverse order of ARCH/ACC VGPR restore in trap handler
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1: dupilcate mmhub_v1_7.c from mmhub_v1_0.c because
mmhub register address for aldebaran is different
from existing asics (Le)
v2: switch to latest mmhub_v9_4_2 register headers (Hawking)
v3: squash in init VM_L2_CNTL3 default value for mmhub v1_7
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1: re-use arct ip base offset array for aldebaran (Le)
v2: create aldebaran ip base offset array for major ip
blocks (Hawking)
v3: re-use arct VCN ip base offset array for aldebaran
(James)
v4: correct MP1 ip base offset array (Hawking)
v5: update VCN ip base offset array to aldebaran one
(Hawking)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After fixing nested FPU contexts caused by 41401ac677 we're still seeing
complaints about spurious kernel_fpu_end(). As it turns out this was
already fixed for dcn20 in commit f41ed88cbd ("drm/amdgpu/display:
use GFP_ATOMIC in dcn20_validate_bandwidth_internal") but never moved
forward to dcn21.
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit 41401ac677 added FPU wrappers to dcn21_validate_bandwidth(),
which was correct. Unfortunately a nested function alredy contained
DC_FP_START()/DC_FP_END() calls, which results in nested FPU context
enter/exit and complaints by kernel_fpu_begin_mask().
This can be observed e.g. with 5.10.20, which backported 41401ac677
and now emits the following warning on boot:
WARNING: CPU: 6 PID: 858 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa5/0xc0
Call Trace:
dcn21_calculate_wm+0x47/0xa90 [amdgpu]
dcn21_validate_bandwidth_fp+0x15d/0x2b0 [amdgpu]
dcn21_validate_bandwidth+0x29/0x40 [amdgpu]
dc_validate_global_state+0x3c7/0x4c0 [amdgpu]
The warning is emitted due to the additional DC_FP_START/END calls in
patch_bounding_box(), which is inlined into dcn21_calculate_wm(),
its only caller. Removing the calls brings the code in line with
dcn20 and makes the warning disappear.
Fixes: 41401ac677 ("drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()")
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].
Refactor the code according to the use of a flexible-array member in
struct SISLANDS_SMC_SWSTATE, instead of a one-element array, and use
the struct_size() helper to calculate the size for the allocation.
Also, this helps with the ongoing efforts to enable -Warray-bounds by
fixing the following warnings:
drivers/gpu/drm/radeon/si_dpm.c: In function ‘si_convert_power_state_to_smc’:
drivers/gpu/drm/radeon/si_dpm.c:2350:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2350 | smc_state->levels[i].dpm2.MaxPS = (u8)((SISLANDS_DPM2_MAX_PULSE_SKIP * (max_sclk - min_sclk)) / max_sclk);
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2351:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2351 | smc_state->levels[i].dpm2.NearTDPDec = SISLANDS_DPM2_NEAR_TDP_DEC;
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2352:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2352 | smc_state->levels[i].dpm2.AboveSafeInc = SISLANDS_DPM2_ABOVE_SAFE_INC;
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2353:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2353 | smc_state->levels[i].dpm2.BelowSafeInc = SISLANDS_DPM2_BELOW_SAFE_INC;
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:2354:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
2354 | smc_state->levels[i].dpm2.PwrEfficiencyRatio = cpu_to_be16(pwr_efficiency_ratio);
| ~~~~~~~~~~~~~~~~~^~~
drivers/gpu/drm/radeon/si_dpm.c:5105:20: warning: array subscript 1 is above array bounds of ‘SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’ {aka ‘struct SISLANDS_SMC_HW_PERFORMANCE_LEVEL[1]’} [-Warray-bounds]
5105 | smc_state->levels[i + 1].aT = cpu_to_be32(a_t);
| ~~~~~~~~~~~~~~~~~^~~~~~~
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Build-tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/603f9a8f.aDLrpMFzzSApzVYQ%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>