Commit Graph

1339055 Commits

Author SHA1 Message Date
Rodrigo Siqueira
e7aaa5fbf4 Documentation/gpu: Create a GC entry in the amdgpu documentation
GC is a large block that plays a vital role for amdgpu; for this reason,
this commit creates one specific page for GC and adds extra information
about the CP component.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
4ede6d2004 Documentation/gpu: Add explanation about AMD Pipes and Queues
Pipes and Queues are two common vocabulary that pervades discussions
around amdgpu core features. The definition and explanation of those
components are spread around multiple places in the code, mailing list,
and Gitlab, which sometimes leads to the wrong interpretation of these
concepts. This commit attempts to centralize the definition and
explanation of Pipe and Queue from amdgpu perspective in a kernel doc.
Most of the information in this doc was derived from:

- https://lore.kernel.org/amd-gfx/CADnq5_Pcz2x4aJzKbVrN3jsZhD6sTydtDw=6PaN4O3m4t+Grtg@mail.gmail.com/T/#m9a670b55ab20e0f7c46c80f802a0a4be255a719d
- https://gitlab.freedesktop.org/mesa/mesa/-/issues/11759

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
c6a1c23d10 Documentation/gpu: Create a documentation entry just for hardware info
The APU and dGPU tables are hidden in the driver misc info, which makes
it hard to find specific hardware info when users need it. This commit
creates a single page for this information and adds it to the top of the
amdgpu list to improve searchability.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
5acd17d6d1 Documentation/gpu: Change index order to show driver core first
Since driver-core has an overview of the AMD GPU hardware structure, it
makes more sense to keep it first. This commit move driver-core up in
the index list.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
8f1366fcb8 Documentation/gpu: Add new acronyms
This commit introduces some new acronyms extracted from the source code
and found on some web pages around the internet (most of them came from
ArchLinux, Gentoo, and Wikipedia links).

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
a9a8bccaa3 drm/amdgpu/gfx11: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
683308af03 drm/amdgpu/gfx10: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
a4a4c0ae67 drm/amdgpu/gfx9: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
c8b8d7a4f1 drm/amdgpu/gfx8: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
be7652c23d drm/amdgpu/gfx7: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8307ebc15c drm/amdgpu/gfx6: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
9cffd67e80 drm/amdgpu/gfx: assign the actual me0 queues per pipe
Set the actual number of queues per pipe for ME0 (gfx).
This way we will dump all of the queues properly in
dev core dumps.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8f970c46b5 drm/amdgpu/gfx: decouple the number of kgqs from the hw
The driver currently sets up one kgq per pipe.  As such
adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere.
This is fine for kernel queues, but when we enable user queues
we need to know that actual number of queues per pipe.  Decouple
the kgq setup from the actual hardware count.  For dev core
dumps and user queues, we want to know the actual number
of queues per pipe.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
0d47bb77b5 drm/amdgpu/gfx: make amdgpu_gfx_me_queue_to_bit() static
It's not used outside of amdgpu_gfx.c.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Srinivasan Shanmugam
9eab245326 drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUs
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.

This update extends cleaner shader support to GFX10.3.x GPUs, previously
available for GFX10.3.0. It enhances security by clearing GPU memory
between processes and maintains a consistent GPU state across KGD and
KFD workloads.

Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
60d4952d89 drm/amdgpu: drop some dead code
Drop the cgs smu firmware code for SI, it's not used.
The smu firmware fetching for SI is done in si_dpm.c.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8ae1a4eef7 drm/amdgpu: add initial documentation for debugfs files
Describes what debugfs files are available and what
they are used for.

v2: fix some typos (Mark Glines)
v3: Address comments from Siqueira and Kent

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alexandre Demers
160d3d39f6 drm/amdgpu: continue cleaning up sid.h and si_enums.h
Remove more duplicated defines and move some in sid.h for coherence with
CIK.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Ananta Srikar
3470f80bd3 drm/amd/amdgpu: Fix typo
Fixes a typo in the word "version" in an error message.

Signed-off-by: Ananta Srikar <srikarananta01@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Andres Urian Florez
c6ae8d587e drm/amdgpu: Replace deprecated function strcpy() with strscpy()
Instead of using the strcpy() deprecated function to populate the
fw_name, use the strscpy() function

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alex Deucher
48b733d99b drm/amdgpu: add rebar parameter
Add a new parameter to disable BAR resizing.  Note that this
only disables the driver from attempting to resize the BAR,
The BIOS may have resized the BAR at boot.

Some teams have found this useful in debugging P2P DMA
issues on systems where the available MMIO space did not allow
for all of the GPUs present to resize their BARs.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
b71b7cd91c drm/amdgpu: cleanup DCE6 a bit more
Use shifts already available in DCE6's defines, masks and shifts.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
d35a412910 drm/amdgpu: keep removing sid.h dependency from si_dma.c
Move and rename DMA_SEM_INCOMPLETE_TIMER_CNTL and DMA_SEM_WAIT_FAIL_TIMER_CNTL
in oss_1_0_d.h

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
14f15aa054 drm/amdgpu: move si_dma.c away from sid.h and si_enums.h
Replace defines for the ones in oss_1_0_d.h and oss_1_0_sh_mask.h

Taking the opportunity to add some comments taken from cik_sdma.c

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
230a4b0528 drm/amdgpu: make GFX6 easier to read
Just fix the style and add a comment for reading easiness

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
535b619190 drm/amdgpu: add missing GFX6 defines
They will be used later when switching away from sid.h/si_enums.h.

v2: fix whitespace (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
0ba7e47e8e drm/amdgpu: add missing DMA defines, shifts and masks
They will be used later when switching away from sid.h/si_enums.h.

v2: fix up whitespace (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
6168cb7a31 drm/amdgpu: move DCE6 away from sid.h and si_enums.h defines
This cleans up DCE6.

I added some minor tweaks taken from CIK to exit early

v2: minor fixes (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
76eb396db3 drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with GRPH_SECONDARY_SURFACE_ADDRESS in DCE6
It seems a copy-paste error: since we are working with
mmGRPH_SECONDARY_SURFACE_ADDRESS,
GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK
should be used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
c82d915fe1 drm/amdgpu: move si_ih.c away from sid.h defines
They are properly defined under oss_1_0_d.h

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
cbd8207e23 drm/amdgpu: remove PACKET3 duplicated defines from si_enums.h
PACKET3 is already in sid.h, as it is done under cikd.h for CIK

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
193e088015 drm/amdgpu: use proper defines, shifts and masks in DCE6 code
By replacing VGA_VSTATUS_CNTL by VGA_RENDER_CONTROL__VGA_VSTATUS_CNTL_MASK,
we also need to fix its usage in GMC6.

Note: VGA_VSTATUS_CNTL's binary value was inverted in dce_6_0_sh_mask.h,
so we need to invert its value where it was used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
de81b86e96 drm/amdgpu: wire up defines, shifts and masks through SI code
To be able to remove as much duplicated defines, the different files
containing definitions, shifts and masks must be properly included.

Once done, the code will be migrated where needed to shifts and masks and
proper defines, before removing useless defines in the end.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
8e46cabf8e drm/amdgpu: move GFX6 defines into gfx_v6_0.c
Send a few GFX6 defines where it's used in GFX6.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
9aadb02fa2 drm/radeon: fix MAX_POWER_SHIFT value
While I don't think it is being used anywhere, if it were used, it would
be wrong. We can base this assumption on MAX_POWER_MASK, where the shift is
by 16 bits.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
1be0ae9e12 drm/amdgpu: move X_GB_ADDR_CONFIG_GOLDEN in GFX7
[BONAIRE|HAWAII]_GB_ADDR_CONFIG_GOLDEN are only used by GFX7. So keep them
where they are needed.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
e319f9ec36 drm/amdgpu: small cleanup to CIK SDMA
Tidy cik_sdma_hw_init() by returning directly cik_sdma_start()'s result.

Keep amdgpu_cik_gpu_check_soft_reset() early declaration with others.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
60c53fe7bc drm/amdgpu: use cik_sdma_is_idle() in CIK SDMA
cik_sdma_is_idle() does exactly what we need, so use it.

V2: fix parameter (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
62e0b8f766 drm/amdgpu: use gmc_v7_0_is_idle() since it is available under GMC7
gmc_v7_0_is_idle() does exactly what we need, so use it.

v2: fix parameter (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Saleemkhan Jamadar
cc9428d533 drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to drm_err (Mario)

Update message to identifiy the vblank initialization fail case

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Saleemkhan Jamadar
43f668edae drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to dev_err (Mario)

Update message to identifiy the vblank initialization fail case

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
969fd18c8d drm/amdgpu/vcn: during dpc recovery will corrupt VCPU buffer
err_event_athub and dpc recovery will corrupt VCPU buffer,
so we need to restore fw data and clear buffer in amdgpu_vcn_resume()

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
8ba904f541 drm/amdgpu: Multi-GPU DPC recovery support
Add support for DPC recover based on refactored code

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
11bb33766f drm/amdgpu: refactor amdgpu_device_gpu_recover
Split amdgpu_device_gpu_recover into the following stages:
halt activities,asic reset,schedule resume and amdgpu resume.
The reason is that the subsequent addition of dpc recover
code will have a high similarity with gpu reset

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
921c040efe drm/amd/pm: Add link reset for SMU 13.0.6
Add link reset implementation

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Lijo Lazar
8e0793b6c5 drm/amdkfd: Use dev_* instead of pr_* for messages
To get the device context, replace pr_ with dev_ functions.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Aric Cyr
eed269da71 drm/amd/display: DC v3.2.326
Summary:

* DML 2.1 resync
* Vblank disable fixes
* Visual confirm debug improvements
* Add command for reading ABM histogram
* Bug fixes & improvements

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
JinZe.Xu
a8f83d0c2d drm/amd/display: Use sync version of indirect register access.
[Why]
Access to indirect registers by DC and other components are not synchronized.

[How]
Use sync version of indirect register access.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: JinZe.Xu <JinZe.Xu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Aric Cyr
c82d84d1e4 drm/amd/display: Create a temporary scratch dc_link
Create a temporary scratch dc_link for programming purposes
and fix a copy of pipe_ctx on the stack to a pointer reference.

Reviewed-by: Josip Pavic <josip.pavic@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Charlene Liu
d5a7fdc88a drm/amd/display: fix zero value for APU watermark_c
[why]
the guard of is_apu not in sync, caused no watermark_c output.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00