Commit Graph

1338962 Commits

Author SHA1 Message Date
Aurabindo Pillai
93717be16e drm/amd/display: use drm_err in hpd rx offload
Add an amdgpu_device pointer to the data associated with the work struct
so that the hpd handlers have access to the drm device for use with
drm_err().

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Aurabindo Pillai
0f774fce44 drm/amd/display: convert DRM_ERROR to drm_err in hpd_rx_irq_create_workqueue()
Pass a pointer to amdgpu_device directly to the function.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Asad Kamal
7ac66f9355 drm/amd/pm: Use gpu_metrics_v1_8 for smu_v13_0_12
Use gpu_metrics_v1_8 for smu_v13_0_12 to fill metrics data

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:36 -04:00
Asad Kamal
084769f493 drm/amd/pm: Use gpu_metrics_v1_8 for smu_v13_0_6
Use gpu_metrics_v1_8 for smu_v13_0_6 to fill metrics data

v2: Move exposing caps to separate patch, move smu_v13.0.12 gpu metrics
1.8 usage to separate patch (Lijo)

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Asad Kamal
1189c4fb6f drm/amd/pm: Expose smu_v13_0_6 caps
Expose smu_v13_0_6 caps by moving it to common header

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Charles Han
aa52eb6d16 Documentation: Remove repeated word in docs
Remove the repeated word "the" in docs.

Signed-off-by: Charles Han <hanchunchao@inspur.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
74f0ff369f Documentation/gpu: Add an intro about MES
MES is an important firmware that lacks some essential documentation.
This commit introduces an overview of it and how it works.

Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
e7aaa5fbf4 Documentation/gpu: Create a GC entry in the amdgpu documentation
GC is a large block that plays a vital role for amdgpu; for this reason,
this commit creates one specific page for GC and adds extra information
about the CP component.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
4ede6d2004 Documentation/gpu: Add explanation about AMD Pipes and Queues
Pipes and Queues are two common terms that pervade discussions
around amdgpu core features. The definition and explanation of those
components are spread around multiple places in the code, mailing list,
and GitLab, which sometimes leads to the wrong interpretation of these
concepts. This commit attempts to centralize the definition and
explanation of Pipe and Queue from the amdgpu perspective in a kernel doc.
Most of the information in this doc was derived from:

- https://lore.kernel.org/amd-gfx/CADnq5_Pcz2x4aJzKbVrN3jsZhD6sTydtDw=6PaN4O3m4t+Grtg@mail.gmail.com/T/#m9a670b55ab20e0f7c46c80f802a0a4be255a719d
- https://gitlab.freedesktop.org/mesa/mesa/-/issues/11759

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
c6a1c23d10 Documentation/gpu: Create a documentation entry just for hardware info
The APU and dGPU tables are hidden in the driver misc info, which makes
it hard to find specific hardware info when users need it. This commit
creates a single page for this information and adds it to the top of the
amdgpu list to improve searchability.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
5acd17d6d1 Documentation/gpu: Change index order to show driver core first
Since driver-core has an overview of the AMD GPU hardware structure, it
makes more sense to keep it first. This commit moves driver-core up in
the index list.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Rodrigo Siqueira
8f1366fcb8 Documentation/gpu: Add new acronyms
This commit introduces some new acronyms extracted from the source code
and found on some web pages around the internet (most of them came from
ArchLinux, Gentoo, and Wikipedia links).

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
a9a8bccaa3 drm/amdgpu/gfx11: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
683308af03 drm/amdgpu/gfx10: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:35 -04:00
Alex Deucher
a4a4c0ae67 drm/amdgpu/gfx9: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
c8b8d7a4f1 drm/amdgpu/gfx8: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
be7652c23d drm/amdgpu/gfx7: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8307ebc15c drm/amdgpu/gfx6: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
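
The six CSIB fixes above (gfx6 through gfx11) share one idea: a loop over sections must not return early, because the remainder of the buffer still has to be written afterwards. The following is a minimal userspace sketch of that pattern only. All names (`struct section`, `build_csib`, `BEGIN_MARKER`, `END_MARKER`) and the buffer layout are hypothetical and are not the amdgpu CSIB format. Skipping unwanted sections with `continue` is an assumption about how such a fix could look, not a description of the actual patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the amdgpu CSIB layout. */
enum section_id { SECT_CONTEXT, SECT_OTHER };

struct section {
	enum section_id id;
	const uint32_t *words;
	size_t nwords;
};

#define BEGIN_MARKER 0xB0000000u
#define END_MARKER   0xE0000000u

/* Fills out[] and returns the number of words written. */
static size_t build_csib(const struct section *sects, size_t nsects,
			 uint32_t *out)
{
	size_t count = 0;

	assert(out != NULL);
	out[count++] = BEGIN_MARKER;

	for (size_t i = 0; i < nsects; i++) {
		/*
		 * Skip sections that do not belong in the buffer. A bare
		 * "return" here would leave the trailer below unwritten.
		 */
		if (sects[i].id != SECT_CONTEXT)
			continue;
		for (size_t j = 0; j < sects[i].nwords; j++)
			out[count++] = sects[i].words[j];
	}

	/* The rest of the buffer is always emitted. */
	out[count++] = END_MARKER;
	return count;
}
```

With an early return in place of `continue`, a trailing non-context section would end the function before `END_MARKER` is written, which is the kind of truncation the commit messages describe.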
Alex Deucher
9cffd67e80 drm/amdgpu/gfx: assign the actual me0 queues per pipe
Set the actual number of queues per pipe for ME0 (gfx).
This way we will dump all of the queues properly in
dev core dumps.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8f970c46b5 drm/amdgpu/gfx: decouple the number of kgqs from the hw
The driver currently sets up one kgq per pipe.  As such,
adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere.
This is fine for kernel queues, but when we enable user queues
we need to know the actual number of queues per pipe.  Decouple
the kgq setup from the actual hardware count.  For dev core
dumps and user queues, we want to know the actual number
of queues per pipe.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
0d47bb77b5 drm/amdgpu/gfx: make amdgpu_gfx_me_queue_to_bit() static
It's not used outside of amdgpu_gfx.c.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Srinivasan Shanmugam
9eab245326 drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUs
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.

This update extends cleaner shader support to GFX10.3.x GPUs, previously
available for GFX10.3.0. It enhances security by clearing GPU memory
between processes and maintains a consistent GPU state across KGD and
KFD workloads.

Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
60d4952d89 drm/amdgpu: drop some dead code
Drop the cgs smu firmware code for SI, it's not used.
The smu firmware fetching for SI is done in si_dpm.c.

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alex Deucher
8ae1a4eef7 drm/amdgpu: add initial documentation for debugfs files
Describes what debugfs files are available and what
they are used for.

v2: fix some typos (Mark Glines)
v3: Address comments from Siqueira and Kent

Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Alexandre Demers
160d3d39f6 drm/amdgpu: continue cleaning up sid.h and si_enums.h
Remove more duplicated defines and move some into sid.h for consistency
with CIK.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:34 -04:00
Ananta Srikar
3470f80bd3 drm/amd/amdgpu: Fix typo
Fixes a typo in the word "version" in an error message.

Signed-off-by: Ananta Srikar <srikarananta01@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Andres Urian Florez
c6ae8d587e drm/amdgpu: Replace deprecated function strcpy() with strscpy()
Instead of using the deprecated strcpy() function to populate
fw_name, use the strscpy() function.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
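
To illustrate why the strcpy() replacement above matters: a bounded copy always NUL-terminates the destination and reports truncation instead of writing past the buffer. The kernel's strscpy() is not available in userspace, so the sketch below uses a hypothetical stand-in, `bounded_copy()`, that mimics the documented behavior (returns the number of characters copied, or `-E2BIG` when the source did not fit). It is an illustration, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a strscpy()-style copy.
 * Copies at most size - 1 characters and always NUL-terminates dst.
 * Returns the copied length, or -E2BIG if src was truncated or
 * size is 0.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = 0;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	/* Measure src, but never look further than size bytes. */
	while (len < size && src[len] != '\0')
		len++;

	if (len == size) {
		/* src does not fit: copy what fits and terminate. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len + 1); /* includes the terminating NUL */
	return (long)len;
}
```

A caller can check the return value to detect a name that was too long for the buffer, something strcpy() cannot report.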
Alex Deucher
48b733d99b drm/amdgpu: add rebar parameter
Add a new parameter to disable BAR resizing.  Note that this
only prevents the driver from attempting to resize the BAR;
the BIOS may have resized the BAR at boot.

Some teams have found this useful in debugging P2P DMA
issues on systems where the available MMIO space did not allow
for all of the GPUs present to resize their BARs.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
b71b7cd91c drm/amdgpu: cleanup DCE6 a bit more
Use shifts already available in DCE6's defines, masks and shifts.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
d35a412910 drm/amdgpu: keep removing sid.h dependency from si_dma.c
Move and rename DMA_SEM_INCOMPLETE_TIMER_CNTL and DMA_SEM_WAIT_FAIL_TIMER_CNTL
into oss_1_0_d.h.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
14f15aa054 drm/amdgpu: move si_dma.c away from sid.h and si_enums.h
Replace the defines with the ones in oss_1_0_d.h and oss_1_0_sh_mask.h.

Take the opportunity to add some comments taken from cik_sdma.c.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
230a4b0528 drm/amdgpu: make GFX6 easier to read
Just fix the style and add a comment for readability.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
535b619190 drm/amdgpu: add missing GFX6 defines
They will be used later when switching away from sid.h/si_enums.h.

v2: fix whitespace (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
0ba7e47e8e drm/amdgpu: add missing DMA defines, shifts and masks
They will be used later when switching away from sid.h/si_enums.h.

v2: fix up whitespace (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
6168cb7a31 drm/amdgpu: move DCE6 away from sid.h and si_enums.h defines
This cleans up DCE6.

I added some minor tweaks taken from CIK to exit early

v2: minor fixes (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:33 -04:00
Alexandre Demers
76eb396db3 drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with GRPH_SECONDARY_SURFACE_ADDRESS in DCE6
This looks like a copy-paste error: since we are working with
mmGRPH_SECONDARY_SURFACE_ADDRESS,
GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK
should be used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
c82d915fe1 drm/amdgpu: move si_ih.c away from sid.h defines
They are properly defined under oss_1_0_d.h

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
cbd8207e23 drm/amdgpu: remove PACKET3 duplicated defines from si_enums.h
PACKET3 is already in sid.h, as it is done under cikd.h for CIK

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
193e088015 drm/amdgpu: use proper defines, shifts and masks in DCE6 code
By replacing VGA_VSTATUS_CNTL by VGA_RENDER_CONTROL__VGA_VSTATUS_CNTL_MASK,
we also need to fix its usage in GMC6.

Note: VGA_VSTATUS_CNTL's binary value was inverted in dce_6_0_sh_mask.h,
so we need to invert its value where it was used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
de81b86e96 drm/amdgpu: wire up defines, shifts and masks through SI code
To be able to remove as many duplicated defines as possible, the different
files containing definitions, shifts and masks must be properly included.

Once done, the code will be migrated where needed to shifts and masks and
proper defines, before removing useless defines in the end.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
8e46cabf8e drm/amdgpu: move GFX6 defines into gfx_v6_0.c
Move a few GFX6 defines to where they are used in GFX6.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
9aadb02fa2 drm/radeon: fix MAX_POWER_SHIFT value
While I don't think it is being used anywhere, if it were used, it would
be wrong. We can base this assumption on MAX_POWER_MASK, where the shift is
by 16 bits.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
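
The MAX_POWER_SHIFT fix above rests on a simple invariant: a field's shift must equal the position of the lowest set bit in its mask. Below is a minimal sketch of that check. The `EXAMPLE_POWER_*` values are assumptions chosen to match the "shift by 16 bits" wording of the commit message, not definitions copied from the radeon headers, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field: 14 bits wide, starting at bit 16. */
#define EXAMPLE_POWER_MASK  (0x3FFFu << 16)
#define EXAMPLE_POWER_SHIFT 16

/* Derive the shift implied by a mask: index of its lowest set bit. */
static unsigned int shift_from_mask(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while ((mask & 1u) == 0) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Extract a field from a register value. */
static uint32_t field_get(uint32_t reg, uint32_t mask, unsigned int shift)
{
	return (reg & mask) >> shift;
}

/* Write a field into a register value, leaving other bits alone. */
static uint32_t field_set(uint32_t reg, uint32_t mask, unsigned int shift,
			  uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}
```

With a mismatched shift (for example 0 against this mask), `field_set()` would mask the value away entirely, which is why a wrong shift define is a latent bug even while unused.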
Alexandre Demers
1be0ae9e12 drm/amdgpu: move X_GB_ADDR_CONFIG_GOLDEN in GFX7
[BONAIRE|HAWAII]_GB_ADDR_CONFIG_GOLDEN are only used by GFX7. So keep them
where they are needed.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
e319f9ec36 drm/amdgpu: small cleanup to CIK SDMA
Tidy cik_sdma_hw_init() by directly returning cik_sdma_start()'s result.

Keep amdgpu_cik_gpu_check_soft_reset() early declaration with others.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
60c53fe7bc drm/amdgpu: use cik_sdma_is_idle() in CIK SDMA
cik_sdma_is_idle() does exactly what we need, so use it.

V2: fix parameter (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Alexandre Demers
62e0b8f766 drm/amdgpu: use gmc_v7_0_is_idle() since it is available under GMC7
gmc_v7_0_is_idle() does exactly what we need, so use it.

v2: fix parameter (Alex)

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:32 -04:00
Saleemkhan Jamadar
cc9428d533 drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to drm_err (Mario)

Update the message to identify the vblank initialization failure case.

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Saleemkhan Jamadar
43f668edae drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to dev_err (Mario)

Update the message to identify the vblank initialization failure case.

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
969fd18c8d drm/amdgpu/vcn: during dpc recovery will corrupt VCPU buffer
err_event_athub and DPC recovery will corrupt the VCPU buffer,
so we need to restore the fw data and clear the buffer in amdgpu_vcn_resume().

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00
Ce Sun
8ba904f541 drm/amdgpu: Multi-GPU DPC recovery support
Add support for DPC recovery based on the refactored code.

Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07 15:18:31 -04:00