Kernel gfx queues do not need to be reinitialized or
remapped after a reset. Align with gfx11.
v2: preserve init and remap for MMIO case.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of returning feature bit mask of allowed features, initialize
the allowed features in the callback implementation itself.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MES FW uses address(mqd_addr + sizeof(struct mqd) + 3*sizeof(uint32_t))
as fence address and writes a 32 bit fence value to this address. Driver
needs to allocate some extra memory(at least 4 DWs) in addition to
sizeof(struct mqd) as mqd memory(limited to gfx/compute/sdma queue).
For gfx11/12, sizeof(struct mqd) < PAGE_SIZE, KGD allocates mqd memory with
PAGE_SIZE aligned works. For gfx12.1, sizeof(struct mqd) == PAGE_SIZE,
it doesn't work.
KFD mqd manager hardcodes mqd size to PAGE_SIZE/MQD_SIZE across different
IP versions to solve this issue.
To avoid hardcoding in differnet places and across different IP versions.
Let's use AMDGPU_MQD_SIZE_ALIGN instead. It is used in two places.
1. mqd memory alloction
2. mqd stride handling for multi xcc config
v2: Use AMDGPU_GPU_PAGE_ALIGN. (Mukul)
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com> (v1)
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add validation to ensure user queue sizes meet hardware requirements:
- Size must be a power of two for efficient ring buffer wrapping
- Size must be at least AMDGPU_GPU_PAGE_SIZE to prevent undersized allocations
This prevents invalid configurations that could lead to GPU faults or
unexpected behavior.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The EXEC_COUNT field must be > 0. In the gfx shadow
handling we always emit a cond_exec packet after the gfx_shadow
packet, but the EXEC_COUNT never gets patched. This leads
to a hang when we try and reset queues on gfx11 APUs.
Fixes: c68cbbfd54 ("drm/amdgpu: cleanup conditional execution")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4789
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A trap may occur in the middle of VOP3PX instruction co-issue.
The PC would be restored incorrectly if left unmodified.
Identify this case by examining the instruction opcode and
rewind the PC 8 bytes if it occurs.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Reviewed-by: Vladimir Indic <vladimir.indic@amd.com>
Cc: Shweta Khatri <shweta.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
DRM Rust changes for v7.0-rc1
DRM:
- Fix documentation for Registration constructors.
- Use pin_init::zeroed() for fops initialization.
- Annotate DRM helpers with __rust_helper.
- Improve safety documentation for gem::Object::new().
- Update AlwaysRefCounted imports.
MM:
- Prevent integer overflow in page_align().
Nova (Core):
- Prepare for Turing support. This includes parsing and handling
Turing-specific firmware headers and sections as well as a Turing
Falcon HAL implementation.
- Get rid of the Result<impl PinInit<T, E>> anti-pattern.
- Relocate initializer-specific code into the appropriate initializer.
- Use CStr::from_bytes_until_nul() to remove custom helpers.
- Improve handling of unexpected firmware values.
- Clean up redundant debug prints.
- Replace c_str!() with native Rust C-string literals.
- Update nova-core task list.
Nova (DRM):
- Align GEM object size to system page size.
Tyr:
- Use generated uAPI bindings for GpuInfo.
- Replace manual sleeps with read_poll_timeout().
- Replace c_str!() with native Rust C-string literals.
- Suppress warnings for unread fields.
- Fix incorrect register name in print statement.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Danilo Krummrich" <dakr@kernel.org>
Link: https://patch.msgid.link/DFYW1WV6DUCG.3K8V2DAVD1Q4A@kernel.org
Changes for v6.20
GPU:
- Document a612/RGMU dt bindings
- UBWC 6.0 support (for A840 / Kaanapali)
- a225 support
- Fixes
DPU:
- Switched to use virtual planes by default
- Fixed DSI CMD panels on DPU 3.x
- Rewrote format handling to remove intermediate representation
- Fixed watchdog on DPU 8.x+
- Fixed TE / Vsync source setting on DPU 8.x+
- Added 3D_Mux on SC7280
- Kaanapali platform support
- Fixed UBWC register programming
- Made RM reserve DSPP-enabled mixers for CRTCs with LMs.
- Gamma correction support
DP:
- Enabled support for eDP 1.4+ link rate tables
- Fixed MDSS1 DP indices on SA8775P, making them to work
- Fixed msm_dp_ctrl_config_msa() to work with LLVM 20
DSI:
- Documented QCS8300 as compatible with SA8775P
- Kaanapali platform support
DSI PHY:
- switched to divider_determine_rate()
MDP5:
- Dropped support for MSM8998, SDM660 and SDM630 (switched over
to DPU)
MDSS:
- Kaanapali platform support
- Fixed UBWC register programming
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <rob.clark@oss.qualcomm.com>
Link: https://patch.msgid.link/CACSVV03Sbeca93A+gGh-TKpzFYVabbkWVgPCCicG0_NQG+5Y2A@mail.gmail.com
Some older builds weren't sending RMA CPERs when the bad page threshold
was exceeded. Newer builds have resolved this, but there could be
systems out there with bad page numbers higher than the threshold, that
haven't sent out an RMA CPER. To be thorough and safe, send an RMA CPER
when we load the table, if the threshold is met or exceeded, instead of
waiting for the next UE to trigger the CPER.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the vcn reset sequence in vcn_v4_0_3_ring_reset() to restore
JPEG power state and unlock the JPEG powergating mutex before
running the JPEG ring post-reset helper.
Fixes: d25c67fd9d ("drm/amdgpu/vcn4.0.3: rework reset handling")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The power state check in amdgpu_dpm_set_powergating_by_smu() is done
before acquiring the pm mutex, leading to a race condition where:
1. Thread A checks state and thinks no change is needed
2. Thread B acquires mutex and modifies the state
3. Thread A returns without updating state, causing inconsistency
Fix this by moving the mutex lock before the power state check,
ensuring atomicity of the state check and modification.
Fixes: 6ee27ee27b ("drm/amd/pm: avoid duplicate powergate/ungate setting")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Fw release 0.1.44.0
* Fixes for corruption on platforms older than DCN4x.
* Bug fixes related to USB4 link training
* Fixes related to FP guard
* Debug helpers and other stability fixes.
* Some refactors to improve code quality
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
When usb4 link training fails, the dpia sym clock will be disabled and SYMCLK
source should be changed back to phy clock. In enable_streams, it is
assumed that link training succeeded and will switch from refclk to
phy clock. But phy clk here might not be on. Dig reg access timeout
will occur.
[How]
When enable_stream is hit, check if link training failed for usb4.
If it did, fall back to the ref clock to avoid reg access timeout.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Zhongwei <Zhongwei.Zhang@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
For dcn2x_fast_validate_bw(), not only populate_dml_pipes needs FP guard
but also dml_get_voltage_level().
Remove unnecessary DC_FP_START/DC_FP_END guard in dcn20_fast_validate_bw
and dcn21_fast_validate_bw. FP guard is already there before calling
dcn2x_validate_bandwidth_fp().
Reviewed-by: ChiaHsuan (Tom) Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On APUs such as Raven and Renoir (GC 9.1.0, 9.2.2, 9.3.0), the ih1 and
ih2 interrupt ring buffers are not initialized. This is by design, as
these secondary IH rings are only available on discrete GPUs. See
vega10_ih_sw_init() which explicitly skips ih1/ih2 initialization when
AMD_IS_APU is set.
However, amdgpu_gmc_filter_faults_remove() unconditionally uses ih1 to
get the timestamp of the last interrupt entry. When retry faults are
enabled on APUs (noretry=0), this function is called from the SVM page
fault recovery path, resulting in a NULL pointer dereference when
amdgpu_ih_decode_iv_ts_helper() attempts to access ih->ring[].
The crash manifests as:
BUG: kernel NULL pointer dereference, address: 0000000000000004
RIP: 0010:amdgpu_ih_decode_iv_ts_helper+0x22/0x40 [amdgpu]
Call Trace:
amdgpu_gmc_filter_faults_remove+0x60/0x130 [amdgpu]
svm_range_restore_pages+0xae5/0x11c0 [amdgpu]
amdgpu_vm_handle_fault+0xc8/0x340 [amdgpu]
gmc_v9_0_process_interrupt+0x191/0x220 [amdgpu]
amdgpu_irq_dispatch+0xed/0x2c0 [amdgpu]
amdgpu_ih_process+0x84/0x100 [amdgpu]
This issue was exposed by commit 1446226d32 ("drm/amdgpu: Remove GC HW
IP 9.3.0 from noretry=1") which changed the default for Renoir APU from
noretry=1 to noretry=0, enabling retry fault handling and thus
exercising the buggy code path.
Fix this by adding a check for ih1.ring_size before attempting to use
it. Also restore the soft_ih support from commit dd29944165 ("drm/amdgpu:
Rework retry fault removal"). This is needed if the hardware doesn't
support secondary HW IH rings.
v2: additional updates (Alex)
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3814
Fixes: dd29944165 ("drm/amdgpu: Rework retry fault removal")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jon Doron <jond@wiz.io>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1:
resolve the issue where some freq frequencies cannot be set correctly
due to insufficient floating-point precision.
v2:
patch this convert on 'max' value only.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1:
resolve the issue where some freq frequencies cannot be set correctly
due to insufficient floating-point precision.
v2:
patch this convert on 'max' value only.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit
cb17fff3a2 ("drm/amdgpu/mes: remove unused functions")
removed most of the code using these IDRs but forgot to remove the struct
members and init/destroy paths.
There is also interrupt handling code in SDMA 5.0 and 5.2 which appears to
be using it, but is is unreachable since nothing ever allocates the
relevant IDR. We replace those with one time warnings just to avoid any
functional difference, but it is also possible they should be removed.
v2: also fix up gfx_v12_1.c and sdma_v7_1.c
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
References: cb17fff3a2 ("drm/amdgpu/mes: remove unused functions")
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pointer bridge->driver_private in imx8qxp_pxl2dpi_bridge_destroy()
is NULL when imx8qxp_pxl2dpi_bridge_probe() returns error, because
the pointer is initialized only when imx8qxp_pxl2dpi_bridge_probe()
returns 0. The NULL pointer would be set to pointer p2d and then
NULL pointer p2d would be dereferenced. Fix this by returning early
from imx8qxp_pxl2dpi_bridge_destroy() if !p2d is true.
Fixes: 900699ba83 ("drm/bridge: imx8qxp-pxl2dpi: get/put the companion bridge")
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260123-imx8qxp-drm-bridge-fixes-v1-2-8bb85ada5866@nxp.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>