When the GART placement is set to AMDGPU_GART_PLACEMENT_LOW:
Make sure that GART does not overlap with VRAM when
VRAM is configured to be in the low address space.
Solve this according to the following logic:
- When GART fits before VRAM, use zero address for GART
- Otherwise, put GART after the end of VRAM, aligned to 4 GiB
Previously, I had assumed this was not possible
so it was OK to not handle it, but now we got a report
from a user who has a board that is configured this way.
Fixes: 917f91d8d8 ("drm/amdgpu/gmc: add a way to force a particular placement for GART")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3d9de5d86a1658cadb311461b001eb1df67263ad)
Add a function which groups filling of reserve region information. It
may not cover all as info on some regions are still filled outside like
those from atomfirmware tables.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add support to query vram info from firmware
v2: change APU vram type, add multi-aid check
v3: seperate vram info query function into 3 parts and
call them in a helper func when requirements
are met.
v4: calculate vram_width for v9.x
Signed-off-by: Gangliang Xie <ganglxie@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of a dynamic allocation, use stack variable and let the caller
pass the maximum ranges that can be held in the buffer.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If multiple entities share the same window we must make sure
that jobs using them are executed sequentially.
This commit gives separate windows to each entity, so jobs
from multiple entities could execute in parallel if needed.
(for now they all use the first sdma engine, so it makes no
difference yet).
The entity stores the gart window offsets to centralize the
"window id" to "window offset" in a single place.
default_entity doesn't get any windows reserved since there is
no use for them.
---
v3:
- renamed gart_window_lock -> lock (Christian)
- added amdgpu_ttm_buffer_entity_init (Christian)
- fixed gart_addr in svm_migrate_gart_map (Felix)
- renamed gart_window_idX -> gart_window_offs[]
- added amdgpu_compute_gart_address
v4:
- u32 -> u64
- added kerneldoc
v5:
- removed gtt_window_lock
- simplified gart window creation and use: entities using a
single window now uses window #0 instead of #1
- fix dst_addr calculation in kfd_migrate.c
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
amdgpu_discovery_get_nps_info() internally allocates memory for ranges
using kvcalloc(), which may use vmalloc() for large allocation. Using
kfree() to release vmalloc memory will lead to a memory corruption.
Use kvfree() to safely handle both kmalloc and vmalloc allocations.
Compile tested only. Issue found using a prototype static analysis tool
and code review.
Fixes: b194d21b9b ("drm/amdgpu: Use NPS ranges from discovery table")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On APUs such as Raven and Renoir (GC 9.1.0, 9.2.2, 9.3.0), the ih1 and
ih2 interrupt ring buffers are not initialized. This is by design, as
these secondary IH rings are only available on discrete GPUs. See
vega10_ih_sw_init() which explicitly skips ih1/ih2 initialization when
AMD_IS_APU is set.
However, amdgpu_gmc_filter_faults_remove() unconditionally uses ih1 to
get the timestamp of the last interrupt entry. When retry faults are
enabled on APUs (noretry=0), this function is called from the SVM page
fault recovery path, resulting in a NULL pointer dereference when
amdgpu_ih_decode_iv_ts_helper() attempts to access ih->ring[].
The crash manifests as:
BUG: kernel NULL pointer dereference, address: 0000000000000004
RIP: 0010:amdgpu_ih_decode_iv_ts_helper+0x22/0x40 [amdgpu]
Call Trace:
amdgpu_gmc_filter_faults_remove+0x60/0x130 [amdgpu]
svm_range_restore_pages+0xae5/0x11c0 [amdgpu]
amdgpu_vm_handle_fault+0xc8/0x340 [amdgpu]
gmc_v9_0_process_interrupt+0x191/0x220 [amdgpu]
amdgpu_irq_dispatch+0xed/0x2c0 [amdgpu]
amdgpu_ih_process+0x84/0x100 [amdgpu]
This issue was exposed by commit 1446226d32 ("drm/amdgpu: Remove GC HW
IP 9.3.0 from noretry=1") which changed the default for Renoir APU from
noretry=1 to noretry=0, enabling retry fault handling and thus
exercising the buggy code path.
Fix this by adding a check for ih1.ring_size before attempting to use
it. Also restore the soft_ih support from commit dd29944165 ("drm/amdgpu:
Rework retry fault removal"). This is needed if the hardware doesn't
support secondary HW IH rings.
v2: additional updates (Alex)
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3814
Fixes: dd29944165 ("drm/amdgpu: Rework retry fault removal")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jon Doron <jond@wiz.io>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On GFX 12.1, use the correct MES pipe instance for readiness before
sending MES commands on that pipe. Additionally, send the TLB requests
on the correct MES pipe in non-SPX modes.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No functional change for now, but this struct will have more
fields added in the next commit.
This change would introduce synchronisation issue, because
dependencies between successive jobs are not taken care of
properly. For instance, amdgpu_ttm_clear_buffer uses
amdgpu_ttm_map_buffer then amdgpu_ttm_fill_mem which should
use different entities (default_entity then move/clear entity).
To prevent failures for this commit, we limit ourselves to
2 entities: default_entity (which replaces high_pr usages) and
clear_entity (which replaces low_pr usages).
The next commits will deal with these dependencies correctly,
and then we'll be able to use move_entity.
---
v2: renamed amdgpu_ttm_buffer_entity
v4: don't use move_entity in ttm yet
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com> (v3)
Acked-by: Felix Kuehling <felix.kuehling@amd.com> (v3)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
a. extend mes pipe instances to num_xcc * max_mes_pipe
b. initialize mes schq/kiq pipes per xcc
c. submit mes packet to mes ring according to xcc_id
v2: rebase (Alex)
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Userspace jobs have drm_file.client_id as a unique identifier
as job's owners. For kernel jobs, we can allocate arbitrary
values - the risk of overlap with userspace ids is small (given
that it's a u64 value).
In the unlikely case the overlap happens, it'll only impact
trace events.
Since this ID is traced in the gpu_scheduler trace events, this
allows to determine the source of each job sent to the hardware.
To make grepping easier, the IDs are defined as they will appear
in the trace output.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://lore.kernel.org/r/20250604122827.2191-1-pierre-eric.pelloux-prayer@amd.com
The query_memory_partition does not need to remain
as soc specific callbacks. They can be shared across
multiple products
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When switching to new GPU index after hibernation and then resume,
VRAM offset of each VRAM BO will be changed, and the cached gpu
addresses needed to updated.
This is to enable pdb0 and switch to use pdb0-based virtual gpu
address by default in amdgpu_bo_create_reserved(). since the virtual
addresses do not change, this can avoid the need to update all
cached gpu addresses all over the codebase.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reject queries to get current partition modes during reset. Also, don't
accept sysfs interface requests to switch compute partition mode while
in reset.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.
All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation on the invalidate_tlbs path.
v2: Poll once before msleep
v3: Fix return value
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of
allocating a separate engine. This change ensures efficient resource management and
avoids the issue of insufficient VM invalidation engines.
- Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions
during TLB flush operations. This improves the stability and reliability of the driver,
especially in multi-threaded environments.
v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
to check if a ring is an SDMA page queue.(Lijo)
v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0
v4: Fix code style and add more detailed description (Christian)
v5: Remove dependency on vm_inv_eng loop order, explicitly lookup shared inv_eng(Christian/Lijo)
v6: Added search shared ring function amdgpu_sdma_get_shared_ring (Lijo)
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
And initialize it, this is a pure software ring to store RAS CPER data.
v2: change ring size to 0x100000
v2: update the initialization of count_dw of cper ring, it's dword
variable
v3: skip VM inv eng for cper
v3: init/fini when aca enabled
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set GC 9.5.0 noretry default as 1 for better performance. It can be
changed by the administrator using amdgpu.noretry=0 or by the user using
HSA_XNACK=1 environment variable.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviwanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a callback to check if there is any condition detected by GMC block
for reset on init. One case is if a pending NPS change request is
detected. If reset is done because of NPS switch, refresh NPS info from
discovery table.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If a user has requested NPS mode switch, place the request through PSP
during unload of the driver. For devices which are part of a hive, all
requests are placed together. If one of them fails, revert back to the
current NPS mode.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a sysfs interface to see available NPS modes to switch to -
cat /sys/bus/pci/devices/../available_memory_paritition
Make the current_memory_partition sysfs node read/write for requesting a
new NPS mode. The request is only cached and at a later point a driver
unload/reload is required to switch to the new NPS mode.
Ex:
echo NPS1 > /sys/bus/pci/devices/../current_memory_paritition
echo NPS4 > /sys/bus/pci/devices/../current_memory_paritition
The above interfaces will be available only if the SOC supports more than
one NPS mode.
Also modify the current memory partition sysfs logic to be more
generic.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a common interface in GMC to request NPS mode through PSP. Also add
a variable in hive and gmc control to track the last requested mode.
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In certain use cases, NPS data needs to be refreshed again from
discovery table. Add API parameter to refresh NPS data from discovery
table.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_gmc_vram_cpu_pa has been unused since commit
087451f372 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Which method is used to flush tlb does not depend on whether a reset is
in progress or not. We should skip flush altogether if the GPU will get
reset. So put both path under reset_domain read lock.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Here since we are in reset and takes the reset_domain write side lock
already. We can't use the flush tlb helper which tries to take the read
side.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The following commit updated gmc->noretry from 0 to 1 for GC HW IP
9.3.0:
commit 5f3854f1f4 ("drm/amdgpu: add more cases to noretry=1")
This causes the device to hang when a page fault occurs, until the
device is rebooted. Instead, revert back to gmc->noretry=0 so the device
is still responsive.
Fixes: 5f3854f1f4 ("drm/amdgpu: add more cases to noretry=1")
Signed-off-by: Tim Van Patten <timvp@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add GMC API to fetch NPS range information from discovery table. Use NPS
range information in GMC 9.4.3 SOCs when available, otherwise fallback
to software method.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>