[Why]
NV14 is using its own ip params that's different from other
DCN2.0 ASICs.
[How]
Add ASIC revision check to make sure NV14 gets correct
ip params.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
"num_vmid" and "num_dsc" are missing within NV14's resource caps structure.
[How]
Add the missing parts.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
KFD has been verified to function on POWER systems (Talos II / Vega 64).
It should be available as a kernel configuration option on these systems.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.
Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU: 1GB
New page table reservation per GPU: 32MB
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.
After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
SW must acquire/release one of the vm_invalidate_eng*_sem around the
invalidation req/ack. Through this way,it can avoid losing invalidate
acknowledge state across power-gating off cycle.
To use vm_invalidate_eng*_sem, it needs to initialize
vm_invalidate_eng*_sem firstly.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
After rlcg fw 2.1, kmd driver starts to load extra fw for
LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw
because all rlcg related fw have been loaded by host driver.
Guest driver would load the three fw fail without this change.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF
Currently the three features haven't been enabled at SRIOV, it would
trigger guest driver load fail with the bare-metal path of the three
features.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x.
clear state buffer (resides in vram) is corrupted after 1st baco reset,
upon gfxoff exit, CPF gets garbage header in CSIB and hangs.
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The INTERRUPT_CNTL2 register expects a valid DMA address, but is
currently set with a GPU MC address. This can cause problems on
systems that detect the resulting DMA read from an invalid address
(found on a Power8 guest).
Instead, use the DMA address of the dummy page because it will always
be safe.
Fixes: 27ae10641e ("drm/amdgpu: add interupt handler implementation for si v3")
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The INTERRUPT_CNTL2 register expects a valid DMA address, but is
currently set with a GPU MC address. This can cause problems on
systems that detect the resulting DMA read from an invalid address
(found on a Power8 guest).
Instead, use the DMA address of the dummy page because it will always
be safe.
Fixes: d8f60cfc93 ("drm/radeon/kms: Add support for interrupts on r6xx/r7xx chips (v3)")
Fixes: 25a857fbe9 ("drm/radeon/kms: add support for interrupts on SI")
Fixes: a59781bbe5 ("drm/radeon: add support for interrupts on CIK (v5)")
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
Previous patch allowed to initialize debugfs entries on both MST
and SST connectors, but MST connectors get registered much later
which exposed an issue of debugfs entries being initialized in the
same folder.
[how]
Return SST debugfs entries' initialization back to where it was.
For MST connectors we should initialize debugfs entries in connector
register function after the connector is registered.
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is one regression from 042f3d7b745cd76aa
To put flush_delayed_work after adev->shutdown = true
which will make amdgpu_ih_process not response the irq
At last, all ib ring tests will be failed just like below
[drm] amdgpu: finishing device.
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring comp_1.0.0
[drm] Fence fallback timer expired on ring comp_1.1.0
[drm] Fence fallback timer expired on ring comp_1.2.0
[drm] Fence fallback timer expired on ring comp_1.3.0
[drm] Fence fallback timer expired on ring comp_1.0.1
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110).
amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).
v2: replace cancel_delayed_work_sync() with flush_delayed_work()
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For fine grained dpm, there is only two levels supported. However
to reflect correctly the current clock frequency, there is an
intermediate level faked. Thus on forcing level setting, we
need to treat level 2 correctly as level 1.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise, without RLC reinitialization, the DPM reenablement
will fail. That affects the custom pptable uploading.
V2: setting/clearing uploading_custom_pp_table in
smu_sys_set_pp_table()
Reported-by: Matt Coffin <mcoffin13@gmail.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Matt Coffin <mcoffin13@gmail.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. no need to allocate an extra member for 'mqd_backup' array
2. backup/restore mqd to/from the correct 'mqd_backup' array slot
v2: warning fix (Alex)
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
From inside an active timeline in the execbuf ioctl, we may try to
reclaim some space in the GGTT. We need GGTT space for all objects on
!full-ppgtt platforms, and for context images everywhere. However, to
free up space in the GGTT we may need to remove some pinned objects
(e.g. context images) that require flushing the idle barriers to remove.
For this we use the big hammer of intel_gt_wait_for_idle()
However, commit 7936a22dd4 ("drm/i915/gt: Wait for new requests in
intel_gt_retire_requests()") will continue spinning on the wait if a
timeline is active but lacks requests, as is the case during execbuf
reservation. Spinning forever is quite time consuming, so revert that
commit and start again.
In practice, the effect commit 7936a22dd4 was trying to achieve is
accomplished by commit 1683d24c14 ("drm/i915/gt: Move new timelines
to the end of active_list"), so there is no immediate rush to replace
the looping.
Testcase: igt/gem_exec_reloc/basic-range
Fixes: a46bfdc83f ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")
References: 1683d24c14 ("drm/i915/gt: Move new timelines to the end of active_list")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-1-chris@chris-wilson.co.uk
(cherry picked from commit 689122dcc3)
[Joonas: Corrected Fixes: tag ref to match drm-intel-next-fixes]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Our callers fall into two categories, those passing timeout=0 who just
want to flush request retirements and those passing a timeout that need
to wait for submission completion (e.g. intel_gt_wait_for_idle()).
Currently, we only wait for a snapshot of timelines at the start of the
wait (but there was an expectation that new requests would cause timelines
to appear at the end). However, our callers, such as
intel_gt_wait_for_idle() before suspend, do require us to wait for the
power management requests emitted by retirement as well. If we don't,
then it takes an extra second or two for the background worker to flush
the queue and mark the GT as idle.
Fixes: 7e80576266 ("drm/i915: Drop struct_mutex from around i915_retire_requests()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-1-chris@chris-wilson.co.uk
(cherry picked from commit 7936a22dd4)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
io_mapping_map_atomic/kmap_atomic are occasionally taken in error capture
(if there is no aperture preallocated for the use of error capture), but
the error capture and compression routines are now run in normal
context:
<3> [113.316247] BUG: sleeping function called from invalid context at mm/page_alloc.c:4653
<3> [113.318190] in_atomic(): 1, irqs_disabled(): 0, pid: 678, name: debugfs_test
<4> [113.319900] no locks held by debugfs_test/678.
<3> [113.321002] Preemption disabled at:
<4> [113.321130] [<ffffffffa02506d4>] i915_error_object_create+0x494/0x610 [i915]
<4> [113.327259] Call Trace:
<4> [113.327871] dump_stack+0x67/0x9b
<4> [113.328683] ___might_sleep+0x167/0x250
<4> [113.329618] __alloc_pages_nodemask+0x26b/0x1110
<4> [113.334614] pool_alloc.constprop.19+0x14/0x60 [i915]
<4> [113.335951] compress_page+0x7c/0x100 [i915]
<4> [113.337110] i915_error_object_create+0x4bd/0x610 [i915]
<4> [113.338515] i915_capture_gpu_state+0x384/0x1680 [i915]
However, it is not a good idea to run the slow compression inside atomic
context, so we choose not to.
Fixes: 895d8ebeaa ("drm/i915: error capture with no ggtt slot")
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113231104.24208-1-yu.bruce.chang@intel.com
(cherry picked from commit 48715f7001)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>