[Why]
Needs more frames waiting before the PSR_Exit sending for the specific
TCON.
[How]
Add relock_delay_frame_cnt to control how many frames waiting are needed
before the PSR_Exit sending. The default value is 0. The Driver side can
set this variable for specific TCONs.
Reviewed-by: Robin Chen <robin.chen@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Ryan Lin <tsung-hua.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cable ID is a DP2 feature to identify max certified link rate that
a cable can carry. The cable identification method requires both
cable and display hardware support. Since the specs comes late, it is
anticipated that the first round of DP2 cables and displays may not
be fully compatible to reliably return cable ID data. Therefore the
decision of our cable id policy is that if the cable can return non
zero cable id data, we will take cable's link rate capability into
account. However if we get zero data, the cable link rate capability
is considered inconclusive. In this case, we will not take cable's
capability into account to avoid of over limiting hardware capability
from users. The max overall link rate capability is still determined
after actual dp pre-training. Cable id is considered as an auxiliary
method of determining max link bandwidth capability.
Reviewed-by: George Shen <George.Shen@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Checkpoint BOs last. That way we don't need to close dmabuf FDs if
something else fails later. This avoids problematic access to user mode
memory in the error handling code path.
criu_checkpoint_bos has its own error handling and cleanup that does not
depend on access to user memory.
In the private data, keep BOs before the remaining objects. This is
necessary to restore things in the correct order as restoring events
depends on the events-page BO being restored first.
Fixes: be072b06c7 ("drm/amdkfd: CRIU export BOs as prime dmabuf objects")
Reported-by: Jann Horn <jannh@google.com>
CC: Rajneesh Bhardwaj <Rajneesh.Bhardwaj@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
mutex_unlock before the exit label because all the error code paths that
jump there didn't take that lock. This fixes unbalanced locking errors
in case of restore errors.
Fixes: 40e8a766a7 ("drm/amdkfd: CRIU checkpoint and restore events")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is the pull request for a whole bunch of fixes and prep-work that
was done to support Ampere acceleration prior to GSP-RM being
available. It uses the ACR firmware released by NVIDIA in
linux-firmware, as we do on earlier GPUs. The work to support running
on top of GSP-RM also heavily depends on various pieces of this
series.
In addition to the new HW support, general stability of the driver
should be improved, especially around recovering HW from bugs that can
be generated by userspace driver components.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <bskeggs@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CABDvA==s+nZD0n7CuRWLPE=Pj+02CN13r+ZQJxoHQ_EmR+o=XQ@mail.gmail.com
We weren't sending the high bits, though they're zero currently anyway.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
NVIDIA provided this on Turing, but we kept using the hardcoded version
from Volta (where they didn't).
Switch to the firmware version prior to Ampere.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Essentially ripped verbatim from NVGPU, comments and all, and adapted to
nvkm's structs and style.
- maybe fixes an nvgpu bug though, a small tweak was needed to match RM
v2:
- remove unnecessary WARN_ON
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
We'll want to reuse the former for loading from proper netlist images.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
We're going to be pulling in a chunk of code from NVGPU to fixup our
SMID mappings on Volta and above, which depends on ppc_nr[gpc]
reflecting the actual number of PPCs present, not the maximum number.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This doesn't fix any known issue, but RM started doing it at some point,
so presumably it's needed for something.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This won't work on Ampere, and, it's questionable whether we should have
been using our FW's method of storing the golden context image with NV's
firmware to begin with.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This doesn't work on Ampere for some reason, switch to directly modifying
NV_PGRAPH_FECS_CTXSW_MAILBOX instead.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This was thought to be per-channel initially - it's not. The backing
pages for the VMM mappings are shared for all channels.
- switches to more straight-forward patch interfaces
- prepares for sub-context support
- this is saving a *sizeable* amount of vram
v2:
- whitespace
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This was thought to be per-channel initially - it's not. The backing
pages for the VMM mappings are shared for all channels.
- switches to more straight-forward patch interfaces
- prepares for sub-context support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This was thought to be per-channel initially - it's not. The backing
pages for the VMM mappings are shared for all channels.
- switches to more straight-forward patch interfaces
- prepares for sub-context support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Needed for GV100 (and only GV100 for some reason) for WFI_GOLDEN_SAVE.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Adds context binding and support for FWs with a bootloader to the code
that was added to load VPR scrubber HS binaries, and ports ACR over to
using all of it.
- gv100 split from gp108 to handle FW exit status differences
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>