We didn't used to be aware that runlist/engine IDs weren't the same thing,
or that there was such variability in configuration between GPUs.
By exposing this information to a client, and giving it explicit control
of which runlist it's allocating a channel on, we're able to make better
choices.
The immediate effect of this is that on GPUs where CE0 is the "GRCE", we
will now be allocating a copy engine running asynchronously to GR for BO
migrations - as intended.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We have a need to fetch data from GPU-specific sub-devices that is not
tied to any particular engine object.
This commit provides the framework to support such queries.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This will be required to support Volta, but also allows us to remove code
that's duplicated for each channel type already.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Introduces a new method of defining channels available from the display,
common to all channel types, allowing for more flexibility in available
channel types/counts, and reducing the amount of boiler-plate required.
This will be required to support Volta.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Engines are initialised on an as-needed basis, so this results in the
same behaviour, whilst allowing us to simplify things a bit.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We should be reading registers to determine which subunits are really
present on a given board, and this needs to be done after DEVINIT.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Likely a rebase bug. Should have no impact in default configuration due
to using per-instance setting by default.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In preparation to enabling -Wvla, remove VLA. In this particular
case directly use macro NVKM_MSGQUEUE_CMDLINE_SIZE instead of local
variable cmdline_size. Also, remove cmdline_size as it is not
actually useful anymore.
The use of stack Variable Length Arrays needs to be avoided, as they
can be a vector for stack exhaustion, which can be both a runtime bug
or a security flaw. Also, in general, as code evolves it is easy to
lose track of how big a VLA can get. Thus, we can end up having runtime
failures that are hard to debug.
Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gcc points out a buffer that is clearly too small to be used
in a meaningful way, as the 'sizeof(*args) + argc > sizeof(stack)'
will always fail:
In function 'memcpy',
inlined from 'nvif_vmm_map' at drivers/gpu/drm/nouveau/nvif/vmm.c:55:2:
include/linux/string.h:353:9: error: '__builtin_memcpy' offset 40 is out of the bounds [0, 16] of object 'stack' with type 'u8[16]' {aka 'unsigned char[16]'} [-Werror=array-bounds]
return __builtin_memcpy(p, q, size);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/nouveau/nvif/vmm.c: In function 'nvif_vmm_map':
drivers/gpu/drm/nouveau/nvif/vmm.c:40:5: note: 'stack' declared here
This makes the buffer large enough so it should serve the purpose
that the author presumably had in mind. Alternatively we could
just get rid of it completely and simplify the code at the cost
of always doing the kmalloc (as we do in the current version).
Fixes: 920d2b5ef2 ("drm/nouveau/mmu: define user interfaces to mmu vmm opertaions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Main changes for 4.18. I'd like to do a separate pull for vega20 later
this week or next. Highlights:
- Reserve pre-OS scanout buffer during init for seemless transition from
console to driver
- VEGAM support
- Improved GPU scheduler documentation
- Initial gfxoff support for raven
- SR-IOV fixes
- Default to non-AGP on PowerPC for radeon
- Fine grained clock voltage control for vega10
- Power profiles for vega10
- Further clean up of powerplay/driver interface
- Underlay fixes
- Display link bw updates
- Gamma fixes
- Scatter/Gather display support on CZ/ST
- Misc bug fixes and clean ups
[airlied: fixup v3d vs scheduler API change]
Link: https://patchwork.freedesktop.org/patch/msgid/20180515185450.1113-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
This change prepares for a workaround in amdkfd for a GFX9 HW bug. It
requires the control stack memory of compute queues, which is allocated
from the second page of MQD gart BOs, to have mtype NC, rather than
the default UC.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Functionality to message smc to enable pwe after gpu suspense.
It is used in case when display resumes from S3 and wants to start
audio driver by enabling pwe.
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch is in continuation to the
"843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state"
where we started to eliminate the dependency on
DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
which as such is not mandatory.
After deferring, this patch eliminates the dependency on the flag
for overlay planes.
This has to be done in stages as its a pretty complex and requires thorough
testing before we free primary planes as well from dependency on modeset
flag.
V2: Simplified the plane type check.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>