It includes code cleanups from Bhumika and Liviu, a significant shader
performance fix and additions to the cmdstream validator from Wladimir
and the addition of a cmdbuf suballocator by myself.
The suballocator improves performance on all chips by reducing the CPU
overhead of the kernel driver and side steps the GC3000 FE MMU flush
erratum, now making the workarounds in IOVA allocation we had before
unnecessary, which results in a nice cleanup of the code in that area.
* 'drm-etnaviv-next' of https://git.pengutronix.de/git/lst/linux:
drm/etnaviv: Remove duplicate header file include
Revert "drm/etnaviv: trick drm_mm into giving out a low IOVA"
drm/etnaviv: add cmdbuf suballocator
drm/etnaviv: get cmdbuf physical address through the cmdbuf abstraction
drm/etnaviv: wire up iova handling in new cmdbuf abstraction
drm/etnaviv: move cmdbuf de-/allocation into own file
drm/etnaviv: always flush MMU TLBs on map/unmap
drm/etnaviv: constify etnaviv_iommu_ops structures
drm/etnaviv: set up initial PULSE_EATER register
drm/etnaviv: add new GC3000 sensitive states
Instead of receiving the num_crts as a parameter, we can read it
directly from the mode_config structure. I audited the drivers that
invoke this helper and I believe all of them initialize the mode_config
struct accordingly, prior to calling the fb_helper.
I used the following coccinelle hack to make this transformation, except
for the function headers and comment updates. The first and second
rules are split because I couldn't find a way to remove the unused
temporary variables at the same time I removed the parameter.
// <smpl>
@r@
expression A,B,D,E;
identifier C;
@@
(
- drm_fb_helper_init(A,B,C,D)
+ drm_fb_helper_init(A,B,D)
|
- drm_fbdev_cma_init_with_funcs(A,B,C,D,E)
+ drm_fbdev_cma_init_with_funcs(A,B,D,E)
|
- drm_fbdev_cma_init(A,B,C,D)
+ drm_fbdev_cma_init(A,B,D)
)
@@
expression A,B,C,D,E;
@@
(
- drm_fb_helper_init(A,B,C,D)
+ drm_fb_helper_init(A,B,D)
|
- drm_fbdev_cma_init_with_funcs(A,B,C,D,E)
+ drm_fbdev_cma_init_with_funcs(A,B,D,E)
|
- drm_fbdev_cma_init(A,B,C,D)
+ drm_fbdev_cma_init(A,B,D)
)
@@
identifier r.C;
type T;
expression V;
@@
- T C;
<...
when != C
- C = V;
...>
// </smpl>
Changes since v1:
- Rebased on top of the tip of drm-misc-next.
- Remove mention to sti since a proper fix got merged.
Suggested-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170202162640.27261-1-krisman@collabora.co.uk
Some state is coupled into the device lifetime outside of the
load/unload timeframe and requires teardown during final unreference
from drm_dev_release(). For example, dmabufs hold both a device and
module reference and may live longer than expected (i.e. the current
pattern of the driver tearing down its state and then releasing a
reference to the drm device) and yet touch driver private state when
destroyed.
v2: Export drm_dev_fini() and move the responsibility for finalizing the
drm_device and freeing it to the release callback. (If no callback is
provided, the core will call drm_dev_fini() and kfree(dev) as before.)
v3: Remember to add drm_dev_fini() to drm_drv.h
v4: Tidy language for kerneldoc
v5: Cross reference from drm_dev_init() to note that driver->release()
allows for arbitrary embedding.
v6: Refer to driver data rather than driver state, as state is now
becoming associated with the struct drm_atomic_state and friends.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
[danvet: Use the proper reference for struct members, which is
&drm_driver.release.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170202093632.31017-1-chris@chris-wilson.co.uk
MHL3 protocol requires registry adjustments depending on chosen video mode.
Necessary information is gathered in mode_fixup callback. In case of HDMI
video modes driver should also send special AVI and MHL3 infoframes.
The patch introduces generic helpers for handling MHL3 infoframes, in
case of appearance of other users of MHL3 infoframes these function can
be moved to common library.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1485935272-17337-21-git-send-email-a.hajda@samsung.com
Now that commandstreams are handled through the cmdbuf suballocator
the workaround to make the IOVA games work is not needed anymore.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
There are 3 big benefits to suballocating a single big DMA buffer
for command submission:
1. Avoid hammering CMA. The old way of allocating and freeing a DMA
buffer for each submission was hitting some of the real slow
pathes in CMA, as this allocator was not designed for a concurrent
small buffers load.
2. Less TLB flushes on IOMMUv2. If a new command buffer is mapped into
the GPU address space the MMU TLBs need to be flushed. By having
one big buffer statically mapped to the GPU, a lot of those flushes
can be avoided.
3. No funky workarounds for GC3000. The FE TLB flush on GC3000 isn't
reliable. To work around that we tried to lay out the cmdbufs in
the GPU address space in a way to avoid this issue. This hasn't
always worked if the address space is crowded. A single statically
mapped buffer avoids the erratum completely.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Don't allow IOMMUv2 to peek directly into the cmdbuf, but get the
needed PA through a dedicated function.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
This will get more complex with the following changes, so move it
into its own place.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
This ensures that the GPU isn't able to write into already freed
objects, as doing this in the IOVA reaper isn't enough, as the
gem_free_object path will also cause unmaps to happen.
On MMUv2 this also ensures that stale entries, which may have
been prefetched into the TLB will be purged.
The flush is low overhead, as it gets batched up with the next
user command buffer, so this isn't incuring an overhead for
each buffer map/unmap.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Declare etnaviv_iommu_ops structure as const as it is only used when
the reference of one of its field is stored in the ops field of a
iommu_domain structure. This ops field is of type const, so
etnaviv_iommu_ops structures having similar properties can be declared
const too.
Done using Coccinelle.
Before and after size details of .o file remains the same after
cross compiling for arm architecture.
lst: Trimmed commit message, apply the same change to iommu_v2.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The DSI0 and DSI1 blocks on the 2835 are related hardware blocks.
Some registers move around, and the featureset is slightly different,
as DSI1 (the 4-lane DSI) is a later version of the hardware block.
This driver doesn't yet enable DSI0, since we don't have any hardware
to test against, but it does put a lot of the register definitions and
code in place.
v2: Use the clk_hw interfaces, don't set CLK_IS_BASIC (from review by
Stephen Boyd)
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/20170131192912.11316-1-eric@anholt.net