The current way of handling refcounting in the DP MST helpers is really
confusing and probably just plain wrong because it's been hacked up many
times over the years without anyone actually going over the code and
seeing if things could be simplified.
To the best of my understanding, the current scheme works like this:
drm_dp_mst_port and drm_dp_mst_branch both have a single refcount. When
this refcount hits 0 for either of the two, they're removed from the
topology state, but not immediately freed. Both ports and branch devices
will reinitialize their kref once it's hit 0 before actually destroying
themselves. The intended purpose behind this is so that we can avoid
problems like not being able to free a remote payload that might still
be active, due to us having removed all of the port/branch device
structures in memory, as per:
commit 91a25e4631 ("drm/dp/mst: deallocate payload on port destruction")
Which may have worked, but then it caused use-after-free errors. Being
new to MST at the time, I tried fixing it;
commit 263efde31f ("drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()")
But, that was broken: both drm_dp_mst_port and drm_dp_mst_branch structs
are validated in almost every DP MST helper function. Simply put, this
means we go through the topology and try to see if the given
drm_dp_mst_branch or drm_dp_mst_port is still attached to something
before trying to use it in order to avoid dereferencing freed memory
(something that has happened a LOT in the past with this library).
Because of this it doesn't actually matter whether or not we keep keep
the ports and branches around in memory as that's not enough, because
any function that validates the branches and ports passed to it will
still reject them anyway since they're no longer in the topology
structure. So, use-after-free errors were fixed but payload deallocation
was completely broken.
Two years later, AMD informed me about this issue and I attempted to
come up with a temporary fix, pending a long-overdue cleanup of this
library:
commit c54c7374ff ("drm/dp_mst: Skip validating ports during destruction, just ref")
But then that introduced use-after-free errors, so I quickly reverted
it:
commit 9765635b30 ("Revert "drm/dp_mst: Skip validating ports during destruction, just ref"")
And in the process, learned that there is just no simple fix for this:
the design is just broken. Unfortunately, the usage of these helpers are
quite broken as well. Some drivers like i915 have been smart enough to
avoid accessing any kind of information from MST port structures, but
others like nouveau have assumed, understandably so, that
drm_dp_mst_port structures are normal and can just be accessed at any
time without worrying about use-after-free errors.
After a lot of discussion, me and Daniel Vetter came up with a better
idea to replace all of this.
To summarize, since this is documented far more indepth in the
documentation this patch introduces, we make it so that drm_dp_mst_port
and drm_dp_mst_branch structures have two different classes of
refcounts: topology_kref, and malloc_kref. topology_kref corresponds to
the lifetime of the given drm_dp_mst_port or drm_dp_mst_branch in it's
given topology. Once it hits zero, any associated connectors are removed
and the branch or port can no longer be validated. malloc_kref
corresponds to the lifetime of the memory allocation for the actual
structure, and will always be non-zero so long as the topology_kref is
non-zero. This gives us a way to allow callers to hold onto port and
branch device structures past their topology lifetime, and dramatically
simplifies the lifetimes of both structures. This also finally fixes the
port deallocation problem, properly.
Additionally: since this now means that we can keep ports and branch
devices allocated in memory for however long we need, we no longer need
a significant amount of the port validation that we currently do.
Additionally, there is one last scenario that this fixes, which couldn't
have been fixed properly beforehand:
- CPU1 unrefs port from topology (refcount 1->0)
- CPU2 refs port in topology(refcount 0->1)
Since we now can guarantee memory safety for ports and branches
as-needed, we also can make our main reference counting functions fix
this problem by using kref_get_unless_zero() internally so that topology
refcounts can only ever reach 0 once.
Changes since v4:
* Change the kernel-figure summary for dp-mst/topology-figure-1.dot a
bit - danvet
* Remove figure numbers - danvet
Changes since v3:
* Remove rebase detritus - danvet
* Split out purely style changes into separate patches - hwentlan
Changes since v2:
* Fix commit message - checkpatch
* s/)-1/) - 1/g - checkpatch
Changes since v1:
* Remove forward declarations - danvet
* Move "Branch device and port refcounting" section from documentation
into kernel-doc comments - danvet
* Export internal topology lifetime functions into their own section in
the kernel-docs - danvet
* s/@/&/g for struct references in kernel-docs - danvet
* Drop the "when they are no longer being used" bits from the kernel
docs - danvet
* Modify diagrams to show how the DRM driver interacts with the topology
and payloads - danvet
* Make suggested documentation changes for
drm_dp_mst_topology_get_mstb() and drm_dp_mst_topology_get_port() -
danvet
* Better explain the relationship between malloc refs and topology krefs
in the documentation for drm_dp_mst_topology_get_port() and
drm_dp_mst_topology_get_mstb() - danvet
* Fix "See also" in drm_dp_mst_topology_get_mstb() - danvet
* Rename drm_dp_mst_topology_get_(port|mstb)() ->
drm_dp_mst_topology_try_get_(port|mstb)() and
drm_dp_mst_topology_ref_(port|mstb)() ->
drm_dp_mst_topology_get_(port|mstb)() - danvet
* s/should/must in docs - danvet
* WARN_ON(refcount == 0) in topology_get_(mstb|port) - danvet
* Move kdocs for mstb/port structs inline - danvet
* Split drm_dp_get_last_connected_port_and_mstb() changes into their own
commit - danvet
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190111005343.17443-7-lyude@redhat.com
Add the KMS plane rotation property to the DRM rockchip driver,
for SoCs RK3328, RK3368 and RK3399.
RK3288 only supports rotation at the display level (i.e. CRTC),
but for now we are only interested in plane rotation.
This commit only adds support for the value of reflect-y
and reflect-x (i.e. mirroring).
Note that y-mirroring is not compatible with YUV.
The following modetest commands would test this feature,
where 30 is the plane ID, and 49 = rotate_0 + relect_y + reflect_x.
X mirror:
modetest -s 43@33:1920x1080@XR24 -w 30:rotation:17
Y mirror:
modetest -s 43@33:1920x1080@XR24 -w 30:rotation:33
XY mirror:
modetest -s 43@33:1920x1080@XR24 -w 30:rotation:49
Signed-off-by: Daniele Castagna <dcastagna@chromium.org>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20190109185639.5093-4-ezequiel@collabora.com
Currently, YUV hardware overlays are converted to RGB using
a color space conversion different than BT.601.
The result is that colors of e.g. NV12 buffers don't match
colors of YUV hardware overlays.
In order to fix this, enable YUV2YUV and set appropriate coefficients
for formats such as NV12 to be displayed correctly.
This commit was tested using modetest, gstreamer and chromeos (hardware
accelerated video playback). Before the commit, tests rendering
with NV12 format resulted in colors not displayed correctly.
Test examples (Tested on RK3399 and RK3288 boards
connected to HDMI monitor):
$ modetest 39@32:1920x1080@NV12
$ gst-launch-1.0 videotestrc ! video/x-raw,format=NV12 ! kmssink
Signed-off-by: Daniele Castagna <dcastagna@chromium.org>
[ezequiel: rebase on linux-next and massage commit log]
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108214659.28794-1-ezequiel@collabora.com
The following happened when migrating an old fbdev driver to DRM:
The Integrator/CP PL111 supports 16BPP but only ARGB1555/ABGR1555
or XRGB1555/XBGR1555 i.e. the maximum depth is 15.
This makes the initialization of the framebuffer fail since
the code in drm_fb_helper_single_fb_probe() assigns the same value
to sizes.surface_bpp and sizes.surface_depth. I.e. it simply assumes
a 1-to-1 mapping between BPP and depth, which is true in most cases
but not for this hardware that only support odd formats.
To support the odd case of a driver supporting 16BPP with only 15
bits of depth, this patch will make the code loop over the formats
supported on the primary plane on each CRTC managed by the FB
helper and cap the depth to the maximum supported on any primary
plane.
On the PL110 Integrator, this makes drm_mode_legacy_fb_format()
select DRM_FORMAT_XRGB1555 which is acceptable for this driver, and
thus we get framebuffer, penguin and console on the Integrator/CP.
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190110114049.10618-1-linus.walleij@linaro.org
Fill out the AVI infoframe quantization range bits using
drm_hdmi_avi_infoframe_quant_range() for SDVO HDMI encoder as well.
This changes the behaviour slightly as
drm_hdmi_avi_infoframe_quant_range() will set a non-zero Q bit
even when QS==0 iff the Q bit matched the default quantization
range for the given mode. This matches the recommendation in
HDMI 2.0 and is allowed even before that.
v2: Pimp commit msg (DK)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108172828.15184-2-ville.syrjala@linux.intel.com
The gem drivers use shmemfs to allocate backing storage for gem objects.
On Samsung Chromebook Plus, the drm/rockchip driver may call
rockchip_gem_get_pages -> drm_gem_get_pages -> shmem_read_mapping_page
to pin a lot of pages, breaking the page reclaim mechanism and causing
oom-killer invocation.
E.g. when the size of a zone is 3.9 GiB, the inactive_ratio is 5. If
active_anon / inactive_anon < 5 and all pages in the inactive_anon lru
are pinned, page reclaim would keep scanning inactive_anon lru without
reclaiming memory. It breaks page reclaim when the rockchip driver only
pins about 1/6 of the anon lru pages.
Mark these pinned pages as unevictable to avoid the premature oom-killer
invocation. See also similar patch on i915 driver [1].
[1]: https://patchwork.freedesktop.org/patch/msgid/20181106132324.17390-1-chris@chris-wilson.co.uk
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108074517.209860-1-vovoy@chromium.org
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with memory
for some number of elements for that array. For example:
struct foo {
int stuff;
void *entry[];
};
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190108162152.GA25361@embeddedor
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
"dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
have some effect). After that, drm_fb_helper_initial_config is called
which may call the "fb_probe" driver callback.
This driver callback may call drm_fb_helper_defio_init (as is done by
drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
as documented. These are normally cleaned up on exit by
drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.
If an error occurs after "fb_probe", but before setup is complete, then
calling just drm_fb_helper_fini will leak resources. This was triggered
by df2052cc92 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):
[ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
[ 50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
[ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
[ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
[ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 4.20.0-rc7 #1
[ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
...
[ 50.023155] Call Trace:
[ 50.023155] ? bochs_kms_fini+0x1e/0x30
[ 50.023155] ? bochs_unload+0x18/0x40
This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.
Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
Link: https://lkml.kernel.org/r/20181223004315.GA11455@al
Fixes: 8741216396 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Cc: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20181223005507.28328-1-peter@lekensteyn.nl
If register_framebuffer() fails during fbdev setup we will leak the
framebuffer, the GEM buffer and the shadow buffer for defio. This is
because drm_fb_helper_fbdev_setup() just calls drm_fb_helper_fini() on
error not taking into account that register_framebuffer() can fail.
Since the generic emulation uses DRM client for its framebuffer and
backing buffer in addition to a shadow buffer, it's necessary to open code
drm_fb_helper_fbdev_setup() to properly handle the error path.
Error cleanup is removed from .fb_probe and is handled by one function for
all paths.
Fixes: 9060d7f493 ("drm/fb-helper: Finish the generic fbdev emulation")
Reported-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190105181846.26495-1-noralf@tronnes.org