virtio-gpu basically needs a sg_table for the bo, to tell the host where
the backing pages for the object are. So the gem shmem helpers are a
perfect fit. Some drm_gem_object_funcs need thin wrappers to update the
host state, but otherwise the helpers handle everything just fine.
Once the fencing was sorted out, the switch was surprisingly easy and
consisted mostly of removing the ttm code.
v4: fix drm_gem_object_funcs name.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190829103301.3539-15-kraxel@redhat.com
Rework fencing workflow. Stop using ttm helpers, use the
virtio_gpu_array_* helpers instead.
Because the gem reservation object is used, it is initialized and ready
for use before ttm_bo_init is called. So we can simply use the standard
fencing workflow and drop the tricky logic which checks whether the
command is still in flight.
v6: rewrite most of the patch.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190829103301.3539-10-kraxel@redhat.com
Rework fencing workflow, starting with virtio_gpu_execbuffer_ioctl.
Stop using ttm helpers, use the virtio_gpu_array_* helpers (which work
on the reservation objects directly) instead.
Also store the object array in struct virtio_gpu_vbuffer, so we
explicitly keep a reference of all buffers used instead of depending
on ttm_bo_put() checking whether the object is actually idle before
releasing it.
New workflow:
(1) All gem objects needed by a command are added to a
virtio_gpu_object_array.
(2) All reservation objects will be locked (virtio_gpu_array_lock_resv).
(3) virtio_gpu_fence_emit() completes fence initialization.
(4) fence gets added to the objects, reservation objects are unlocked
(virtio_gpu_array_add_fence, virtio_gpu_array_unlock_resv).
(5) virtio command is submitted to the host.
(6) The completion callback (virtio_gpu_dequeue_ctrl_func)
will drop object references and free virtio_gpu_object_array.
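The six steps above can be sketched as a small user-space model. This is
only an illustration of the ordering: every model_* name and struct below
is hypothetical, the lock is a plain flag, and the fence is just a
sequence number. It is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_OBJS 4

/* Hypothetical stand-ins, not the kernel's struct definitions. */
struct model_obj {
	int refcount;   /* models the gem object reference count */
	bool locked;    /* models the reservation object lock */
	int fence_seq;  /* last fence attached, 0 means none */
};

struct model_obj_array {
	size_t nents;
	struct model_obj *objs[MODEL_MAX_OBJS];
};

struct model_vbuf {
	struct model_obj_array *objs; /* held until the host completes */
	int fence_seq;
};

/* (1) Collect the objects, taking a reference on each one. */
static struct model_obj_array *model_array_alloc(struct model_obj **in,
						 size_t n)
{
	struct model_obj_array *a;
	size_t i;

	if (n > MODEL_MAX_OBJS)
		return NULL;
	a = calloc(1, sizeof(*a));
	if (!a)
		return NULL;
	for (i = 0; i < n; i++) {
		in[i]->refcount++;
		a->objs[i] = in[i];
	}
	a->nents = n;
	return a;
}

/* (2) to (5): lock, emit fence, attach fence, unlock, submit. */
static void model_submit(struct model_vbuf *vbuf,
			 struct model_obj_array *a, int seq)
{
	size_t i;

	for (i = 0; i < a->nents; i++)          /* (2) lock all */
		a->objs[i]->locked = true;
	vbuf->fence_seq = seq;                  /* (3) fence is ready */
	for (i = 0; i < a->nents; i++) {        /* (4) attach, unlock */
		assert(a->objs[i]->locked);
		a->objs[i]->fence_seq = seq;
		a->objs[i]->locked = false;
	}
	vbuf->objs = a;                         /* (5) command owns array */
}

/* (6) Completion callback: drop the references, free the array. */
static void model_complete(struct model_vbuf *vbuf)
{
	size_t i;

	for (i = 0; i < vbuf->objs->nents; i++)
		vbuf->objs->objs[i]->refcount--;
	free(vbuf->objs);
	vbuf->objs = NULL;
}
```

The point of the model is that the references taken in step (1) are held
by the command itself until step (6), so nothing has to check whether an
object is idle before releasing it.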
v6: rewrite most of the patch.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190829103301.3539-9-kraxel@redhat.com
Some helper functions to manage an array of gem objects.
v9: use dma_resv_lock_interruptible.
v6:
- add ticket to struct virtio_gpu_object_array.
- add virtio_gpu_array_{lock,unlock}_resv helpers.
- add virtio_gpu_array_add_fence helper.
v5: some small optimizations (Chia-I Wu).
v4: make them virtio-private instead of generic helpers.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190829103301.3539-8-kraxel@redhat.com
There's a couple of changes here, so to summarize:
* Remove the big ugly mgr->up_req_recv.have_eomt conditional to save on
indenting
* Store &mgr->up_req_recv.initial_hdr in a variable so we don't keep
going over 80 character long lines
* De-duplicate code for calling drm_dp_send_up_ack_reply() and getting
the MSTB via its GUID
* Remove all of the duplicate calls to memset() and just use a goto
instead
* Actually do line wrapping
* Remove the unnecessary if (mstb) check before calling
drm_dp_mst_topology_put_mstb() - we are guaranteed to always have
mstb != NULL at that point in the function
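As a rough illustration of the "single exit label instead of repeated
memset()" shape described above, here is a user-space sketch. The model_*
structs, the request type values and the early return for incomplete
messages are assumptions made for the example, not the helper's actual
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical receive state and branch device. */
struct model_rx {
	bool have_eomt;  /* end of message transaction seen */
	int req_type;
};

struct model_branch {
	int refcount;
};

static int model_handle_up_req(struct model_rx *rx,
			       struct model_branch *mstb, int *handled)
{
	/* Early return replaces one big indented conditional. */
	if (!rx->have_eomt)
		return 0;

	/* Assumed invariant for this sketch: the lookup succeeded. */
	assert(mstb != NULL);
	mstb->refcount++;

	/* Unknown request: nothing to do, but still clean up. */
	if (rx->req_type != 1 && rx->req_type != 2)
		goto out;

	(*handled)++;

out:
	/* One put and one memset() for every exit path. */
	mstb->refcount--;
	memset(rx, 0, sizeof(*rx));
	return 0;
}
```

Because mstb is known to be non-NULL once the label is reached, the put
needs no extra check, which mirrors the last item in the list above.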
Cc: Juston Li <juston.li@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Harry Wentland <hwentlan@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190903204645.25487-13-lyude@redhat.com
Unfortunately the DP MST helpers do not have much in the way of
debugging utilities. So, let's add some!
This adds basic debugging output for down sideband requests that we send
from the driver, so that we can actually discern what's happening when
sideband requests time out.
Since there wasn't really a good way of testing that any of this worked,
I ended up writing simple selftests that lightly test sideband message
encoding and decoding as well. Enjoy!
Changes since v1:
* Clean up DO_TEST() and sideband_msg_req_encode_decode() - danvet
* Get rid of pr_fmt(), just define a prefix string instead and use
drm_printf()
* Check highest bit of VCPI in drm_dp_decode_sideband_req() - danvet
* Make the switch case order between drm_dp_decode_sideband_req() and
drm_dp_encode_sideband_req() the same - danvet
* Only check DRM_UT_DP - danvet
* Clean up sideband_msg_req_equal() from selftests a bit, and add
comments explaining why we can't just use memcmp - danvet
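For readers unfamiliar with this kind of selftest, a minimal round-trip
sketch follows. The wire layout and all model_* names are invented for
the example and do not match the real sideband message format. The
field-wise comparison reflects one plausible reason memcmp is unsuitable
(struct padding or pointer members), which is an assumption here rather
than a quote of the selftest comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request; the compiler may pad before 'pbn'. */
struct model_req {
	uint8_t req_type;
	uint8_t port_number; /* 4 bits on the (made up) wire format */
	uint8_t vcpi;        /* 7 bits on the (made up) wire format */
	uint16_t pbn;
};

static size_t model_encode(const struct model_req *req, uint8_t *buf)
{
	buf[0] = req->req_type & 0x7f;
	buf[1] = (uint8_t)((req->port_number & 0xf) << 4);
	buf[2] = req->vcpi & 0x7f;
	buf[3] = (uint8_t)(req->pbn >> 8);
	buf[4] = (uint8_t)(req->pbn & 0xff);
	return 5;
}

static bool model_decode(const uint8_t *buf, size_t len,
			 struct model_req *req)
{
	if (len < 5)
		return false;
	/* Reject messages with the highest VCPI bit set. */
	if (buf[2] & 0x80)
		return false;
	req->req_type = buf[0] & 0x7f;
	req->port_number = buf[1] >> 4;
	req->vcpi = buf[2] & 0x7f;
	req->pbn = (uint16_t)((buf[3] << 8) | buf[4]);
	return true;
}

/* Compare field by field so padding bytes never matter. */
static bool model_req_equal(const struct model_req *a,
			    const struct model_req *b)
{
	return a->req_type == b->req_type &&
	       a->port_number == b->port_number &&
	       a->vcpi == b->vcpi &&
	       a->pbn == b->pbn;
}
```

A test then encodes a request, decodes the bytes into a fresh struct and
checks the two with model_req_equal().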
Cc: Juston Li <juston.li@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Harry Wentland <hwentlan@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190903204645.25487-8-lyude@redhat.com
We can reduce the critical section in vkms_vblank_simulate under
output->lock quite a lot:
- hrtimer_forward_now just needs to be ordered correctly wrt
drm_crtc_handle_vblank. We already access the hrtimer timestamp
without locks. While auditing that I noticed that we don't correctly
annotate the read there, so sprinkle a READ_ONCE to make sure the
compiler doesn't do anything foolish.
- drm_crtc_handle_vblank must stay under the lock to avoid races with
drm_crtc_arm_vblank_event.
- The access to vkms_output->crc_state also must stay under the lock.
- next problem is making sure the output->state structure doesn't get
freed too early. First we rely on a given hrtimer being serialized:
If we call drm_crtc_handle_vblank, then we are guaranteed that the
previous call to vkms_vblank_simulate has completed. The other side
of the coin is that the atomic update waits for the vblank to
happen before it releases the old state. Both taken together means
that by the time the atomic update releases the old state, the
hrtimer won't access it anymore (it might be accessing the new state
at the same time, but that's ok).
- state is invariant, except the few fields separately protected by
state->crc_lock. So no need to hold the lock for that.
- finally the queue_work. We need to make sure there's no races with
the flush_work, i.e. when we call flush_work we need to guarantee
that the hrtimer can't requeue the work again. This is guaranteed by
the same vblank/hrtimer ordering as in the reasoning above for why
state won't be freed too early: flush_work on the old state is
called after wait_for_flip_done in the atomic commit code.
Therefore we can also move everything after the output->crc_state out
of the critical section.
Motivated by suggestions from Rodrigo.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190719152314.7706-3-daniel.vetter@ffwll.ch
Noticed while reviewing code. I'm not sure whether this might or might
not explain some of the missed vblank hilarity we've been seeing on
various drivers (but those got tracked down to driver issues, at least
mostly). I think those all go through the vblank completion event,
which has unconditional barriers - it always takes the spinlock.
Therefore no cc stable.
v2:
- Barriers are hard, put them in the right order (Chris).
- Improve the comments a bit.
v3:
Ville noticed that on 32bit we might be breaking up the load/stores,
now that the vblank counter has been switched over to be 64 bit. Fix
that up by switching to atomic64_t. Since this happens so rarely in
practice I figured no need to cc: stable ...
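To illustrate the torn load/store concern in v3, here is a user-space
sketch that swaps in C11 <stdatomic.h> for the kernel's atomic64_t. The
model_* names and the chosen memory orderings are assumptions for the
example, not what the patch does.

```c
#include <assert.h> /* only needed by callers that check results */
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter; the kernel side would use atomic64_t. */
struct model_vblank {
	_Atomic uint64_t count;
};

/*
 * A plain uint64_t store may be split into two 32-bit stores on a
 * 32-bit machine, so a concurrent reader could see a mixed value.
 * The atomic type guarantees the update is seen as a whole.
 */
static void model_vblank_add(struct model_vblank *vb, uint64_t inc)
{
	atomic_fetch_add_explicit(&vb->count, inc, memory_order_release);
}

static uint64_t model_vblank_count(struct model_vblank *vb)
{
	return atomic_load_explicit(&vb->count, memory_order_acquire);
}
```

The interesting case is a counter crossing the 32-bit boundary, where a
torn read could otherwise return a value far from both the old and the
new count.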
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Keith Packard <keithp@keithp.com>
References: 570e86963a ("drm: Widen vblank count to 64-bits [v3]")
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190723131337.22031-1-daniel.vetter@ffwll.ch
Store the timestamp of the current vblank in the new field 'time' of the
vblank trace event. If the timestamp is calculated by a driver that
supports high-precision vblank timing, set the field 'high-prec' to
'true'.
User space can now access actual hardware vblank times via the tracing
infrastructure. Tracing applications (such as GPUVis, see [0] for
related discussion), can use the newly added information to conduct a
more accurate analysis of display timing.
v2: Fix author name (missing last name)
[0] https://github.com/mikesart/gpuvis/issues/30
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Heinrich Fink <heinrich.fink@daqri.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190902142412.27846-2-heinrich.fink@daqri.com
Split virtqueue_kick() call into virtqueue_kick_prepare(), which
requires serialization, and virtqueue_notify(), which does not. Move
the virtqueue_notify() call out of the critical section protected by the
queue lock. This avoids triggering a vmexit while holding the lock and
thereby fixes rather bad spinlock contention.
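The split can be pictured with a small user-space model. The model_*
functions below only stand in for virtqueue_kick_prepare() and
virtqueue_notify(); the lock is a plain flag and the counters are
hypothetical, so this shows the ordering and nothing more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical queue state. */
struct model_queue {
	bool lock_held;
	int pending;       /* buffers queued since the last kick */
	int notifications; /* host notifications, i.e. vmexits */
};

static void model_lock(struct model_queue *q)
{
	assert(!q->lock_held);
	q->lock_held = true;
}

static void model_unlock(struct model_queue *q)
{
	assert(q->lock_held);
	q->lock_held = false;
}

/* Stands in for the prepare half: needs the queue lock. */
static bool model_kick_prepare(struct model_queue *q)
{
	bool needed;

	assert(q->lock_held);
	needed = q->pending > 0;
	q->pending = 0;
	return needed;
}

/* Stands in for the notify half: must not run under the lock. */
static void model_notify(struct model_queue *q)
{
	assert(!q->lock_held);
	q->notifications++;
}

static void model_queue_buffer(struct model_queue *q)
{
	bool notify;

	model_lock(q);
	q->pending++;
	notify = model_kick_prepare(q);
	model_unlock(q);

	/* The expensive part happens after the lock is dropped. */
	if (notify)
		model_notify(q);
}
```

Other submitters can take the lock while the notification is in
progress, which is where the reduced contention comes from.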
Suggested-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190813082509.29324-3-kraxel@redhat.com