There's no need for the driver to keep reading back the head pointer
from hardware since the hardware doesn't update it automatically. This
way we can treat any invalid head pointer value as a software/driver
bug instead of spurious hardware behaviour.
This change is also a small stepping stone towards re-working how
the head and tail state is managed as part of an improved workaround
for the tail register race condition.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-4-lionel.g.landwerlin@intel.com
If the function for checking whether there is OA buffer data available
(during a poll or blocking read) has false positives then we want to
avoid a situation where the subsequent read() returns EAGAIN (after
a more accurate check) followed by a poll() immediately reporting
the same false positive POLLIN event and effectively maintaining a
busy loop until there really is data.
This makes sure that we clear the .pollin event status whenever we
return EAGAIN to userspace which will throttle subsequent POLLIN events
and repeated attempts to read to the 5ms intervals of the hrtimer
callback we have.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170511154345.962-3-lionel.g.landwerlin@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If a vma is already bound to a ppgtt, we incorrectly call
allocate_va_range again when doing a PIN_UPDATE, which will result in
over accounting within our paging structures, such that when we do
unbind something we don't actually destroy the structures and end up
inadvertently recycling them. In reality this probably isn't too bad,
but once we start touching PDEs and PDPEs for 64K/2M/1G pages this
apparent recycling will manifest into lots of really, really subtle
bugs.
v2: Fix the testing of vma->flags for aliasing_ppgtt_bind_vma
Fixes: ff685975d9 ("drm/i915: Move allocate_va_range to GTT")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512091423.26085-1-chris@chris-wilson.co.uk
It looks like simply writing all the cursor register every single
time might be slightly faster than checking to see of each of
them need to be written. So if any other register apart from
CURPOS needs to be written let's just write all the registers.
CURPOS is left as a special case mainly for 845/865 where we have to
disable the cursor to change many of the cursor parameters. This
introduces a slight chance of the cursor flickering when things get
updated (since we're not currently doing the vblank evade for cursor
updates). If we write CURPOS alone then that obviously can't happen.
And let's follow the same pattern in the i9xx code just for symmetry.
I wasn't able to see a singificant performance difference between
this and just writing all the registers unconditionally.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-16-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The cursor plane doesn't have any kind of source offset register, so
the only form of panning possible is via a the base address register.
The alignment required by CURBASE ranges from 32B to 16KiB depending
on the platform. Let's make sure the user didn't ask for something
we can't do.
Obviously this is impossible to hit via the legacy cursor ioctl since
the src offsets are always 0, but via the plane/atomic ioctls the user
can ask for pretty much anything so we have to deal with this.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-14-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
IVB introduced the CUR_FBC_CTL register which allows reducing the cursor
height down to 8 lines from the otherwise square cursor dimensions.
Implement support for it. CUR_FBC_CTL can't be used when the cursor
is rotated.
Commandeer the otherwise unused cursor->cursor.size to track the
current value of CUR_FBC_CTL to optimize away redundant CUR_FBC_CTL
writes, and to notice when we need to arm the update via CURBASE if
just CUR_FBC_CTL changes.
v2: Reverse the gen check to make it sane
v3: Only enable CUR_FBC_CTL when cursor is enabled, adapt to
earlier code changes which means we now actually turn off
the cursor when we're supposed to unlike v2
v4: Add a comment about rotation vs. CUR_FBC_CTL,
rebase due to 'dirty' (Chris)
v5: Rebase to the atomic world
Handle 180 degree rotation
Add HAS_CUR_FBC()
v6: Rebase
v7: Rebase due to I915_WRITE_FW/uncore.lock
s/size/fbc_ctl/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-12-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The cursor code currently ignores fb->pitches[0] (except when creating
the fb itself), and just uses the cursor_width*4 as the stride. Let's
make sure fb->pitches[0] actually matches what we expect it to be.
We can also relax the stride vs. cursor width relationship on 845/865
since the stride is programmed separately. The only constraint is that
width*cpp doesn't exceed the stride, and that's already been checked
by the core since it makes sure the entire plane fits within the fb.
We can also drop the bo size check as that's already checked when
we create the fb. That is the fb is guaranteed to fit within the bo.
v2: Rebase due to i845_cursor_ctl() and i9xx_cursor_ctl()
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-11-ville.syrjala@linux.intel.com
Supposedly on some platforms we can get extra atomicity guarantees for
CURPOS if we write it between the CURCNTR and CURBASE. Let's move the
CURPOS handling into the platform specific hooks to make the possible
without having to pass the calculated CURPOS around. And while at it,
do the same for the CURBASE to avoid passing that either.
v2: Use I915_WRITE_FW() and grab uncore.lock
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-7-ville.syrjala@linux.intel.com
Move cursor_base, cursor_cntl, and cursor_size from intel_crtc
into intel_plane so that we don't need the crtc for cursor stuff
so much.
Also entirely nuke cursor_addr which IMO doesn't provide any benefit
since it's not actually used by the cursor code itself. I'm not 100%
sure what the SKL+ DDB is code is after by looking at cursor_addr so
I just make it do its checks unconditionally. If that's not correct
then we should likely replace it with somehting like
plane_state->visible.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170327185546.2977-5-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
I don't see why we couldn't use the HPLL watermarks on g4x. So let's
enable them. Let's assume a 35 usec memory latency for the HPLL mode.
That's roughly what PNV uses.
Based on the behaviour of the ELK box I have 35 usec is probably
overkill. Actually all the current latency values used seem overkill as
I can reduce them pretty drastically before I start to see underruns.
But let's play things a bit safe for now.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170421181432.15216-14-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Implement proper two stage watermark programming for g4x. As with
other pre-SKL platforms, the watermark registers aren't double
buffered on g4x. Hence we must sequence the watermark update
carefully around plane updates.
The code is quite heavily modelled on the VLV/CHV code, with some
fairly significant differences due to the different hardware
architecture:
* g4x doesn't use inverted watermark values
* CxSR actually affects the watermarks since it controls memory self
refresh in addition to the max FIFO mode
* A further HPLL SR mode is possible with higher memory wakeup
latency
* g4x has FBC2 and so it also has FBC watermarks
* max FIFO mode for primary plane only (cursor is allowed, sprite is not)
* g4x has no manual FIFO repartitioning
* some TLB miss related workarounds are needed for the watermarks
Actually the hardware is quite similar to ILK+ in many ways. The
most visible differences are in the actual watermakr register
layout. ILK revamped that part quite heavily whereas g4x is still
using the layout inherited from earlier platforms.
Note that we didn't previously enable the HPLL SR on g4x. So in order
to not introduce too many functional changes in this patch I've not
actually enabled it here either, even though the code is now fully
ready for it. We'll enable it separately later on.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170421181432.15216-13-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>